BMN.logistic: Estimation of a binary Markov network using nodewise logistic regression


View source: R/BMN.logistic.R

Description

This function aims to estimate the partial correlation matrix associated with a binary Markov network using the nodewise logistic regression approach proposed by Ravikumar et al. (2010).

Usage

BMN.logistic(X, lambda, gamma = 0.25, bic = TRUE, verbose = FALSE, eps = 1e-08)

Arguments

X

The n x p data matrix.

lambda

A vector of tuning parameters. The length of lambda should be p.

gamma

A tuning parameter required for evaluating the extended Bayesian information criterion (EBIC); it can be any constant between 0 and 1. Default is 0.25. See ‘Details’.

bic

Whether to compute the EBIC. Default is TRUE.

verbose

Whether to print out intermediate iterations for every nodewise regression. Default is FALSE.

eps

A numeric scalar >= 0 giving the tolerance for distinguishing zero from non-zero edges: entries smaller than eps are set to 0.

Details

The function BMN.logistic fits p \ell_1-regularized logistic regressions to the data X to recover the partial correlation matrix of the binary Markov network. Internally, the function glmnet is called for each node j=1,…,p, using the j-th column of X as the response and all remaining variables as predictors to estimate the neighborhood of node j. The j-th component of lambda is used as the penalization parameter for the j-th logistic regression. Finally, the results of the p regressions are aggregated to form the symmetric partial correlation matrix.
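As an illustrative sketch of this nodewise scheme (not the package's internal code), the core loop might look as follows, where X is assumed to be a 0/1 data matrix and lam a length-p vector of penalties; the symmetrization rule shown (averaging the coefficient matrix with its transpose) is an assumption.

library(glmnet)

# Sketch of nodewise neighborhood selection (illustration only,
# not the internal implementation of BMN.logistic)
nodewise_sketch <- function(X, lam) {
  p <- ncol(X)
  B <- matrix(0, p, p)
  for (j in seq_len(p)) {
    # j-th column as binary response, remaining columns as predictors;
    # the j-th entry of lam penalizes this regression
    fit_j <- glmnet(X[, -j], X[, j], family = "binomial", lambda = lam[j])
    B[j, -j] <- as.numeric(coef(fit_j))[-1]   # drop the intercept
  }
  (B + t(B)) / 2   # symmetrize the p regressions (averaging is one convention)
}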

Model selection for each of the p regressions in BMN.logistic is done by minimizing the EBIC (Barber and Drton, 2015), where the additional parameter gamma encodes a prior belief on the set of candidate models. For details, please refer to Barber and Drton (2015) and Zak-Szatkowska and Bogdan (2011).
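For a single nodewise logistic regression with df selected predictors, the EBIC typically takes the form -2*loglik + df*log(n) + 2*gamma*df*log(p-1); the exact penalty used inside BMN.logistic may differ. A hedged sketch for one fitted glmnet object fit_j with binary response y and predictor matrix X_rest:

# Hedged sketch of an EBIC evaluation for one nodewise logistic fit;
# the exact form used inside BMN.logistic may differ
ebic_sketch <- function(y, X_rest, fit_j, gamma = 0.25) {
  n   <- length(y)
  p   <- ncol(X_rest) + 1
  eta <- as.numeric(predict(fit_j, newx = X_rest))   # linear predictor
  loglik <- sum(y * eta - log(1 + exp(eta)))         # Bernoulli log-likelihood
  df  <- sum(as.numeric(coef(fit_j))[-1] != 0)       # number of selected predictors
  -2 * loglik + df * log(n) + 2 * gamma * df * log(p - 1)
}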

Value

theta

The partial correlation matrix (p x p) of the binary Markov network.

adj

The adjacency matrix (p x p) of the binary Markov network.

EBIC

The extended Bayesian information criterion (EBIC) values (1 x p) used for model selection.

lambda

The tuning parameter used (1 x p).

Author(s)

Jing Ma

References

Ravikumar, P., Wainwright, M. J., & Lafferty, J. D. (2010). High-dimensional Ising model selection using l1-regularized logistic regression. The Annals of Statistics, 38(3), 1287-1319.

Zak-Szatkowska, M., & Bogdan, M. (2011). Modified versions of the Bayesian information criterion for sparse generalized linear models. Computational Statistics & Data Analysis, 55(11), 2908-2924.

Barber, R. F., & Drton, M. (2015). High-dimensional Ising model selection with Bayesian information criteria. Electronic Journal of Statistics, 9(1), 567-607.

See Also

BMN::BMNPseudo, glmnet

Examples

library(glmnet)
library(igraph)   # for sample_pa() and as_adjacency_matrix()
library(gdata)    # for upperTriangle()

set.seed(1)

p = 50    # number of variables
n = 100   # number of observations per replicate
n0 = 1000 # number of burn-in iterations
rho_high = 0.5  # signal strength 
rho_low = 0.1   # signal strength 
eps = 8/n       # tolerance for variables with extremely unbalanced proportions
q = (p*(p - 1))/2

##---(1) Generate the network  
g_sf = sample_pa(p, directed=FALSE)
Amat = as.matrix(as_adjacency_matrix(g_sf, type="both"))

##---(2) Generate the Theta  
weights = matrix(0, p, p)
upperTriangle(weights) = runif(q, rho_low, rho_high) * (2*rbinom(q, 1, 0.5) - 1)
weights = weights + t(weights)
Theta = weights * Amat
dat = BMN.samples(Theta, n, n0, skip=1)
tmp = sapply(1:p, function(i) as.numeric(table(dat[,i]))[1]/n )
## resample until no variable has an extremely unbalanced category proportion
while(min(tmp) < eps || (1 - max(tmp)) < eps){
  dat = BMN.samples(Theta, n, n0, skip=10)
  tmp = sapply(1:p, function(i) as.numeric(table(dat[,i]))[1]/n )
}

lambda = rep(0.1, p)
fit = BMN.logistic(dat, lambda)
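
## The returned components documented under 'Value' can then be inspected
## directly; the edge counts below assume adj is a symmetric 0/1 matrix.
fit$lambda             # tuning parameters used, one per node
fit$EBIC               # EBIC values for the p nodewise regressions
sum(fit$adj)/2         # number of estimated edges
sum(fit$adj * Amat)/2  # estimated edges also present in the true graph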
