BMN.logistic: Estimation of a binary Markov network using nodewise logistic regression


View source: R/BMN.logistic.R

Description

This function aims to estimate the partial correlation matrix associated with a binary Markov network using the nodewise logistic regression approach proposed by Ravikumar et al. (2010).

Usage

BMN.logistic(X, lambda, gamma = 0.25, bic = TRUE, verbose = FALSE, eps = 1e-08)

Arguments

X

The n x p data matrix.

lambda

A vector of tuning parameters. The length of lambda should be p.

gamma

A tuning parameter required for evaluating the extended Bayesian information criterion (EBIC); it can be any constant between 0 and 1. Default is 0.25. See ‘Details’.

bic

Whether to compute the EBIC. Default is TRUE.

verbose

Whether to print out intermediate iterations for every nodewise regression. Default is FALSE.

eps

A numeric scalar >= 0 giving the tolerance for distinguishing zero from non-zero edges: entries smaller than eps are set to 0.

Details

The function BMN.logistic fits p \ell_1-regularized logistic regressions to the data X to recover the partial correlation matrix of the binary Markov network. Internally, the function glmnet is called for each node j=1,…,p, using the j-th column of X as the response and all remaining variables as predictors to estimate the neighborhood of node j. The j-th component of lambda is used as the penalization parameter for the j-th logistic regression. Finally, the results of the p regressions are aggregated to form the symmetric partial correlation matrix.
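As an illustrative sketch of this nodewise scheme (not the package's internal code), the core loop might look as follows, where X is assumed to be a 0/1 data matrix and lam a length-p vector of penalties; the symmetrization rule shown (averaging the coefficient matrix with its transpose) is an assumption.

library(glmnet)

# Sketch of nodewise neighborhood selection (illustration only,
# not the internal implementation of BMN.logistic)
nodewise_sketch <- function(X, lam) {
  p <- ncol(X)
  B <- matrix(0, p, p)
  for (j in seq_len(p)) {
    # j-th column as binary response, remaining columns as predictors;
    # the j-th entry of lam penalizes this regression
    fit_j <- glmnet(X[, -j], X[, j], family = "binomial", lambda = lam[j])
    B[j, -j] <- as.numeric(coef(fit_j))[-1]   # drop the intercept
  }
  (B + t(B)) / 2   # symmetrize the p regressions (averaging is one convention)
}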

Model selection for each of the p regressions in BMN.logistic is done by minimizing the EBIC (Barber and Drton, 2015), where the additional parameter gamma encodes a prior belief on the set of candidate models. For details, please refer to Barber and Drton (2015) and Zak-Szatkowska and Bogdan (2011).
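For a single nodewise logistic regression with df selected predictors, the EBIC typically takes the form -2*loglik + df*log(n) + 2*gamma*df*log(p-1); the exact penalty used inside BMN.logistic may differ. A hedged sketch for one fitted glmnet object fit_j with binary response y and predictor matrix X_rest:

# Hedged sketch of an EBIC evaluation for one nodewise logistic fit;
# the exact form used inside BMN.logistic may differ
ebic_sketch <- function(y, X_rest, fit_j, gamma = 0.25) {
  n   <- length(y)
  p   <- ncol(X_rest) + 1
  eta <- as.numeric(predict(fit_j, newx = X_rest))   # linear predictor
  loglik <- sum(y * eta - log(1 + exp(eta)))         # Bernoulli log-likelihood
  df  <- sum(as.numeric(coef(fit_j))[-1] != 0)       # number of selected predictors
  -2 * loglik + df * log(n) + 2 * gamma * df * log(p - 1)
}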

Value

theta

The partial correlation matrix (p x p) of the binary Markov network.

adj

The adjacency matrix (p x p) of the binary Markov network.

EBIC

The extended Bayesian information criterion (EBIC) values (1 x p) used for model selection.

lambda

The tuning parameter used (1 x p).

Author(s)

Jing Ma

References

Ravikumar, P., Wainwright, M. J., & Lafferty, J. D. (2010). High-dimensional Ising model selection using l1-regularized logistic regression. The Annals of Statistics, 38(3), 1287-1319.

Zak-Szatkowska, M., & Bogdan, M. (2011). Modified versions of the Bayesian information criterion for sparse generalized linear models. Computational Statistics & Data Analysis, 55(11), 2908-2924.

Barber, R. F., & Drton, M. (2015). High-dimensional Ising model selection with Bayesian information criteria. Electronic Journal of Statistics, 9(1), 567-607.

See Also

BMN::BMNPseudo, glmnet

Examples

library(glmnet)
library(igraph)   # for sample_pa() and as_adjacency_matrix()
library(gdata)    # for upperTriangle()

set.seed(1)

p = 50    # number of variables
n = 100   # number of observations per replicate
n0 = 1000 # number of burn-in iterations
rho_high = 0.5  # signal strength 
rho_low = 0.1   # signal strength 
eps = 8/n       # tolerance for variables with extremely unbalanced proportions
q = (p*(p - 1))/2

##---(1) Generate the network  
g_sf = sample_pa(p, directed=FALSE)
Amat = as.matrix(as_adjacency_matrix(g_sf, type="both"))

##---(2) Generate the Theta  
weights = matrix(0, p, p)
upperTriangle(weights) = runif(q, rho_low, rho_high) * (2*rbinom(q, 1, 0.5) - 1)
weights = weights + t(weights)
Theta = weights * Amat
dat = BMN.samples(Theta, n, n0, skip=1)
tmp = sapply(1:p, function(i) as.numeric(table(dat[,i]))[1]/n )
## resample until no variable has an extremely unbalanced category proportion
while(min(tmp) < eps || (1 - max(tmp)) < eps){
  dat = BMN.samples(Theta, n, n0, skip=10)
  tmp = sapply(1:p, function(i) as.numeric(table(dat[,i]))[1]/n )
}

lambda = rep(0.1, p)
fit = BMN.logistic(dat, lambda)
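
## The returned components documented under 'Value' can then be inspected
## directly; the edge counts below assume adj is a symmetric 0/1 matrix.
fit$lambda             # tuning parameters used, one per node
fit$EBIC               # EBIC values for the p nodewise regressions
sum(fit$adj)/2         # number of estimated edges
sum(fit$adj * Amat)/2  # estimated edges also present in the true graph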
