Description:

This function estimates the partial correlation matrix of a binary Markov network using the nodewise logistic regression approach proposed by Ravikumar et al. (2010).
Usage:

BMN.logistic(X, lambda, gamma = 0.25, bic = TRUE, verbose = FALSE, eps = 1e-08)
Arguments:

X        The n x p data matrix.

lambda   A vector of tuning parameters, one per nodewise regression; its length must equal p.

gamma    A tuning parameter required in evaluating the extended Bayesian information criterion (EBIC), which can be any constant between 0 and 1. Default is 0.25. See 'Details'.

bic      Whether to compute the EBIC. Default is TRUE.

verbose  Whether to print intermediate iterations for each nodewise regression. Default is FALSE.

eps      A numeric scalar >= 0 giving the tolerance for distinguishing zero from non-zero edges: entries with absolute value below eps are set to zero. Default is 1e-08.
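The role of eps can be illustrated directly (toy numbers chosen for illustration, not package internals):

```r
# A small "estimated" matrix with two near-zero entries
theta <- matrix(c(0, 0.3, 1e-10,
                  0.3, 0, -0.2,
                  1e-10, -0.2, 0), 3, 3)
# Entries with absolute value below eps are treated as absent edges
adj <- (abs(theta) > 1e-08) * 1
adj
```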
Details:

The function BMN.logistic fits p ℓ1-regularized logistic regressions to the data X to recover the partial correlation matrix of the binary Markov network. Internally, glmnet is called for each node j = 1, ..., p, using the j-th column of X as the response and all remaining variables as predictors, to estimate the neighborhood of node j. The j-th component of lambda is used as the penalization parameter for the j-th logistic regression. Finally, the results of the p regressions are aggregated to obtain the symmetric partial correlation matrix.
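The nodewise step can be sketched with glmnet alone. This is a toy illustration, not the package's internal code: the data matrix is simulated, and the averaging symmetrization at the end is one simple rule (Ravikumar et al. also discuss AND/OR aggregation, and the package may differ):

```r
library(glmnet)

set.seed(1)
n <- 100; p <- 10
X <- matrix(rbinom(n * p, 1, 0.5), n, p)   # toy binary data
lambda <- rep(0.1, p)

B <- matrix(0, p, p)                       # raw coefficient matrix
for (j in 1:p) {
  # l1-penalized logistic regression of node j on all remaining nodes
  fit_j <- glmnet(X[, -j], X[, j], family = "binomial", lambda = lambda[j])
  B[j, -j] <- as.numeric(fit_j$beta)
}
Theta_hat <- (B + t(B)) / 2                # one simple way to symmetrize
adj_hat <- (abs(Theta_hat) > 1e-08) * 1    # threshold small entries at eps
```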
Model selection for each of the p regressions in BMN.logistic is done by minimizing the EBIC (Barber and Drton, 2015), where the additional parameter gamma reflects a prior belief on the set of candidate models. For details, please refer to Barber and Drton (2015) and Zak-Szatkowska and Bogdan (2011).
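Concretely, for the j-th regression with selected neighborhood S_j, the EBIC takes the generic form below (a sketch of the criterion family in Barber and Drton, 2015; see the paper for the exact variant used):

    EBIC_gamma(S_j) = -2 * loglik_j(S_j) + |S_j| * log(n) + 2 * gamma * |S_j| * log(p - 1)

where loglik_j is the maximized logistic log-likelihood of the j-th regression, |S_j| is the number of selected neighbors, and n is the sample size; gamma = 0 recovers the ordinary BIC, while larger gamma penalizes dense neighborhoods more heavily.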
Value:

theta    The partial correlation matrix (p x p) of the binary Markov network.

adj      The adjacency matrix (p x p) of the binary Markov network.

EBIC     The extended Bayesian information criterion (EBIC) values (1 x p) used for model selection.

lambda   The tuning parameters used (1 x p).
Author(s):

Jing Ma
References:

Ravikumar, P., Wainwright, M. J., & Lafferty, J. D. (2010). High-dimensional Ising model selection using l1-regularized logistic regression. The Annals of Statistics, 38(3), 1287-1319.

Zak-Szatkowska, M., & Bogdan, M. (2011). Modified versions of the Bayesian information criterion for sparse generalized linear models. Computational Statistics & Data Analysis, 55(11), 2908-2924.

Barber, R. F., & Drton, M. (2015). High-dimensional Ising model selection with Bayesian information criteria. Electronic Journal of Statistics, 9(1), 567-607.
Examples:

library(glmnet)
library(igraph)  # for sample_pa() and as_adjacency_matrix()
library(gdata)   # for upperTriangle()
# BMN.samples() and BMN.logistic() are provided by this package
set.seed(1)
p = 50 # number of variables
n = 100 # number of observations per replicate
n0 = 1000 # burn in tolerance
rho_high = 0.5 # signal strength
rho_low = 0.1 # signal strength
eps = 8/n # tolerance for columns with extremely unbalanced 0/1 proportions
q = (p*(p - 1))/2
##---(1) Generate the network
g_sf = sample_pa(p, directed=FALSE)
Amat = as.matrix(as_adjacency_matrix(g_sf, type="both"))
##---(2) Generate the Theta
weights = matrix(0, p, p)
upperTriangle(weights) = runif(q, rho_low, rho_high) * (2*rbinom(q, 1, 0.5) - 1)
weights = weights + t(weights)
Theta = weights * Amat
dat = BMN.samples(Theta, n, n0, skip=1)
tmp = sapply(1:p, function(i) as.numeric(table(dat[,i]))[1]/n )
while(min(tmp) < eps || max(tmp) > 1 - eps){
dat = BMN.samples(Theta, n, n0, skip=10)
tmp = sapply(1:p, function(i) as.numeric(table(dat[,i]))[1]/n )
}
lambda = rep(0.1, p)
fit = BMN.logistic(dat, lambda)