netEst.undir | R Documentation |
Estimates a sparse inverse covariance matrix using a lasso (L1) penalty.
netEst.undir(x, zero = NULL, one = NULL, lambda, rho=NULL,
penalize_diag = TRUE, weight = NULL,
eta = 0, verbose = FALSE, eps = 1e-08)
x |
The |
zero |
(Optional) indices of entries of the weighted adjacency matrix to be constrained to be zero. The input should be a matrix of |
one |
(Optional) indices of entries of the weighted adjacency matrix to be kept regardless of the regularization parameter for lasso. The input is similar to that of |
lambda |
(Non-negative) numeric vector representing the regularization parameters for lasso. Can choose best based on BIC using |
rho |
(Non-negative) numeric scalar or symmetric |
penalize_diag |
Logical. Whether or not to penalize diagonal entries when estimating weighted adjacency matrix. If TRUE a small penalty is used, otherwise no penalty is used. |
weight |
(Optional) whether to add penalty to known edges. If NULL (default), then the known edges are assumed to be true. If nonzero, then a penalty equal to lambda * weight is added to penalize the known edges to account for possible uncertainty. Only non-negative values are accepted for the weight parameter. |
eta |
(Non-negative) a small constant added to the diagonal of the empirical covariance matrix of |
verbose |
Whether to print out information as estimation proceeds. Default = |
eps |
(Non-negative) numeric scalar indicating the tolerance level for differentiating zero and non-zero edges: entries with magnitude |
The function netEst.undir
performs constrained estimation of sparse inverse covariance (concentration) matrices using a lasso (L1) penalty, as described in Ma, Shojaie and Michailidis (2016). Two sets of constraints determine subsets of entries of the inverse covariance matrix that should be exactly zero (the option zero
argument), or should take non-zero values (option one
argument). The remaining entries will be estimated from data.
The arguments one
and/or zero
can come from external knowledge on the 0-1 structure of underlying concentration matrix, such as a list of edges and/or non-edges learned from available databases.
netEst.undir
estimates both the support (0-1 structure) of the concentration matrix, or equivalently, the adjacency matrix of the corresponding Gaussian graphical model, for a given tuning parameter, lambda
; and the concentration matrix with diagonal entries set to 0, or equivalently, the weighted adjacency matrix.
The weighted adjacency matrix is estimated using maximum likelihood based on the estimated support. The parameter rho
controls the amount of regularization used in the maximum likelihood step. A small rho
is recommended, as a large value of rho
may result in too much regularization in the maximum likelihood estimation, thus further penalizing the support of the weighted adjacency matrix.
Note this function is suitable only for estimating the adjacency matrix of a undirected graph. The weight
parameter allows one to specify whether to penalize the known edges. If known edges obtained from external information contain uncertainty such that some of them are spurious, then it is recommended to use a small positive weight
parameter to select the most probable edges from the collection of known ones.
This function is closely related to NetGSA
, which requires the weighted adjacency matrix as input. When the user does not have complete information on the weighted adjacency matrix, but has data (x
, not necessarily the same as the x
in NetGSA
) and external information (one
and/or zero
) on the adjacency matrix, then netEst.undir
can be used to estimate the remaining interactions in the adjacency matrix using the data.
Further, when it is anticipated that the adjacency matrices under different conditions are different, and data from different conditions are available, the user needs to run netEst.undir
separately to obtain estimates of the adjacency matrices under each condition.
The algorithm used in netEst.undir
is based on glmnet
and glasso
. Please refer to glmnet
and glasso
for computational details.
A list with components
Adj |
List of weighted adjacency matrices (partial correlations) of dimension |
invcov |
List of estimated inverse covariance matrix of dimension |
lambda |
List of values of tuning parameters used. |
Jing Ma & Michael Hellstern
Ma, J., Shojaie, A. & Michailidis, G. (2016) Network-based pathway enrichment analysis with incomplete network information. Bioinformatics 32(20):165–3174. \Sexpr[results=rd]{tools:::Rd_expr_doi("10.1093/bioinformatics/btw410")}
prepareAdjMat
, bic.netEst.undir
, glmnet
library(glassoFast)
library(graphite)
library(igraph)
set.seed(1)
## load the data
data(breastcancer2012_subset)
## consider genes from the "Estrogen signaling pathway" and "Jak-STAT signaling pathway"
genenames <- unique(c(pathways[[25]], pathways[[52]]))
sx <- x[match(genenames, rownames(x)),]
if (sum(is.na(rownames(sx)))>0){
sx <- sx[-which(is.na(rownames(sx))),]
}
p <- length(genenames)
## zero/one matrices should be based on known non-edges/known edges. Random used as an example
one <- matrix(sample(c(0,1), length(rownames(sx))**2,
replace = TRUE, prob = c(0.9, 0.1)), length(rownames(sx)),
dimnames = list(rownames(sx), rownames(sx)))
ncond <- length(unique(group))
Amat <- vector("list",ncond)
for (k in 1:ncond){
data_c <- sx[,(group==k)]
fitBIC <- bic.netEst.undir(data_c,one=one,
lambda=seq(1,10)*sqrt(log(p)/ncol(data_c)),eta=0.1)
fit <- netEst.undir(data_c,one=one,
lambda=which.min(fitBIC$BIC)*sqrt(log(p)/ncol(data_c)),eta=0.1)
Amat[[k]] <- fit$Adj
}
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.