Description Usage Arguments Details Value References See Also Examples

View source: R/covglasso.R View source: R/RcppExports.R

Direct estimation of a sparse covariance matrix using the covariance graphical lasso.

1 2 3 4 5 6 7 8 9 10 11 12 |

`data` |
A numerical dataframe or matrix, where rows correspond to observations and columns to variables. If |

`S` |
The sample covariance matrix of the data. If |

`n` |
The number of observations. If |

`lambda` |
A vector or array of non-negative lasso regularization parameters. Penalization is applied elementwise to all entries of the covariance matrix. If an array, each entry must be a matrix with same dimensions of the sample covariance matrix. Values should be increasing from the smallest to the largest. If |

`rho` |
A vector of correlation values used to define the penalization in terms of the thresholded sample correlation matrix. See "Details". Note that this penalization is used by default. |

`duplicated` |
Remove duplicated penalty matrices when the default penalty term based on the thresholded correlation matrix is used. Suggest to leave this argument to |

`L` |
The number of |

`crit` |
The model selection criterion employed to select the optimal covariance graph model. Can be |

`gamma` |
A penalty parameter used when |

`penalize.diag` |
A logical argument indicating if the diagonal of the covariance matrix should be penalized. Default to |

`start` |
A starting matrix for the estimation algorithm. If |

`ctrl` |
A list of control parameters for the coordinate descent algorithm employed for estimation. See also |

`path` |
A logical argument controlling whether all the estimated covariance matrices along the path defined by |

The function estimates a sparse covariance matrix using a fast coordinate descent algorithm to solve the covariance graphical lasso. The estimated sparse covariance matrix is obtained by optimizing the following penalized log-likelihood:

*-n/2*( logdet(Sigma) + trace(S %*% Sigma^-1) ) - ||lambda*Sigma||_1*

subject to *Sigma* being positive definite. In the penalty term, the *L1* norm and the matrix multiplication between *lambda* and *Sigma* is elementwise.

By default (when `lambda = NULL`

), the penalization matrix *lambda* is defined in terms of a sequential thresholding of the sample correlation matrix. Given *rho[l]* a threshold value and *R* the sample correlation matrix, the penalty term matrix *lambda* is defined by the values *ifelse(r[i,j] < rho[l], 1/s[i,j], 0)*, that is:

*lambda = ifelse(R < rho[l], 1/S, 0)*

where the inequality is taken elementwise. Such choice of penalty matrix provides a framework related to the adaptive lasso of Fan et al. (2009) and the method of Chaudhuri et al. (2007).
If the vector `rho`

is not given in input, the sequence of threshold values is defined as the `L`

quantiles of the absolute values of the sample correlations in *R*. If `lambda`

is provided in input, the penalization corresponds to the standard covariance graphical lasso of Bien, Tibshirani (2011).

The sparse covariance matrix corresponds to a Gaussian covariance graphical model of marginal independence, where in the sparse covariance matrix a zero entry corresponds to two variables being marginally independent. Different penalizations `lambda`

imply different models, and selection of the optimal graphical model is performed using `"bic"`

(default) or `"ebic"`

. In the latter case, the argument `gamma`

controls the additional penalty term in the model selection criterion; see Foygel, Drton, (2010).

A list containing the following elements.

`sigma` |
The estimated covariance matrix. |

`omega` |
The estimated concentration (inverse covariance) matrix. |

`graph` |
The adjacency matrix given in input corresponding to the marginal or conditional independence graph. |

`loglik` |
Value of the maximized log-likelihood. |

`npar` |
Number of estimated non-zero parameters. |

`penalty` |
Value of the penalty term. |

`bic` |
Optimal BIC or EBIC value. |

`BIC` |
All BIC or EBIC values along the path defined by |

`path` |
A list containing all the estimated sparse covariance models. Provided in output only when |

`rho` |
The values of |

`lambda` |
The values of |

Bien, J., Tibshirani, R.J. (2011). Sparse estimation of a covariance matrix. *Biometrika*, 98(4), 807–820.

Chaudhuri, S., Drton M., Richardson, T. S. (2007). Estimation of a covariance matrix with zeros. *Biometrika*, 94(1), 199-216.

Fan, J., Feng, Y., Wu, Y. (2009). Network exploration via the adaptive lasso and scad penalties. *The Annals of Applied Statistics*, 3(2), 521.

Foygel, R., Drton, M. (2010). Extended Bayesian information criteria for Gaussian graphical models. In *Advances in neural information processing systems*, pages 604–612.

Wang, H. (2014). Coordinate descent algorithm for covariance graphical lasso. *Statistics and Computing*, 24:521.

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 | ```
# a simple example with a 3-block diagonal matrix
library(MASS)
p <- 3
n <- 300
sig <- matrix(0.8, p,p)
diag(sig) <- 1
set.seed(190188)
tmp <- replicate( 3, mvrnorm(n, rep(0,p), sig) )
x <- matrix(c(tmp), n, p*3)
fit1 <- covglasso(x)
plot(fit1$rho, fit1$BIC)
image(fit1$sigma != 0)
# refine search
fit2 <- covglasso(x, rho = seq(0.1, 0.4, length = 50))
image(fit2$sigma != 0)
fit1$bic
fit2$bic
# Cars93 data in MASS package
data("Cars93", package = "MASS")
dat <- na.omit( Cars93[,c(4:8,12:15,17,19:25)] )
fit1 <- covglasso(dat, L = 50)
# more sparse
fit2 <- covglasso(dat, L = 50,
crit = "ebic", gamma = 1)
oldpar <- par(no.readonly = TRUE)
par(mfrow = c(1,2))
plot(fit1$rho, fit1$BIC, main = "BIC")
plot(fit2$rho, fit2$BIC, main = "EBIC")
image(fit1$sigma != 0, col = c("white", "black"), main = "BIC")
image(fit2$sigma != 0, col = c("white", "black"), main = "EBIC")
par(oldpar) # reset par
``` |

Embedding an R snippet on your website

Add the following code to your website.

For more information on customizing the embed code, read Embedding Snippets.