This repository provides an implementation of the MGLasso (Multiscale Graphical Lasso) algorithm: an approach for estimating sparse Gaussian graphical models with the addition of a group-fused Lasso penalty. MGLasso is described in the paper *Inference of Multiscale Gaussian Graphical Model*. Its main contributions are:
- We simultaneously infer a network and estimate a clustering structure by combining the neighborhood selection approach (Meinshausen and Bühlmann, 2006) with convex clustering (Hocking et al., 2011).
- We use a continuation algorithm with Nesterov smoothing in a shrinkage-thresholding scheme (CONESTA, Hadj-Selem et al., 2018) to solve the optimization problem.
To solve the MGLasso problem, we seek the regression vectors $\boldsymbol{\beta}^i$ that minimize

$$J_{\lambda_1, \lambda_2}(\boldsymbol{\beta}; \mathbf{X}) = \frac{1}{2} \sum_{i=1}^p \left\lVert \mathbf{X}^i - \mathbf{X}^{\setminus i} \boldsymbol{\beta}^i \right\rVert_2^2 + \lambda_1 \sum_{i=1}^p \left\lVert \boldsymbol{\beta}^i \right\rVert_1 + \lambda_2 \sum_{i < j} \left\lVert \boldsymbol{\beta}^i - \tau_{ij}(\boldsymbol{\beta}^j) \right\rVert_2.$$
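For intuition, the objective can be sketched directly in R. This is an illustrative re-implementation, not the package's internal solver; in particular, the swap operator used for $\tau_{ij}$ below is an assumption based on the paper's definition (permuting the $i$-th and $j$-th entries of $\boldsymbol{\beta}^j$ so it can be compared with $\boldsymbol{\beta}^i$).

```r
# Illustrative MGLasso objective (assumption: beta is a p x p matrix whose
# i-th row stores beta^i, with a structural zero at position [i, i]).
mglasso_objective <- function(X, beta, lambda1, lambda2) {
  p <- ncol(X)
  fit <- 0
  for (i in 1:p) {
    residual <- X[, i] - X[, -i, drop = FALSE] %*% beta[i, -i]
    fit <- fit + 0.5 * sum(residual^2)
  }
  l1 <- lambda1 * sum(abs(beta))
  fuse <- 0
  for (i in 1:(p - 1)) {
    for (j in (i + 1):p) {
      bj <- beta[j, ]
      bj[c(i, j)] <- bj[c(j, i)]  # tau_ij: swap entries i and j (assumed form)
      fuse <- fuse + sqrt(sum((beta[i, ] - bj)^2))
    }
  }
  fit + l1 + lambda2 * fuse
}
```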
The MGLasso package is based on the Python implementation of the CONESTA solver available in the pylearn-parsimony library.
Install the reticulate package, and Miniconda if no conda distribution is available on the OS:

```r
install.packages("reticulate")
reticulate::install_miniconda()
```
Install MGLasso and its Python dependencies, and configure the conda environment rmglasso:

```r
# install.packages('mglasso')
remotes::install_github("desanou/mglasso")
library(mglasso)
install_pylearn_parsimony(envname = "rmglasso", method = "conda")
reticulate::use_condaenv("rmglasso", required = TRUE)
reticulate::py_config()
```
The conesta_solver module is delay-loaded. See reticulate::import_from_path for details.
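Before running the solver, it can help to verify that the Python side is reachable. The check below uses standard reticulate functions; the module name `parsimony` is an assumption based on the pylearn-parsimony package layout.

```r
# Sanity check: ensure the conda env is active and pylearn-parsimony importable.
reticulate::use_condaenv("rmglasso", required = TRUE)
if (!reticulate::py_module_available("parsimony")) {
  stop("pylearn-parsimony not found; run install_pylearn_parsimony() first.")
}
```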
An example of use is given below.
We simulate a $3$-block diagonal model where each block contains $3$ variables. The intra-block correlation level is set to $0.85$ while the correlations outside the blocks are kept at $0$.
```r
library(Matrix)
n <- 50
K <- 3
p <- 9
rho <- 0.85

blocs <- list()
for (j in 1:K) {
  bloc <- matrix(rho, nrow = p/K, ncol = p/K)
  diag(bloc) <- 1
  blocs[[j]] <- bloc
}
mat.correlation <- Matrix::bdiag(blocs)

corrplot::corrplot(as.matrix(mat.correlation), method = "color", tl.col = "black")
```
```r
set.seed(11)
X <- mvtnorm::rmvnorm(n, mean = rep(0, p), sigma = as.matrix(mat.correlation))
colnames(X) <- LETTERS[1:9]
```
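As a quick sanity check on the simulated data, the empirical correlations within a block should be close to the target $0.85$ (the object `X` and the block size are taken from the simulation above):

```r
# Empirical correlation of the first block's variables (A, B, C).
round(cor(X[, 1:3]), 2)
```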
We run mglasso(). We set the sparsity level $\lambda_1$ to $0.2$ and rescale it with the sample size.
```r
X <- scale(X)
res <- mglasso(X, lambda1 = 0.2 * n, lambda2_start = 0.1, fuse_thresh = 1e-3, verbose = FALSE)
```
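The fitted object can be inspected with `str()`; the exact fields depend on the package version, so this is only a generic look rather than a documented accessor:

```r
# Inspect the top-level structure of the mglasso fit.
str(res, max.level = 2)
```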
To launch a single run of the objective-function minimization, call the conesta function.
```r
temp <- mglasso::conesta(X, lam1 = 0.2 * n, lam2 = 0.1)
```
We plot the clustering path of the mglasso method on the first two principal component axes of $X$. The path is drawn on the predicted $X$'s.
```r
library(ggplot2)
library(ggrepel)
mglasso:::plot_clusterpath(as.matrix(X), res)
```
As the fusion penalty increases from level9 to level1, we observe a progressive fusion of adjacent edges.
```r
plot_mglasso(res)
```
Sanou, Edmond; Ambroise, Christophe; Robin, Geneviève (2022). Inference of Multiscale Gaussian Graphical Model. arXiv preprint. https://doi.org/10.48550/arXiv.2202.05775