grouplasso (R Documentation)
Given a q-dimensional random vector \mathbf{X} = (\mathbf{X}_{1},...,\mathbf{X}_{k}), with \mathbf{X}_{i} a d_{i}-dimensional random vector, i.e., q = d_{1} + ... + d_{k}, this function computes the empirical penalized Gaussian copula covariance matrix, using the Gaussian log-likelihood plus a group lasso penalty as objective function, where the groups are the diagonal and off-diagonal blocks corresponding to the different random vectors. Model selection is performed by choosing the tuning parameter omega such that the BIC criterion is maximal.
grouplasso(Sigma, S, n, omegas, dim, step.size = 100, trace = 0)
Sigma: An initial guess for the covariance matrix (typically equal to S).

S: The sample matrix of normal scores covariances.

n: The sample size.

omegas: The grid of candidate values for the tuning parameter \omega_{n} in the penalty function.

dim: The vector of dimensions (d_{1},...,d_{k}).

step.size: The step size used in the generalized gradient descent; this affects the speed of the algorithm (default = 100).

trace: Controls how verbose output should be (default = 0, meaning no verbose output).
Given a covariance matrix
\boldsymbol{\Sigma} = \begin{pmatrix} \boldsymbol{\Sigma}_{11} & \boldsymbol{\Sigma}_{12} & \cdots & \boldsymbol{\Sigma}_{1k} \\
\boldsymbol{\Sigma}_{12}^{\text{T}} & \boldsymbol{\Sigma}_{22} & \cdots & \boldsymbol{\Sigma}_{2k} \\
\vdots & \vdots & \ddots & \vdots \\
\boldsymbol{\Sigma}_{1k}^{\text{T}} & \boldsymbol{\Sigma}_{2k}^{\text{T}} & \cdots & \boldsymbol{\Sigma}_{kk} \end{pmatrix},
the aim is to solve/compute
\widehat{\boldsymbol{\Sigma}}_{\text{GLT},n} \in \text{arg min}_{\boldsymbol{\Sigma} > 0} \left \{
\ln \left | \boldsymbol{\Sigma} \right | + \text{tr} \left (\boldsymbol{\Sigma}^{-1} \widehat{\boldsymbol{\Sigma}}_{n} \right ) + P_{\text{GLT}}\left (\boldsymbol{\Sigma},\omega_{n} \right ) \right \},
where the penalty function P_{\text{GLT}} is of group lasso-type:
P_{\text{GLT}} \left (\boldsymbol{\Sigma},\omega_{n} \right ) = 2 \sum_{i,j = 1, j > i}^{k} p_{\omega_{n}} \left (\sqrt{d_{i}d_{j}} \left | \left |\boldsymbol{\Sigma}_{ij} \right | \right |_{\text{F}} \right )
+ \sum_{i = 1}^{k} p_{\omega_{n}} \left (\sqrt{d_{i}(d_{i}-1)} \left | \left | \boldsymbol{\Delta}_{i} * \boldsymbol{\Sigma}_{ii} \right | \right |_{\text{F}} \right ),
for a certain penalty function p_{\omega_{n}} with penalty parameter \omega_{n}, and \boldsymbol{\Delta}_{i} \in \mathbb{R}^{d_{i} \times d_{i}} a matrix with ones as off-diagonal elements and zeroes on the diagonal (in order to avoid shrinking the variances); the operator * stands for elementwise multiplication.
For now, the only possibility in this function for p_{\omega_{n}} is the lasso penalty p_{\omega_{n}}(t) = \omega_{n} t.
For other penalties (e.g., SCAD), one can perform a local linear approximation to the penalty function and iteratively solve weighted group lasso optimizations (similar to what is done in the function covgpenal).
Regarding the implementation, we used the code available in the R package 'spcov' (see the manual for further explanations), but adapted it to the context of a group lasso penalty.
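To make the penalty concrete, the following minimal sketch (not part of the package; the helper name pen_GLT and the block-indexing convention are assumptions) evaluates P_{\text{GLT}}(\boldsymbol{\Sigma},\omega_{n}) with the lasso penalty p_{\omega_{n}}(t) = \omega_{n} t, for a covariance matrix whose consecutive blocks have sizes dim = c(d_1, ..., d_k):

# Hypothetical helper (illustration only): evaluate the group lasso-type
# penalty P_GLT(Sigma, omega) with the lasso penalty p_omega(t) = omega * t,
# for blocks of consecutive coordinates with sizes dim = c(d_1, ..., d_k)
pen_GLT = function(Sigma, omega, dim){
  k = length(dim)
  ends = cumsum(dim)           # last index of each block
  starts = ends - dim + 1      # first index of each block
  pen = 0
  for(i in 1:k){
    # Diagonal block: only the off-diagonal entries are penalized (Delta_i * Sigma_ii)
    Sii = Sigma[starts[i]:ends[i], starts[i]:ends[i], drop = FALSE]
    Delta_i = 1 - diag(dim[i]) # ones off the diagonal, zeroes on the diagonal
    pen = pen + omega * sqrt(dim[i] * (dim[i] - 1)) * norm(Delta_i * Sii, type = "F")
    # Off-diagonal blocks Sigma_ij with j > i (each counted twice in P_GLT)
    if(i < k){for(j in (i+1):k){
      Sij = Sigma[starts[i]:ends[i], starts[j]:ends[j], drop = FALSE]
      pen = pen + 2 * omega * sqrt(dim[i] * dim[j]) * norm(Sij, type = "F")
    }}
  }
  pen
}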
For tuning \omega_{n}, we maximize (over a grid of candidate values) the BIC criterion

\text{BIC} \left (\widehat{\boldsymbol{\Sigma}}_{\omega_{n}} \right ) = -n \left [\ln \left |\widehat{\boldsymbol{\Sigma}}_{\omega_{n}} \right | + \text{tr} \left (\widehat{\boldsymbol{\Sigma}}_{\omega_{n}}^{-1} \widehat{\boldsymbol{\Sigma}}_{n} \right ) \right ] - \ln(n) \text{df} \left (\widehat{\boldsymbol{\Sigma}}_{\omega_{n}} \right ),

where \widehat{\boldsymbol{\Sigma}}_{\omega_{n}} is the estimated candidate covariance matrix using \omega_{n}, and df (degrees of freedom) equals

\text{df} \left (\widehat{\boldsymbol{\Sigma}}_{\omega_{n}} \right ) = \sum_{i,j = 1, j > i}^{k} 1 \left (\left | \left | \widehat{\boldsymbol{\Sigma}}_{\omega_{n},ij} \right | \right |_{\text{F}} > 0 \right ) \left (1 + \frac{\left | \left | \widehat{\boldsymbol{\Sigma}}_{\omega_{n},ij} \right | \right |_{\text{F}}}{\left | \left | \widehat{\boldsymbol{\Sigma}}_{n,ij} \right | \right |_{\text{F}}} \left (d_{i}d_{j} - 1 \right ) \right ) + \sum_{i = 1}^{k} 1 \left (\left | \left |\boldsymbol{\Delta}_{i} * \widehat{\boldsymbol{\Sigma}}_{\omega_{n},ii} \right | \right |_{\text{F}} > 0 \right ) \left (1 + \frac{\left | \left |\boldsymbol{\Delta}_{i} * \widehat{\boldsymbol{\Sigma}}_{\omega_{n},ii} \right | \right |_{\text{F}}}{\left | \left |\boldsymbol{\Delta}_{i} * \widehat{\boldsymbol{\Sigma}}_{n,ii} \right | \right |_{\text{F}}} \left (\frac{d_{i} \left ( d_{i} - 1 \right )}{2} - 1 \right ) \right ) + q,

with \widehat{\boldsymbol{\Sigma}}_{\omega_{n},ij} the (i,j)'th block of \widehat{\boldsymbol{\Sigma}}_{\omega_{n}}, and similarly for \widehat{\boldsymbol{\Sigma}}_{n,ij}.
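The sketch below spells out this criterion for one candidate estimate (again a hypothetical helper, with the same block-indexing convention as in the previous sketch); the reported omega then corresponds to the candidate with the largest BIC value:

# Hypothetical helper (illustration only): BIC criterion for one candidate
# estimate Sigma_hat, given the sample matrix S of normal scores covariances,
# the sample size n and the dimension vector dim = c(d_1, ..., d_k)
bic_GLT = function(Sigma_hat, S, n, dim){
  k = length(dim); q = sum(dim)
  ends = cumsum(dim); starts = ends - dim + 1
  df = q
  for(i in 1:k){
    ii = starts[i]:ends[i]
    Delta_i = 1 - diag(dim[i])
    num = norm(Delta_i * Sigma_hat[ii, ii, drop = FALSE], type = "F")
    den = norm(Delta_i * S[ii, ii, drop = FALSE], type = "F")
    if(num > 0){df = df + 1 + (num / den) * (dim[i] * (dim[i] - 1) / 2 - 1)}
    if(i < k){for(j in (i+1):k){
      jj = starts[j]:ends[j]
      num_ij = norm(Sigma_hat[ii, jj, drop = FALSE], type = "F")
      den_ij = norm(S[ii, jj, drop = FALSE], type = "F")
      if(num_ij > 0){df = df + 1 + (num_ij / den_ij) * (dim[i] * dim[j] - 1)}
    }}
  }
  # BIC: -n * (ln|Sigma_hat| + tr(Sigma_hat^{-1} S)) - ln(n) * df
  -n * (log(det(Sigma_hat)) + sum(diag(solve(Sigma_hat) %*% S))) - log(n) * df
}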
A list with elements "est" containing the (group lasso) penalized matrix of sample normal scores rank correlations (output as provided by the function “spcov.R”), and "omega" containing the optimal tuning parameter.
De Keyser, S. & Gijbels, I. (2024). High-dimensional copula-based Wasserstein dependence. doi: https://doi.org/10.48550/arXiv.2404.07141.
Bien, J. & Tibshirani, R. (2022). spcov: sparse estimation of a covariance matrix, R package version 1.3. url: https://CRAN.R-project.org/package=spcov.
Bien, J. & Tibshirani, R. (2011). Sparse Estimation of a Covariance Matrix. Biometrika 98(4):807-820. doi: https://doi.org/10.1093/biomet/asr054.
covgpenal for (elementwise) lasso-type estimation of the normal scores rank correlation matrix.
q = 10
dim = c(5,5)
n = 100
# AR(1) correlation matrix with correlation 0.5
R = 0.5^(abs(matrix(1:q-1,nrow = q, ncol = q, byrow = TRUE) - (1:q-1)))
# Sparsity on off-diagonal blocks
R0 = createR0(R,dim)
# Sample from multivariate normal distribution
sample = mvtnorm::rmvnorm(n,rep(0,q),R0,method = "chol")
# Normal scores
scores = matrix(0,n,q)
for(j in 1:q){scores[,j] = qnorm((n/(n+1)) * ecdf(sample[,j])(sample[,j]))}
# Sample matrix of normal scores covariances
Sigma_est = cov(scores) * ((n-1)/n)
# Candidate tuning parameters
omega = seq(0.01, 0.6, length = 50)
# Group lasso estimation, with the tuning parameter selected by BIC
Sigma_est_penal = grouplasso(Sigma_est, Sigma_est, n, omega, dim)
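The fitted object can then be inspected; the element names follow the Value section above, and the exact structure of "est" mirrors the output of spcov.R:

# Inspect the fitted object (element names as described in the Value section)
Sigma_est_penal$omega   # optimal tuning parameter selected by BIC
Sigma_est_penal$est     # group lasso penalized estimate (spcov.R-style output)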