This vignette describes basic usage of the ggb
R package, which implements the methods introduced in Bien, J. (2016) "Graph-Guided Banding of the Covariance Matrix".
We begin by generating a covariance matrix that is banded with respect to a graph (as in Definition 1 of the paper).
We begin by generating the true covariance matrix using the function generate_gb_covariance
. We imagine that the variables lie on a $5\times 4$ lattice and that any pair of variables with 2 hops of each other is correlated.
library(ggb) g <- igraph::graph.lattice(c(5, 4)) p <- 5 * 4 b <- rep(1, p) Sig <- generate_gb_covariance(g, b)
We now generate zero-mean multivariate normal data having this covariance matrix:
set.seed(123) n <- 30 eig <- eigen(Sig) A <- diag(sqrt(eig$values)) %*% t(eig$vectors) x <- matrix(rnorm(n * p), n, p) %*% A S <- cov(x)
image_covariance(Sig, main = "True covariance") image_covariance(S, main = "Sample covariance")
The function ggb
can be used to fit the graph-guided banding estimators.
fit <- ggb(S, g, type = "global")
This computes the global GGB estimator along a grid of $\lambda$ values (given by fit$lambda
). By default, no eigenvalue constraint is applied; however, the argument delta
can be used to specify a lower bound on the eigenvalues. In this case, we see that all $\hat\Sigma_\lambda$ computed are positive definite:
min(unlist(lapply(fit$Sig, function(M) eigen(M, only.values = TRUE)$values)))
We can use plot
to look at the computed $\hat\Sigma_\lambda$:
plot(fit, subset = round(seq(1, length(fit$lambda), length = 9)))
To select $\lambda$, one can use cross validation. By default, Frobenius-norm error and five folds are used, but these can be changed (see ?cv_ggb
for details).
cv <- cv_ggb(x, fit, g) plot(cv) Sighat <- fit$Sig[[cv$ibest]]
We find that the $\lambda$ value with minimal cross validation error gives something close to the true covariance matrix:
image_covariance(Sig, main = "True covariance") image_covariance(Sighat, main = "GGB estimate (CV)")
In this case, if we use the one-standard-error rule, we get a diagonal matrix:
image_covariance(fit$Sig[[cv$i1se]], main = "GGB estimate")
We could instead fit the local GGB estimator to the data.
fit2 <- ggb(S, g, type = "local")
plot(fit2, subset = round(seq(1, length(fit$lambda), length = 9)))
cv2 <- cv_ggb(x, fit2, g) plot(cv2) Sighat2 <- fit2$Sig[[cv2$ibest]]
Since in truth the global structure holds, we find that the added flexibility is not very helpful in this example. Yet, it does appear that the lambda chosen by cross validation leads to a reasonable estimate of the covariance matrix. As in the previous case, we find that the one-standard-error rule selects a diagonal matrix:
image_covariance(Sig, main = "True covariance") image_covariance(Sighat2, main = "GGB estimate (CV)") image_covariance(fit2$Sig[[cv2$i1se]], main = "GGB estimate")
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.