In unix9999/insurance: Graph-Guided Banding of the Covariance Matrix

This vignette describes basic usage of the ggb R package, which implements the methods introduced in Bien, J. (2016) "Graph-Guided Banding of the Covariance Matrix".

We begin by generating a covariance matrix that is banded with respect to a graph (as in Definition 1 of the paper).

Generating some synthetic data

We begin by generating the true covariance matrix using the function generate_gb_covariance. We imagine that the variables lie on a $5\times 4$ lattice and that any pair of variables with 2 hops of each other is correlated.

library(ggb)
g <- igraph::graph.lattice(c(5, 4))
p <- 5 * 4
b <- rep(1, p)
Sig <- generate_gb_covariance(g, b)

We now generate zero-mean multivariate normal data having this covariance matrix:

set.seed(123)
n <- 30
eig <- eigen(Sig)
A <- diag(sqrt(eig$values)) %*% t(eig$vectors)
x <- matrix(rnorm(n * p), n, p) %*% A
S <- cov(x)

image_covariance(Sig, main = "True covariance")
image_covariance(S, main = "Sample covariance")

Applying the GGB methods

Global GGB method

The function ggb can be used to fit the graph-guided banding estimators.

fit <- ggb(S, g, type = "global")

This computes the global GGB estimator along a grid of $\lambda$ values (given by fit$lambda). By default, no eigenvalue constraint is applied; however, the argument delta can be used to specify a lower bound on the eigenvalues. In this case, we see that all $\hat\Sigma_\lambda$ computed are positive definite:

min(unlist(lapply(fit$Sig, function(M) eigen(M, only.values = TRUE)$values)))

We can use plot to look at the computed $\hat\Sigma_\lambda$:

plot(fit, subset = round(seq(1, length(fit$lambda), length = 9)))

To select $\lambda$, one can use cross validation. By default, Frobenius-norm error and five folds are used, but these can be changed (see ?cv_ggb for details).

cv <- cv_ggb(x, fit, g)
plot(cv)
Sighat <- fit$Sig[[cv$ibest]]

We find that the $\lambda$ value with minimal cross validation error gives something close to the true covariance matrix:

image_covariance(Sig, main = "True covariance")
image_covariance(Sighat, main = "GGB estimate (CV)")

In this case, if we use the one-standard-error rule, we get a diagonal matrix:

image_covariance(fit$Sig[[cv$i1se]], main = "GGB estimate")

Local GGB method

We could instead fit the local GGB estimator to the data.

fit2 <- ggb(S, g, type = "local")

plot(fit2, subset = round(seq(1, length(fit$lambda), length = 9)))

cv2 <- cv_ggb(x, fit2, g)
plot(cv2)
Sighat2 <- fit2$Sig[[cv2$ibest]]

Since in truth the global structure holds, we find that the added flexibility is not very helpful in this example. Yet, it does appear that the lambda chosen by cross validation leads to a reasonable estimate of the covariance matrix. As in the previous case, we find that the one-standard-error rule selects a diagonal matrix:

image_covariance(Sig, main = "True covariance")
image_covariance(Sighat2, main = "GGB estimate (CV)")
image_covariance(fit2$Sig[[cv2$i1se]], main = "GGB estimate")