title: "Covariance Structures" output: rmarkdown::html_vignette bibliography: lmer.bib vignette: > %\VignetteIndexEntry{Covariance Structures} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} %\VignetteDepends{glmmTMB}
knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
The current version of lme4 offers four covariance classes/structures: Covariance.us (unstructured), Covariance.diag (diagonal), Covariance.cs (compound symmetry), and Covariance.ar1 (autoregressive order 1). The syntax for use is somewhat similar to the glmmTMB package (see covariance structures in glmmTMB), although the results are slightly different.
The first half of this vignette provide a more detailed explanation of how the
new machinery works, and the second half of the vignette will compare lme4
and glmmTMB implementations and results.
For exact details of this structure, refer to the lmer vignette [@Bates_JSS]. We provide
a quick summary in this vignette.
In matrix notation, a linear mixed model can be represented as: [ \mathbf{y} = \mathbf{X} \boldsymbol \beta + \mathbf Z \mathbf{b} + \boldsymbol{\epsilon} ] Where $\mathbf{b}$ represents an unknown vector of random effects. The \code{Covariance} class helps define the structure of the variance-covariance matrix as specified as $\textrm{Var}(\mathbf{b})$. Typically, we denote the variance-covariance matrix as $\mathbf \Sigma$.
First, we create the relative co-factor $\mathbf \Lambda_{\mathbf \theta}$ which is a
$q \times q$ block diagonal matrix that depends on the variance-component
parameter $\theta$. Let $\sigma$ be the scale parameter of the variance of a
linear mixed model. In lme4, the variance-covariance matrix is constructed by:
[
{\mathbf \Sigma}{\mathbf{\theta}} =
\sigma^2 {\mathbf \Lambda}{\mathbf \theta} {\mathbf \Lambda}_{\mathbf \theta}^{\top},
]
For generalized linear mixed models, $\mathbf \Lambda$ instead represents the unscaled Cholesky factor; that is, the scaling term $\sigma^2$ is omitted from the equation above.
The major difference between the four covariance classes (Covariance.us (unstructured), Covariance.diag (diagonal), Covariance.cs (compound symmetry), and Covariance.ar1 (autoregressive order 1)) is the construction of the the relative Cholesky factor $\mathbf \Lambda_{\mathbf \theta}$.
Suppose there are $p$ number of random effect terms for a particular
grouping variable. The unstructured covariance, which is the default
in lme4, of size $p \times p$ has the following form:
[
\mathbf{\Sigma}
= \begin{bmatrix}
\sigma^{2}{1} & \sigma{12} & \dots & \sigma_{1p} \
\sigma_{21} & \sigma^{2}{2} & \dots & \sigma{2p} \
\vdots & \vdots & \ddots & \vdots \
\sigma_{p1} & \sigma_{p2} & \dots & \sigma^{2}_{p} \
\end{bmatrix}
]
The next three covariance structures can either be heterogeneous or homogeneous.
If we have a homogeneous covariance structure (hom = TRUE), then we assume
$\sigma_{1} = \sigma_{2} = \dots = \sigma_{p}$.
The diagonal covariance has the following form:
[
\mathbf{\Sigma}
= \begin{bmatrix}
\sigma^{2}{1} & 0 & \dots & 0 \
0 & \sigma^{2}{2} & \dots & 0 \
\vdots & \vdots & \ddots & \vdots \
0 & 0 & \dots & \sigma^{2}_{p} \
\end{bmatrix}
]
By default, we assume a heterogeneous diagonal covariance structure.
The compound symmetric covariance has the following form:
[
\mathbf{\Sigma}
= \begin{bmatrix}
\sigma^{2}{1} & \sigma{1}\sigma_{2}\rho & \sigma_{1}\sigma_{3}\rho & \dots & \sigma_{1}\sigma_{p}\rho \
\sigma_{2}\sigma_{1} \rho & \sigma^{2}{2} & \sigma{2}\sigma_{3}\rho & \dots & \sigma_{2}\sigma_{p}\rho \
\sigma_{3}\sigma_{1} \rho & \sigma_{3}\sigma_{2}\rho & \sigma^{2}{3} & \dots & \sigma{3}\sigma_{p}\rho \
\vdots & \vdots & \vdots & \ddots & \vdots \
\sigma_{p}\sigma_{1} \rho & \sigma_{p}\sigma_{2}\rho & \sigma_{p}\sigma_{3}\rho & \dots & \sigma^{2}_{p}
\end{bmatrix}
]
By default, we assume a heterogeneous compound symmetric covariance structure.
The AR1 (auto-regressive order 1) covariance has the following form:
[
\mathbf{\Sigma}
= \begin{bmatrix}
\sigma^{2}{1} & \sigma{1}\sigma_{2} \rho & \sigma_{1}\sigma_{3}\rho^{2} & \dots & \sigma_{1}\sigma_{p}\rho^{p-1} \
\sigma_{2}\sigma_{1} \rho & \sigma^{2}{2} & \sigma{2}\sigma_{3}\rho & \dots & \sigma_{2}\sigma_{p}\rho^{p-2} \
\sigma_{3}\sigma_{1} \rho^{2} & \sigma_{3}\sigma_{2}\rho & \sigma^{2}{3} & \dots & \sigma{3}\sigma_{p}\rho^{p-3} \
\vdots & \vdots & \vdots & \ddots & \vdots \
\sigma_{p}\sigma_{1} \rho^{p-1} & \sigma_{p}\sigma_{2}\rho^{p-2} & \sigma_{p}\sigma_{3}\rho^{p-3} & \dots & \sigma^{2}_{p}
\end{bmatrix}
]
Unlike the diagonal and compound symmetric structures, by default we assume a
homogeneous ar1 covariance structure.
For the unstructured covariance matrix, lme4 estimates the following parameters in par:
$\mathbf{\theta} = (\theta_{1}, \theta_{2}, \dots, \theta_{p(p+1)/2})$ to construct the relative
co-factor ${\mathbf \Lambda}{\mathbf{\theta}}$ (this is the same as in previous versions):
[
\Lambda{\mathbf{\theta}} = \begin{bmatrix}
\theta_1 & 0 & 0 & \dots & 0 \
\theta_2 & \theta_3 & 0 & \dots & 0 \
\theta_4 & \theta_5 & \theta_6 & \dots & 0 \
\vdots & \vdots & \vdots & \ddots & \vdots \
\theta_{...} & \theta_{...} & \theta_{...} & \dots & \theta_{p(p+1)/2}
\end{bmatrix}
]
The definition of the parameter vectors differs for the other covariance structures. In the diagonal covariance matrix
case, $\mathbf{\theta}$ (or par) only contains the standard deviations.
The relative co-factor ${\mathbf \Lambda}{\mathbf{\theta}}$ is:
[
{\mathbf Lambda}{\mathbf{\theta}} = \begin{bmatrix}
\theta_1 & 0 & 0 & \dots & 0 \
0 & \theta_2 & 0 & \dots & 0 \
0 & 0 & \theta_3 & \dots & 0 \
\vdots & \vdots & \vdots & \ddots & \vdots \
0 & 0 & 0 & \dots & \theta_{p}
\end{bmatrix}
]
For the compound symmetry covariance structure, the parameter
vector par contains the $p$ standard deviations $(\sigma_1, \sigma_2, \ldots, \sigma_p)$
and the common correlation $\rho$. In contrast to glmmTMB, the correlation is estimated on its original scale (bounded between -1 and 1), rather than on an unconstrained, transformed scale.
The relative co-factor ${\mathbf \Lambda}{\mathbf{\theta}}$ is a lower triangular $p \times p$ matrix. Consider the form: [ {\mathbf \Lambda}{\mathbf{\theta}} = \begin{bmatrix} \theta_{11} & 0 & 0 & \cdots & 0 \ \theta_{21} & \theta_{22} & 0 & \cdots & 0 \ \theta_{31} & \theta_{32} & \theta_{33} & \cdots & 0 \ \vdots & \vdots & \vdots & \ddots & \vdots \ \theta_{p1} & \theta_{p2} & \theta_{p3} & \cdots & \theta_{pp} \end{bmatrix} ]
Its elements $\theta_{ij}$ are constructed as follows. First, define the sequence ${a_j}$ recursively: [ a_1 = 0, \quad a_{j+1} = a_j + \frac{(1 - \rho \cdot a_j)^2}{1 - \rho^2 \cdot a_j} \quad \text{for } j = 1, \ldots, p-1 ]
Then the elements of ${\mathbf \Lambda}$ are given by:
[
\theta_{ij} = \begin{cases}
\sqrt{1 - \rho^2 a_j} & \text{if } i = j \text{ (diagonal)} \[0.5em]
\dfrac{\rho - \rho^2 a_j}{\sqrt{1 - \rho^2 a_j}} & \text{if } i > j \text{ (below diagonal)} \[0.5em]
0 & \text{if } i < j \text{ (above diagonal)}
\end{cases}
]
In the code, one can extract the values $\theta_{ij}$ via the getTheta() function
of the Covariance.cs object, in which theta will be a vector in the column-wise
elements of ${\mathbf \Lambda}$.
The setup is similar for the autoregressive order 1 (AR1) covariance structure. Again, the parameter vector contains the $p$ standard deviations $(\sigma_1, \sigma_2, \ldots, \sigma_p)$ and the autocorrelation parameter $\rho$. The relative co-factor ${\mathbf \Lambda}_{\mathbf{\theta}}$ is a lower triangular $p \times p$ matrix whose form is similar to the compound symmetric case.
The elements $\theta_{ij}$ are given by:
[
\theta_{ij} = \begin{cases}
\rho^{i-1} & \text{if } j = 1 \text{ (first column)} \[0.5em]
\rho^{i-j} \sqrt{1 - \rho^2} & \text{if } 1 < j \leq i \text{ (below diagonal)} \[0.5em]
0 & \text{if } i < j \text{ (above diagonal)}
\end{cases}
]
Again, these values can be extracted using getTheta() function
of the Covariance.ar1 object, in which theta will be still be a vector in
the column-wise elements of ${\mathbf \Lambda}$.
This section illustrates how to extract
par, theta, Lambda, as described in the previous section, as well as
the variance covariance matrices of a model, from a merMod object.
We'll fit the standard sleepstudy example, except that we will use a
model with a compound symmetric covariance structure. Because this model has only two varying effects (intercept and slope with respect to day) per subject, and
hence the covariance matrix is $2 \times 2$, there is no difference in overall model fit between the compound-symmetric and the unstructured covariance matrices. Howevever, the models are parameterized differently, so this example will highlight the differences between par and theta.
library(lme4) fm1.cs <- lmer(Reaction ~ Days + cs(1 + Days | Subject), sleepstudy)
Extracting the covariance structure:
print(fm1.cs_cov <- getReCovs(fm1.cs))
The result is a list with only one element as we only have one random-effects term (cs(1 + Days | Subject)). To
see the values of par and theta for this object, we can call:
getME(fm1.cs, "par") getME(fm1.cs, "theta")
The ${\mathbf \Lambda}$ matrix is large, so we'll view it instead of printing:
library(Matrix) image(getME(fm1.cs, "Lambda"))
To most users, the most crucial information is simply the variance-covariance matrices. Extracte these via VarCorr.merMod() (the list has one element per random-effect term in the model --- in this case, only one):
vc_mat <- VarCorr(fm1.cs) vc_mat$Subject
When fitting linear mixed models, lme4 parameterizes the random-effects variance–covariance matrix on an unconstrained scale, using box-constrained optimization algorithms to ensure that the variance-covariance matrix is positive semidefinite. For unstructured covariance matrices, this means that the elements of $\mathbf \theta$ that parameterize the diagonal elements of $\mathbf \Lambda$ are constrained to be $\ge 0$ (for diagonal models, all of the elements of $\mathbf \theta$ fill the diagonal of $\mathbf \Lambda$ and hence are $\ge 0$; for models such as the compound-symmetric or AR1 models that use correlation parameters, we constrain $|\rho| \le 1$. As discussed in @Bates_JSS, this constrained parameterization works well for handling model where the
estimated covariance matrix is singular (i.e. $\mathbf \Sigma$ is only positive semidefinite, not positive definite). In addition, for linear mixed models lme4 profiles the fixed-effect parameters out of the objective function [@Bates_JSS]; finally, the scale parameter $\sigma$ is not estimated directly, but is derived from the residual variance or deviance of the fitted model.
In contrast, glmmTMB uses direct maximum likelihood estimation via Template Model Builder (TMB), fitting to the full parameter vector ${\mathbf \theta, \mathbf \beta, \sigma^2}$. Covariance parameters are fitted on a transformed (unconstrained) scale: log scale for standard deviations and various scales for correlation parameters (see the glmmTMB covariance structures vignette for details). This parameterization simplifies fitting (a box-constrained algorithm isn't necessary), but is less convenient in singular fits and other cases where the maximum likelihood estimate is infinite on the unconstrained scale.
Despite these differences, we will show examples where lme4 and glmmTMB provide similar estimates when they both use maximum likelihood estimation. By default, lme4 uses the restricted maximum likelihood; hence in the following examples, we use lmer(..., REML = FALSE) to compare against glmmTMB.
if (!requireNamespace("glmmTMB", quietly = TRUE)) { knitr::opts_chunk$set(eval = FALSE) } else { library(glmmTMB) } ## Often want to ignore attributes and class. ## Set a fairly large tolerance for convenience. all.equal.nocheck <- function(x, y, tolerance = 3e-5, ..., check.attributes = FALSE, check.class = FALSE) { require("Matrix", quietly = TRUE) ## working around mode-matching headaches if (is(x, "Matrix")) x <- matrix(x) if (is(y, "Matrix")) y <- matrix(y) all.equal(x, y, ..., tolerance = tolerance, check.attributes = check.attributes, check.class = check.class) } get.cor1 <- function(x) { v <- VarCorr(x) vv <- if (inherits(x, "merMod")) v$group else v$cond$group attr(vv, "correlation")[1,2] }
This is the default setting for both lme4 and glmmTMB. Below we simulate a dataset with known beta, theta and sigma values.
n_groups <- 20 n_per_group <- 20 n <- n_groups * n_per_group set.seed(1) dat.us <- data.frame( group = rep(1:n_groups, each = n_per_group), x1 = rnorm(n), x2 = rnorm(n) ) ## Constructing a similar dataset for the other examples gdat.us <- dat.diag <- gdat.diag <- dat.us form <- y ~ 1 + x1 * x2 + us(1 + x1|group) dat.us$y <- simulate(form[-2], newdata = dat.us, family = gaussian, newparams = list(beta = c(-7, 5, -100, 20), theta = c(2.5, 1.4, 6.3), sigma = 2))[[1]] form2 <- y ~ 1 + x1 + us(1 + x1|group) gdat.us$y <- simulate( form2[-2], newdata = gdat.us, family = binomial, newparams = list( beta = c(-1.5, 0.5), theta = c(0.3, 0.1, 0.3) ))[[1]]
lme4.us <- lmer(form, data = dat.us, REML = "FALSE") glmmTMB.us <- glmmTMB(form, dat = dat.us) ## Fixed effects fixef(lme4.us); fixef(glmmTMB.us)$cond all.equal.nocheck(fixef(lme4.us), fixef(glmmTMB.us)$cond) ## Sigma sigma(lme4.us); sigma(glmmTMB.us) all.equal.nocheck(sigma(lme4.us), sigma(glmmTMB.us)) ## Log likelihoods logLik(lme4.us); logLik(glmmTMB.us) all.equal.nocheck(logLik(lme4.us), logLik(glmmTMB.us))
As expected, calculations related to the random-effects term differ slightly beyond this point.
## Variance-Covariance Matrix vcov(lme4.us); vcov(glmmTMB.us)$cond all.equal.nocheck(vcov(lme4.us), vcov(glmmTMB.us)$cond) ## Variance and Covariance Components all.equal.nocheck(VarCorr(lme4.us)$group, VarCorr(glmmTMB.us)$cond$group) ## Conditional Modes of the Random Effects all.equal.nocheck(ranef(lme4.us)$group, ranef(glmmTMB.us)$cond$group)
glme4.us <- glmer(form2, data = gdat.us, family = binomial) gglmmTMB.us <- glmmTMB(form2, dat = gdat.us, family = binomial) ## Fixed effects fixef(glme4.us); fixef(gglmmTMB.us)$cond all.equal.nocheck(fixef(glme4.us), fixef(gglmmTMB.us)$cond) ## Sigma all.equal.nocheck(sigma(glme4.us), sigma(gglmmTMB.us)) ## Log likelihoods logLik(glme4.us); logLik(gglmmTMB.us) all.equal.nocheck(logLik(glme4.us), logLik(gglmmTMB.us))
As expected, calculations related to the random-effects term differ slightly beyond this point.
## Variance-Covariance Matrix vcov(glme4.us); vcov(gglmmTMB.us)$cond all.equal.nocheck(vcov(glme4.us), vcov(gglmmTMB.us)$cond) ## Variance and Covariance Components all.equal.nocheck(VarCorr(glme4.us)$group, VarCorr(gglmmTMB.us)$cond$group) ## Conditional Modes of the Random Effects all.equal.nocheck(ranef(glme4.us)$group, ranef(gglmmTMB.us)$cond$group)
The syntax is the same for fitting a heterogeneous diagonal covariance structure for lme4 and glmmTMB. It changes when we want to fit a homogeneous diagonal covariance structure.
To fit a homogeneous diagonal covariance structure we would write:
lme4.us <- lmer(Reaction ~ Days + diag(Days | Subject, hom = TRUE), sleepstudy) glmmTMB.us <- glmmTMB(Reaction ~ Days + homdiag(Days | Subject), sleepstudy)
We will focus on comparisons of an estimated heterogeneous diagonal covariance structure.
form <- y ~ 1 + x1 * x2 + diag(1|group) dat.diag$y <- simulate(form[-2], newdata = dat.diag, family = gaussian, newparams = list(beta = c(-7, 5, -100, 20), theta = c(2.5), sigma = 2))[[1]]
lme4.diag <- lmer(form, data = dat.diag, REML = "FALSE") glmmTMB.diag <- glmmTMB(form, dat = dat.diag) ## Fixed effects fixef(lme4.diag); fixef(glmmTMB.diag)$cond all.equal.nocheck(fixef(lme4.diag), fixef(glmmTMB.diag)$cond) ## Sigma sigma(lme4.diag); sigma(glmmTMB.diag) all.equal.nocheck(sigma(lme4.diag), sigma(glmmTMB.diag)) ## Log likelihoods logLik(lme4.diag); logLik(glmmTMB.diag) all.equal.nocheck(logLik(lme4.diag), logLik(glmmTMB.diag)) ## Variance-Covariance Matrix vcov(lme4.diag); vcov(glmmTMB.diag)$cond all.equal.nocheck(vcov(lme4.diag), vcov(glmmTMB.diag)$cond) ## Variance and Covariance Components all.equal.nocheck(VarCorr(lme4.diag)[[1]], VarCorr(glmmTMB.diag)$cond$group) ## Conditional Modes of the Random Effects all.equal.nocheck(ranef(lme4.diag)$group, ranef(glmmTMB.diag)$cond$group)
Similar to the diagonal case, the syntax is the same for fitting a heterogeneous compound symmetry covariance structure for lme4 and glmmTMB:
lme4.us <- lmer(Reaction ~ Days + cs(Days | Subject, hom = TRUE), sleepstudy) glmmTMB.us <- glmmTMB(Reaction ~ Days + cs(Days | Subject), sleepstudy)
Again, it differs when we want to fit a homogeneous compound symmetry covariance structure, which we will use for our comparisons.
simGroup <- function(g, n=6, phi=0.6) { x <- MASS::mvrnorm(mu = rep(0,n), Sigma = phi^as.matrix(dist(1:n)) ) y <- x + rnorm(n) times <- factor(1:n) group <- factor(rep(g,n)) data.frame(y, times, group) } set.seed(1) dat.cs <- do.call("rbind", lapply(1:2000, simGroup))
lme4.cs <- lmer(y ~ times + cs(0 + times | group, hom = TRUE), data = dat.cs, REML = FALSE) glmmTMB.cs <- glmmTMB(y ~ times + homcs(0 + times | group), data = dat.cs) ## Fixed effects fixef(lme4.cs); fixef(glmmTMB.cs)$cond all.equal.nocheck(fixef(lme4.cs), fixef(glmmTMB.cs)$cond) ## Sigma sigma(lme4.cs); sigma(glmmTMB.cs) all.equal.nocheck(sigma(lme4.cs), sigma(glmmTMB.cs)) ## Log likelihoods logLik(lme4.cs); logLik(glmmTMB.cs) all.equal.nocheck(logLik(lme4.cs), logLik(glmmTMB.cs)) ## Variance-Covariance Matrix all.equal.nocheck(vcov(lme4.cs), vcov(glmmTMB.cs)$cond) ## Variance and Covariance Components all.equal.nocheck(VarCorr(lme4.cs)[[1]], VarCorr(glmmTMB.cs)$cond$group) ## Conditional Modes of the Random Effects all.equal.nocheck(ranef(lme4.cs)$group, ranef(glmmTMB.cs)$cond$group) ## Comparing against the predicted rho value lme4.rho <- get.cor1(lme4.cs) glmmTMB.rho <- get.cor1(glmmTMB.cs) lme4.rho; glmmTMB.rho all.equal.nocheck(lme4.rho, glmmTMB.rho)
For this comparison, we focus on a simulated data set with $\rho = 0.7$.
set.seed(1) dat.ar1 <- do.call("rbind", lapply(1:2000, function(g) simGroup(g, phi = 0.7)))
Unlike the diagonal and compound symmetry case, the syntax differs for fitting either a heterogeneous or a homogeneous AR1 model for lme4 and glmmTMB.
For a heterogeneous AR1 covariance structure we would write the following:
lme4.ar1 <- lmer(y ~ times + ar1(0 + times | group), data = dat.ar1, REML = FALSE) glmmTMB.ar1 <- glmmTMB(y ~ times + hetar1(0 + times | group), data = dat.ar1)
We will instead focus on comparisons for a homogeneous AR1 covariance structure.
lme4.ar1 <- lmer(y ~ times + ar1(0 + times | group, hom = TRUE), data = dat.ar1, REML = FALSE) glmmTMB.ar1 <- glmmTMB(y ~ times + ar1(0 + times | group), data = dat.ar1) ## Fixed effects fixef(lme4.ar1); fixef(glmmTMB.ar1)$cond all.equal.nocheck(fixef(lme4.ar1), fixef(glmmTMB.ar1)$cond) ## Sigma sigma(lme4.ar1); sigma(glmmTMB.ar1) all.equal.nocheck(sigma(lme4.ar1), sigma(glmmTMB.ar1)) ## Log likelihoods logLik(lme4.ar1); logLik(glmmTMB.ar1) all.equal.nocheck(logLik(lme4.ar1), logLik(glmmTMB.ar1)) ## Variance-Covariance Matrix all.equal.nocheck(vcov(lme4.ar1), vcov(glmmTMB.ar1)$cond) ## Variance and Covariance Components all.equal.nocheck(VarCorr(lme4.ar1)$group, VarCorr(glmmTMB.ar1)$cond$group) ## Conditional Modes of the Random Effects all.equal.nocheck(ranef(lme4.ar1)$group, ranef(glmmTMB.ar1)$cond$group) ## Comparing against the predicted rho value lme4.ar1.rho <- get.cor1(lme4.ar1) glmmTMB.ar1.rho <- get.cor1(glmmTMB.ar1) lme4.ar1.rho; glmmTMB.ar1.rho all.equal.nocheck(lme4.ar1.rho, glmmTMB.ar1.rho)
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.