rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{pdInd: G matrix with patterns of zeros} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8}
knitr::opts_chunk$set( collapse = FALSE, comment = " " ) library(gnew) library(spida2) library(nlme)
pdInd
is a constructor for pdClasses
that define G matrices to model
the variance of random effects for models in the \pkg{nlme} package.
Mixed models in which many predictors have random slopes often fail to converge in part because of the large number of parameters in the full covariance (G) matrix for random effects. One way of fitting a more parsimonious model that includes random slopes is to use \code{\link{pdDiag}} with zeros off the diagonal. However, this also forces zero covariances between random slopes and and the random intercept, resulting in a model that is not equivariant with respect to location transformations of the predictors with random slopes. The alternative remedy of omitting random slopes for some predictors can lead to biased estimates and incorrect standard errors of regression coefficients.
The default covariance pattern for \code{pdInd} produces a G matrix with zero covariances except in the first row and column. If the first random effect is the intercept, the resulting model assumes independence between random slopes without imposing minimality of variance over the possibly arbitrary origin. This imposition is the reason that having all covariances equal to zero results in a model that fails to be equivariant under location transformations.
The optional \code{cov} parameter can be used to allow selected non-zero covariance between random slopes.
For example, if two variables, X1 and X2 have random effects, the
random effects model would be specified in a call to lme
as
random = ~ 1 + X1 + X2
.
The default G matrix has the form: $$G = \begin{pmatrix} g_{00} & g_{01} & g_{02} \ g_{10} & g_{11} & g_{12} \ g_{20} & g_{21} & g_{22} \end{pmatrix}$$
With pdDiag
, all the off-diagonal elements of $G$ are constrained to 0. Forcing $g_{01}$ and $g_{02}$ to be 0 produces a model that is not equivariant with respect to location changes in X1
and X2
.
The value at which the variance
of Y
given X1
and X2
is minimized is forced to be 0 for both variables.
However, constraining $g_{12} = 0$ produces a model that is equivariant with respect
to location-scale transformation of X1
and X2
and in which
the random between cluster values of regression slopes for each
variable are independent of each other.
The pdInd
class of positive-definite
matrices creates, by default, a matrix with arbitrary values along
the diagonal and in the first row and column, but zeros elsewhere.
In the case of a $4 \times 4$ matrix, this produces:
$$G = \begin{pmatrix} g_{00} & g_{01} & g_{02} & g_{03}\ g_{10} & g_{11} & 0 & 0\ g_{20} & 0 & g_{22} & 0\ g_{30} & 0 & 0 & g_{33} \end{pmatrix}$$
The challenge in parametrizing the G matrix is finding an unconstrained parametrization that results in a positive-definite matrix with selected covariances constrained to 0.
We consider a right-Cholesky decomposition noting that the diagonal component in the following factorization results in a diagonal component in the variance matrix.
$$\begin{pmatrix} G_{11} & G_{12} \ G'{12} & G{22} \end{pmatrix} = \begin{pmatrix} R_{11} & R_{12} \ 0 & D_{22} \end{pmatrix} \begin{pmatrix} R'{11} & 0\ R'{12} & D_{22} \end{pmatrix} = \begin{pmatrix} R_{11} R'{11} + R{12}R'{12} & R{12} D_{22} \ D_{22} R'{12} & D^2{22} \end{pmatrix} $$ With $R_{11}$ square upper-triangular and $D_{22}$ square diagonal, $G_{22}$ must be diagonal. In addition, patterns of zeros in the $R_{12}$ matrix, lying above the diagonal $D_{22}$ matrix, are preserved in $G_{12}$.
Note that the unconstrained Cholesky parametrization uses the log of the diagonal elements of the triangular factor.
(fac <- cbind( c(1,0,0,0,0), c(1,2,0,0,0), c(0,1,3,0,0), c(1,0,0,4,0), c(0,1,0, 0, 5) )) fac %*% t(fac)
Note how the pattern of zeros in the last three columns of the first two rows is preserved in the cross product since the lower $3 \times 3$ diagonal block matrix is itself diagonal.
However, if the lower $3 \times 3$ block diagonal matrix is not diagonal, then the pattern of zeros in the top two rows above it is not necessarily preserved.
(fac <- cbind( c(1,0,0,0,0), c(1,2,0,0,0), c(0,1,3,0,0), c(1,0,0,4,0), c(0,1,0, -1, 5) )) fac %*% t(fac)
But patterns of zeros above non-diagonal blocks are not necessarily preserved.
The cov
parameter allows the user to specify a pattern of zeros in the upper
triangle of the upper-triangular 'R' factor of the the G matrix. As observed above,
in some cases this will result in the same pattern in the G matrix. Even if the
pattern in the R factor does not create a similar pattern in the G matrix,
the model for the G matrix will nevertheless have the number of additional
parameters for covariance as given by the TRUE
entries in the upper diagonal of the
cov
matrix.
We will show how to use chol(getG(fit))
to suggest transformations of predictors
that appear in random effects model to help improve the convergence of mixed models fits.
We conjecture that the condition number of the Hessian matrix for the parameters in the G matrix may be approximately the square of the condition number for the G matrix. Thus a condition number for G in the vicinity of $10^7$ would effectively result in singularity of the Hessian.
Rescaling and relocating the predictors with random slopes can greatly improve ill-conditioning of the G matrix.
Let
$$\mathbf{Z} = \begin{pmatrix} 1 \ Z_1 \ \vdots \ Z_k \end{pmatrix}$$ represent the vector of variables with random effects $\mathbf{u}$. For the $i$th cluster the contribution from level-2 random effects is: $$\mathbf{Z}'_i \mathbf{u}_i$$ Consider a location-scale tranformation of the variables in $\mathbf{Z}$. It has the form $\mathbf{Z^} = T\mathbf{Z}$ where $T$ is upper triangular with the form $$ T = \begin{bmatrix} 1 & a_1 \;a_2 \; \cdots \; a_k \ 0 & B \end{bmatrix}$$ with $B$ a diagonal matrix containing scaling coefficients, $b_1, b_2, ..., b_k$, so that $$Z^_i = a_i + b_i Z_i$$ If $G = Var(\mathbf{u})$ then $$G^ = T'^{-1}GT^{-1}$$ where $G^ = \operatorname{Var}(\mathbf{u^})$ with $$\mathbf{Z}_i' \mathbf{u}_i = {\mathbf{Z}^{}_i}' \mathbf{u}^*_i$$
Thefore if we factor
$$G =T'T$$
the upper triangular matrix
$\frac{1}{t_{11}}T$ provides a transformation of $\mathbf{Z}$ that
minimizes the condition number of $G^*$. In R, this is simply getG(fit) %>% chol %>% {./.[1,1]}
.
The initial call to pdConstruct.pdInd
occurs in the initialization phase of lme
which call reStruct
which in turn calls pdMat
that creates an empty pdInd
object, sets value <- numeric(0)
and returns:
pdConstuct(object, value, form, nam, data)
methods(class='pdInd') gnew:::pdInd gnew:::pdConstruct.pdInd gnew:::pdFactor.pdInd gnew:::pdMatrix.pdInd gnew:::solve.pdInd
methods(class='pdMat') nlme:::pdMat nlme:::pdConstruct.pdMat nlme:::pdMatrix.pdMat nlme:::pdFactor.pdMat nlme:::solve.pdMat nlme:::VarCorr.pdMat nlme:::as.matrix.pdMat nlme:::`matrix<-.pdMat` nlme:::coef.pdMat nlme:::`coef<-.pdMat`
nlme:::pdSymm methods(class='pdSymm') nlme:::pdConstruct.pdSymm nlme:::pdMatrix.pdSymm nlme:::pdFactor.pdSymm nlme:::solve.pdSymm nlme:::coef.pdSymm
nlme:::lme.formula nlme:::reStruct
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.