Fitting variance component models
Description
varComp
and varComp.fit
fit linear mixedeffect models where the marginal variancecovariance matrix is linear in known positive semidefinite matrices. varComp
uses the usual formula interface, whereas varComp.fit
is the underlying working horse. doFit.varComp
performs model fitting if the object has been previously created by setting doFit = FALSE
when calling varComp
.
Usage
1 2 3 4 5 6 7 8 9  varComp(fixed, data, random, varcov, weights, subset,
family = stats::gaussian('identity'), na.action, offset,
control = varComp.control(...), doFit = TRUE,
normalizeTrace = TRUE, contrasts = NULL,
model = TRUE, X = TRUE, Y = TRUE, K = TRUE, ...)
varComp.fit(Y, X = matrix(0, length(Y), 0L), K, control = varComp.control())
doFit.varComp(object)

Arguments
fixed 
A twosided formula specifying the fixed effects of the model. This is the same as in 
data 
an optional data frame, list or environment containing the variables in the model. If not found in data, the variables are taken from 
random 
An optional onesided formula specifying additive random effects, of the form 
varcov 
An optional list of symmetric positive semidefinite matrices. If 
weights 
An optional nonnegative vector of the same length as the response variable specified in 
subset 
An optional vector specifying a subset of observations to be used in the fitting process. 
family 
The same as the 
na.action 
The same as in 
offset 
The same as in 
control 
An object from calling 
doFit 
A logical scalar, indicating whether model fitting should be performed. 
normalizeTrace 
A logical scalar, indicating whether the individual variancecovariance matrices should be normalized such that variance components are on the same scale. 
contrasts 
The same as in 
model 
A logical scalar, indicating whether the model frame will be included in the result. 
X 
For 
Y 
For 
K 
For 
... 
When 
object 
An object of class 
Details
The variance component model is of form
Y=X β + e
where e is multivariate normally distributed with mean zero and variancecovariance matrix V being
V= σ{j=1}^R σ_j^2 K_j + σ_e^2 W
in which K_j are known positive semidefinite matrices and W is a known diagonal positive definite matrix. In the case of randomeffect modeling, the K matrices are further given by
K_j = Z_j G_j Z_j^T
where Z_j is the design matrix for the jth randomeffect factor and G_j the known variancecovariance matrix for the jth randomeffect vector.
In the varComp
formula interface, the X matrix and response variable are specified by the fixed
argument. The optional random
argument specifies the Z matrices. When random
is missing, they are assumed to be identity matrices. The varcov
argument specifies the G matrices. The weights
argument specifies the W matrix.
Note that in random
, kernel functions are allowed. For example, random = ~ ibs(SNP)
is allowed to specify the similarities contributed by a matrix SNP
through the identitybydescent (IBS
) kernel.
Fixedbyrandom interactions are allowed in random
. If F
is a fixed factor with 2 levels, and F
has already appeared in the fixed
argument, then random = ~ ibs(SNP) + F:ibs(SNP)
will specify two random effects. The first contributes to the marginal variancecovariance matrix through IBS(SNP)
; the other contributes to the marginal variancecovariance matrix through tcrossprod(X_F2) * IBS(SNP)
, where X_F2
is the 2nd column of the fixedeffect design matrix of factor F
under the sumtozero contrast. Note that sumtozero contrast will be used even if another contrast is used for F
in the fixed
argument. This behavior corresponds to what we usually mean by fixedbyrandom interaction. If one indeed needs to avoid sumtozero contrast in this case, it is only possible to do so by providing varcov
directly without specifying random
.
Additionally, the fixedbyrandom interactions in random
specified by the “:
” symbol are actually treated as if they are specified by “*
”. In other words, random = ~ ibs(SNP) + F:ibs(SNP)
, random = ~ F:ibs(SNP)
, and random = ~ F*ibs(SNP)
are actually treated as equivalent when F
has already appeared on the rightside of fixed
. Again this corresponds to the usual notion of fixedbyrandom interaction. If such behavior is not wanted, one can directly give the intended varcov
and leave random
as missing.
Finally, intercept is not included in the random
interface.
The model fitting process uses profiled restricted maximum likelihood (PREML), where the error variance is always profiled out of the REML likelihood. The nonnegativity constraints of variance components are always imposed. See varComp.control
for arguments controlling the modeling fitting.
Value
The value of any of the three functions is a list with class varComp
. Depending on whether doFit
is TRUE
or FALSE
, the components in the list might differ.
In either case, the result from varComp
always contains the following elements:

na.action
: thena.action
used in the model frame. 
offset
: the same as input. 
contrasts
: the contrast used in the fixedeffect design matrix. 
xzlevels
: the levels of both fixedeffect and randomeffect factors. 
terms
: both fixedeffect and randomeffect terms. 
call
: the actual call. 
nobs
: the number of observations. 
control
: the same as input. 
random.labels
: the labels used to differentiate random effects. Note that the error variance is not included here. It is safe to check the number of variance components specified by the model by checking the length ofrandom.labels
. 
doFit
: The value ofdoFit
is eitherTRUE
or a call. If thedoFit
argument ofvarComp
is set toFALSE
, this component will be a call that will do the model fitting when being evaluated. If the model fitting has already been done, this component will always beTRUE
.
In the case where doFit
is TRUE
, the following components will also be present in the result of varComp
and doFit
:

parms
: a named numeric vector of parameter estimates. These are the estimated ratio of each variance component relative to the error variance. 
gradients
: a named numeric vector of gradient of the PREML function at the final parameter estimate. Because of nonnegativity constraints, the gradients are not necessarily all zero at convergence. 
hessian
: a numeric matrix of Hessian matrix at parameter estimate. This is always computed through the observed information, no matter how theinformation
argument ofvarComp.control
is set. 
sigma2
: the estimated error variance. 
varComps
: a named numeric vector of variance components. These are the same asparms
timessigma2
. 
n.iter
: the number of iterations during fitting. 
PREML
: the final maximized PREML function value. 
X.Q2
: a matrix, when leftmultiplied to the response variable, produces the residual contrasts. 
residual.contrast
: a numeric vector of residual contrasts withlength{Y}  qr(X)$rank
elements. 
working.cor
: the list of individual variancecovariance matrices, whose weighted sum is the marginal variancecovariance matrix of the residual contrast.
If varComp.fit
is called directly, then doFit
is always TRUE
, but the result will not contain any of na.action
, offset
, contrasts
, xzlevels
, and terms
, as they do not apply.
For the varComp
interface, if any of model
, X
, Y
or K
is TRUE
, the corresponding part will be included in the result. If weights
is nonmissing, it will be included in the result.
Note
Currently, available S3 methods for the varComp
class are:

coef
: retrieving fixedeffect estimates, variance component estimates, or estimates of the ratios of variance components to the error variance. 
model.matrix
: retrieving the fixedeffect design matrix, equivalent randomeffect design matrix, or the list of individual variancecovariance matrices. 
vcov
: computing variancecovariance matrix of fixedeffect parameters, variance parameters, or the response variable. 
fixef
: retrieving fixedeffect parameter estimates with standard errors and default hypothesis tests against zero. See alsosatterth.varComp
andKR.varComp
for computing approximate denominator of degrees of freedom for the Fstatistics. 
anova
: similar tofixef
, but returning a data frame. 
varComp.test
: testing for nullity of random components (other than the error variance). 
logLik
: retrieving the maximized PREML value with degrees of freedom. 
print
: showing the object. 
summary
: summarize the object by providing standard errors and default tests against zero.
Author(s)
Long Qu
References
Qu L, Guennel T, Marshall SL. (2013) Linear Score Tests for Variance Components in Linear Mixed Models and Applications to Genetic Association Studies. Biometrics, Volume 69, Issue 4, pages 883–892.
See Also
varComp.test
, coef.varComp
, model.matrix.varComp
, vcov.varComp
, fixef.varComp
, satterth.varComp
, KR.varComp
, logLik.varComp
, print.varComp
, summary.varComp
, nlme::lme
, stats::lm
Examples
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48  ### Oxide/Semiconductor data example
library(nlme)
data(Oxide)
lmef = lme(Thickness~Source, Oxide, ~1Lot/Wafer)
vcf = varComp(Thickness~Source, Oxide, ~Lot/Wafer)
VarCorr(lmef)
coef(vcf, 'varComp') ## same values as above
k0=tcrossprod(model.matrix(~0+Lot,Oxide))
k1=tcrossprod(Oxide$Source==1)*k0
k2=tcrossprod(Oxide$Source==2)*k0
k3=tcrossprod(model.matrix(~0+Lot:Wafer, Oxide))
## unequal variance across Source for Lot effects, in a preferred parameterization:
(vcf1 = varComp(Thickness~Source, Oxide, varcov=list(S1Lot=k1, S2Lot=k2, `Lot:Wafer`=k3)))
## unequal variance across Source for Lot effects, in a different parameterization:
(vcf2 = varComp(Thickness~Source, Oxide, varcov=list(Lot=k0, S2Lot=k2, `Lot:Wafer`=k3)))
## unequal variance across Source for Lot effects, but in a poor parameterization that
## turns out to be the same as vcf after fitting.
(vcf3 = varComp(Thickness~Source, Oxide, varcov=list(Lot=k0, S1Lot=k1, `Lot:Wafer`=k3)))
logLik(vcf)
logLik(vcf1)
logLik(vcf2) ## the same as vcf1
logLik(vcf3) ## the same as vcf
## fixefeffect only
vcf0 = varComp(Thickness~Source, Oxide)
summary(vcf0)
summary(lmf<lm(Thickness~Source, Oxide))
vcf00 = varComp(Thickness~0, Oxide)
summary(vcf00)
summary(lmf0<lm(Thickness~0, Oxide))
### Genetics example
trt=gl(2, 15)
set.seed(2340)
dat=data.frame(trt=trt)
dat$SNP=matrix(sample(0:2, 120, replace=TRUE), 30)
dat$Y = as.numeric(trt)+rnorm(30) + dat$SNP%*%rnorm(4)
(vcf0 = varComp(Y~trt, dat, ~ibs(SNP)))
(vcf00 = varComp(Y~trt, dat, varcov = list(`ibs(SNP)`=IBS(dat$SNP)))) ## same as above
(vcf1 = varComp(Y~trt, dat, ~ibs(SNP):trt)) ## two variance components
dat$trt[1]=NA
varComp(Y~trt, dat, ~ibs(SNP)) ## 29 observations compared to 30 in vcf0
dat$SNP[2,1]=NA
varComp(Y~trt, dat, ~ibs(SNP)) ## still, 29 observations, as ibs handles sporadic NA
dat$SNP[3, ]=NA
varComp(Y~trt, dat, ~ibs(SNP)) ## 28 observations, as no genotype for the 3rd obs
