Mixed-model solver

Description

Calculates maximum-likelihood (ML/REML) solutions for mixed models of the form

y = X β + Z u + \varepsilon

where β is a vector of fixed effects and u is a vector of random effects with Var[u] = K σ^2_u. The residual variance is Var[\varepsilon] = I σ^2_e. This class of mixed models, in which there is a single variance component other than the residual error, has a close relationship with ridge regression (ridge parameter λ = σ_e^2 / σ^2_u).

Usage

1
2
mixed.solve(y, Z=NULL, K=NULL, X=NULL, method="REML", 
        bounds=c(1e-09, 1e+09), SE=FALSE, return.Hinv=FALSE)

Arguments

y

Vector (n \times 1) of observations. Missing values (NA) are omitted, along with the corresponding rows of X and Z.

Z

Design matrix (n \times m) for the random effects. If not passed, assumed to be the identity matrix.

K

Covariance matrix (m \times m) for random effects; must be positive semi-definite. If not passed, assumed to be the identity matrix.

X

Design matrix (n \times p) for the fixed effects. If not passed, a vector of 1's is used to model the intercept. X must be full column rank (implies β is estimable).

method

Specifies whether the full ("ML") or restricted ("REML") maximum-likelihood method is used.

bounds

Array with two elements specifying the lower and upper bound for the ridge parameter.

SE

If TRUE, standard errors are calculated.

return.Hinv

If TRUE, the function returns the inverse of H = Z K Z' + λ I. This is useful for GWAS.

Details

This function can be used to predict marker effects or breeding values (see examples). The numerical method is based on the spectral decomposition of Z K Z' and S Z K Z' S, where S = I - X (X' X)^{-1} X' is the projection operator for the nullspace of X (Kang et al., 2008). This algorithm generates the inverse phenotypic covariance matrix V^{-1}, which can then be used to calculate the BLUE and BLUP solutions for the fixed and random effects, respectively, using standard formulas (Searle et al. 1992):

BLUE(β) = β^* = (X'V^{-1}X)^{-1}X'V^{-1}y

BLUP(u) = u^* = σ^2_u KZ'V^{-1}(y-Xβ^*)

The standard errors are calculated as the square root of the diagonal elements of the following matrices (Searle et al. 1992):

Var[β^*] = (X'V^{-1}X)^{-1}

Var[u^*-u] = K σ^2_u - σ^4_u KZ'V^{-1}ZK + σ^4_u KZ'V^{-1}XVar[β^*]X'V^{-1}ZK

For marker effects where K = I, the function will run faster if K is not passed than if the user passes the identity matrix.

Value

If SE=FALSE, the function returns a list containing

$Vu

estimator for σ^2_u

$Ve

estimator for σ^2_e

$beta

BLUE(β)

$u

BLUP(u)

$LL

maximized log-likelihood (full or restricted, depending on method)

If SE=TRUE, the list also contains

$beta.SE

standard error for β

$u.SE

standard error for u^*-u

If return.Hinv=TRUE, the list also contains

$Hinv

the inverse of H

References

Kang et al. 2008. Efficient control of population structure in model organism association mapping. Genetics 178:1709-1723.

Endelman, J.B. 2011. Ridge regression and other kernels for genomic selection with R package rrBLUP. Plant Genome 4:250-255.

Searle, S.R., G. Casella and C.E. McCulloch. 1992. Variance Components. John Wiley, Hoboken.

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
#random population of 200 lines with 1000 markers
M <- matrix(rep(0,200*1000),200,1000)
for (i in 1:200) {
  M[i,] <- ifelse(runif(1000)<0.5,-1,1)
}

#random phenotypes
u <- rnorm(1000)
g <- as.vector(crossprod(t(M),u))
h2 <- 0.5  #heritability
y <- g + rnorm(200,mean=0,sd=sqrt((1-h2)/h2*var(g)))

#predict marker effects
ans <- mixed.solve(y,Z=M)  #By default K = I
accuracy <- cor(u,ans$u)

#predict breeding values
ans <- mixed.solve(y,K=A.mat(M))
accuracy <- cor(g,ans$u)