knitr::opts_chunk$set(echo=TRUE)

Linear mixed model

\begin{align} \boldsymbol{y} = X \boldsymbol{\beta} + Z \boldsymbol{u} + \boldsymbol{\epsilon} \end{align}

where:

* $\boldsymbol{y}$ is the $n$-vector of observations (e.g., phenotypes);
* $X$ is the $n \times p$ design matrix of the fixed effects and $\boldsymbol{\beta}$ the $p$-vector of fixed effects;
* $Z$ is the $n \times q$ design matrix of the random effects and $\boldsymbol{u}$ the $q$-vector of random effects;
* $\boldsymbol{\epsilon}$ is the $n$-vector of errors.

The fixed effects model the mean of $\boldsymbol{y}$ and the random effects model its variance.

Variance-ratio parametrization

Assumptions about the (co)variances:

* $Var(\boldsymbol{u}) = \sigma^2 \, G(\gamma)$, where $G$ is a $q \times q$ matrix parametrized by $\gamma$;
* $Var(\boldsymbol{\epsilon}) = \sigma^2 \, R(\phi)$, where $R$ is an $n \times n$ matrix parametrized by $\phi$;
* $Cov(\boldsymbol{u}, \boldsymbol{\epsilon}) = 0$.

Assumptions about the distributions:

* $\boldsymbol{u} \, | \, \sigma^2, \gamma \sim \mathcal{N}(\boldsymbol{0}, \sigma^2 \, G(\gamma))$;
* $\boldsymbol{\epsilon} \, | \, \sigma^2, \phi \sim \mathcal{N}(\boldsymbol{0}, \sigma^2 \, R(\phi))$.

As a result:

\[ \boldsymbol{y} \, | \, \boldsymbol{\beta}, \boldsymbol{u}, \sigma^2, \phi \sim \mathcal{N}(X \boldsymbol{\beta} + Z \boldsymbol{u}, \sigma^2\, R(\phi)) \]

Let us also introduce $\theta = (\sigma^2, \gamma, \phi)^T$. After integrating out $\boldsymbol{u}$ (i.e., marginalizing over the random effects):

\[ \boldsymbol{y} \, | \, \boldsymbol{\beta}, \theta \sim \mathcal{N}(X \boldsymbol{\beta}, V(\theta)) \]

where:

\[ var(\boldsymbol{y}) = V(\theta) = \sigma^2 \, H \; \text{ with } \; H := Z G(\gamma) Z^T + R(\phi) \]
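To make the notation concrete, here is a minimal R sketch building $V(\theta)$ for toy dimensions, assuming (only for illustration) the simplest structures $G(\gamma) = \text{Id}_q$ and $R(\phi) = \text{Id}_n$:

set.seed(1)
n.toy <- 6; q.toy <- 3; sigma2.toy <- 2
Z.toy <- matrix(rbinom(n.toy * q.toy, size=1, prob=0.5), nrow=n.toy, ncol=q.toy) # toy incidence matrix
G.toy <- diag(q.toy) # assumed: G(gamma) = Id_q
R.toy <- diag(n.toy) # assumed: R(phi) = Id_n
H.toy <- Z.toy %*% G.toy %*% t(Z.toy) + R.toy
(V.toy <- sigma2.toy * H.toy) # var(y) under the variance-ratio parametrization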

Linear-variance parametrization

In variance-component models, i.e., when there is no covariance structure to model (e.g., $R = \text{Id}_n$), it is not necessary to use $\sigma^2$ as an overall scaling factor:

* $Var(\boldsymbol{u}) = G(\gamma)$;
* $Var(\boldsymbol{\epsilon}) = \sigma^2 \, \text{Id}_n$.

Hence:

\[ var(\boldsymbol{y}) = V(\theta) = H \; \text{ with } \; H := Z G Z^T + \sigma^2 \text{Id}_n \]

Likelihood

By definition, the likelihood, denoted $\mathcal{L}$, is the probability density of the data given the parameters but, most importantly, it is viewed as a function of the parameters:

\[ \mathcal{L}(\boldsymbol{\beta}, \theta) = p(\boldsymbol{y} | \boldsymbol{\beta}, \theta) = (2 \pi)^{-\frac{n}{2}} \, |V(\theta)|^{-\frac{1}{2}} \, \exp \left( - \frac{1}{2} (\boldsymbol{y} - X \boldsymbol{\beta})^T V(\theta)^{-1} (\boldsymbol{y} - X \boldsymbol{\beta}) \right) \]

where $|M|$ denotes the determinant of matrix $M$.

In practice, the log-likelihood is maximized instead, as it is easier to compute and yields the same maximizer, the log being monotonic: \[ l(\boldsymbol{\beta}, \theta) = \text{ln} \, \mathcal{L} = - \frac{1}{2} \left( \, n \, \text{ln}(2 \pi) \, + \, \text{ln} |V(\theta)| \, + \, (\boldsymbol{y} - X \boldsymbol{\beta})^T V(\theta)^{-1} (\boldsymbol{y} - X \boldsymbol{\beta}) \, \right) \]
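As an illustration, here is a minimal R sketch (the function name logLikLmm is illustrative) evaluating this log-likelihood for given $\boldsymbol{\beta}$ and a given positive-definite $V(\theta)$, via the Cholesky factorization of $V$:

logLikLmm <- function(y, X, beta, V){
  ## log-density of y ~ N(X beta, V), using the Cholesky factor of V
  n <- length(y)
  cholV <- chol(V)                     # upper triangular, V = t(cholV) %*% cholV
  logDetV <- 2 * sum(log(diag(cholV))) # ln |V|
  resid <- y - X %*% beta
  quadForm <- sum(backsolve(cholV, resid, transpose=TRUE)^2) # (y - X beta)' V^{-1} (y - X beta)
  -0.5 * (n * log(2 * pi) + logDetV + quadForm)
}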

This document skips over the estimation of $\theta$ and, considering $\theta$ to be known, it focuses on the estimation of $\boldsymbol{\beta}$ and the prediction of $\boldsymbol{u}$.

Estimation of $\boldsymbol{\beta}$ and prediction of $\boldsymbol{u}$

Maximizing $\mathcal{L}(\boldsymbol{\beta})$

Given that the expectation of $\boldsymbol{y}$ is $X \boldsymbol{\beta}$ and that its variance $V(\theta)$ is considered known here, one can maximize the likelihood to estimate $\boldsymbol{\beta}$, which gives:

\[ \boldsymbol{\hat{\beta}} = (X^T V^{-1} X)^{-1} X^T V^{-1} \boldsymbol{y} \]

if necessary using a generalized inverse when $X$ is not of full column rank, i.e., when $\text{rank}(X) = \nu < p$.

This is the generalized least squares (GLS) estimator.

Nevertheless, this formula requires inverting $V$, which is $n \times n$, hence the computational challenge.
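For moderate $n$, the GLS estimate can be computed directly; below is a minimal sketch (the function name gls is illustrative) that avoids forming $V^{-1}$ explicitly by solving linear systems in $V$, though it still scales with the cube of $n$, which motivates the MME-based approach described next:

gls <- function(y, X, V){
  ## beta.hat = (X' V^{-1} X)^{-1} X' V^{-1} y, via two solves in V
  Vinv.X <- solve(V, X) # V^{-1} X
  Vinv.y <- solve(V, y) # V^{-1} y
  solve(crossprod(X, Vinv.X), crossprod(X, Vinv.y))
}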

Maximizing $f(\boldsymbol{y}, \boldsymbol{u} | \boldsymbol{\beta}, \theta)$

Assuming that $\theta$ is known, Henderson discovered that the best linear unbiased estimates (BLUEs) of $\boldsymbol{\beta}$ and the best linear unbiased predictions (BLUPs) of $\boldsymbol{u}$ can be obtained by maximizing the joint density of $\boldsymbol{y}$ and $\boldsymbol{u}$ (it is not really a likelihood as the random effects are not observed).

\begin{align} \text{ln} \, f(\boldsymbol{y}, \boldsymbol{u} | \boldsymbol{\beta}, \theta) &= \text{ln} \, f(\boldsymbol{y} | \boldsymbol{u}, \boldsymbol{\beta}, \sigma^2, \phi) + \text{ln} \, f(\boldsymbol{u} | \sigma^2, \gamma) \\ &= - \frac{1}{2} \left[ \, n \text{ln}(2 \pi) + n \text{ln}(\sigma^2) + \text{ln} |R| + \frac{1}{\sigma^2} (\boldsymbol{y} - X \boldsymbol{\beta} - Z \boldsymbol{u})^T R^{-1} (\boldsymbol{y} - X \boldsymbol{\beta} - Z \boldsymbol{u}) \, \right] \\ & \; - \frac{1}{2} \left[ \, q \text{ln}(2 \pi) + q \text{ln}(\sigma^2) + \text{ln} |G| + \frac{1}{\sigma^2} \boldsymbol{u}^T G^{-1} \boldsymbol{u} \right] \end{align}

knowing that $|c \, M| = c^n |M|$ if $M$ is $n \times n$.

It is useful to rewrite this by separating the terms depending on $\boldsymbol{\beta}$ from those depending on $\boldsymbol{u}$:

\begin{align} \text{ln} \, f(\boldsymbol{y}, \boldsymbol{u} \, | \, \boldsymbol{\beta}, \theta) &= - \frac{1}{2 \sigma^2} \left[ \, (\boldsymbol{y} - X \boldsymbol{\beta})^T R^{-1} (\boldsymbol{y} - X \boldsymbol{\beta}) \, \right] \\ &- \frac{1}{2 \sigma^2} \left[ \, \boldsymbol{u}^T (Z^T R^{-1} Z + G^{-1}) \boldsymbol{u} \, \right] \\ &- \frac{1}{2 \sigma^2} \left[ \, - 2 (\boldsymbol{y} - X \boldsymbol{\beta})^T R^{-1} Z \boldsymbol{u} \, \right] \\ &- \frac{1}{2} \left[ \, (n + q) \text{ln}(\sigma^2) + \text{ln} |R| + \text{ln} |G| + (n + q) \text{ln}(2 \pi) \right] \end{align}

First derivatives

Then, one computes the first partial derivatives (called the score function when the differentiated function is a log-likelihood, hence denoted $s$): \begin{align} s(\boldsymbol{\beta}) &:= \frac{\partial \, \text{ln} f(\boldsymbol{y}, \boldsymbol{u} \, | \, \boldsymbol{\beta}, \theta)}{\partial \boldsymbol{\beta}}(\boldsymbol{\beta}) \\ &= \frac{\partial }{\partial \boldsymbol{\beta}} \left( -\frac{1}{2 \sigma^2} [(\boldsymbol{y} - X \boldsymbol{\beta})^T R^{-1} (\boldsymbol{y} - X \boldsymbol{\beta}) - 2 (\boldsymbol{y} - X \boldsymbol{\beta})^T R^{-1} Z \boldsymbol{u}] \right) \\ &= -\frac{1}{2 \sigma^2} [-2 X^T R^{-1} (\boldsymbol{y} - X \boldsymbol{\beta}) + 2 X^T R^{-1} Z \boldsymbol{u}] \\ &= -\frac{1}{\sigma^2} \left[ \, X^T R^{-1} X \boldsymbol{\beta} + X^T R^{-1} Z \boldsymbol{u} - X^T R^{-1} \boldsymbol{y} \, \right] \end{align}

\begin{align} s(\boldsymbol{u}) &:= \frac{\partial \, \text{ln} f(\boldsymbol{y}, \boldsymbol{u} \, | \, \boldsymbol{\beta}, \theta)}{\partial \boldsymbol{u}}(\boldsymbol{u}) \\ &= \frac{\partial }{\partial \boldsymbol{u}} \left( -\frac{1}{2 \sigma^2} [\boldsymbol{u}^T (Z^T R^{-1} Z + G^{-1}) \boldsymbol{u} - 2 (\boldsymbol{y} - X \boldsymbol{\beta})^T R^{-1} Z \boldsymbol{u}] \right) \\ &= -\frac{1}{2 \sigma^2} [ \, 2 (Z^T R^{-1} Z + G^{-1}) \boldsymbol{u} - 2 Z^T R^{-1} (\boldsymbol{y} - X \boldsymbol{\beta}) \, ] \\ &= -\frac{1}{\sigma^2} [ \, Z^T R^{-1} X \boldsymbol{\beta} + (Z^T R^{-1} Z + G^{-1}) \boldsymbol{u} - Z^T R^{-1} \boldsymbol{y} \, ] \end{align}

MMEs

By equating these derivatives to zero, the system of equations known as the mixed model equations (MMEs) can be written in matrix form as:

\[ \begin{bmatrix} X^T R^{-1} X & X^T R^{-1} Z \\ Z^T R^{-1} X & Z^T R^{-1} Z + G^{-1} \end{bmatrix} \begin{bmatrix} \hat{\boldsymbol{\beta}} \\ \hat{\boldsymbol{u}} \end{bmatrix} = \begin{bmatrix} X^T R^{-1} \boldsymbol{y} \\ Z^T R^{-1} \boldsymbol{y} \end{bmatrix} \]

where the square matrix on the left-hand side, of dimensions $(p+q) \times (p+q)$, is called the coefficient matrix, denoted $C$.

BLUEs and BLUPs

Developing the MMEs gives two equations:

\[ X^T R^{-1} X \hat{\boldsymbol{\beta}} + X^T R^{-1} Z \hat{\boldsymbol{u}} = X^T R^{-1} \boldsymbol{y} \]

\[ Z^T R^{-1} X \hat{\boldsymbol{\beta}} + (Z^T R^{-1} Z + G^{-1}) \hat{\boldsymbol{u}} = Z^T R^{-1} \boldsymbol{y} \]

Inserting into the first equation the expression of $\hat{\boldsymbol{u}}$ given by the second, we get the BLUE of $\boldsymbol{\beta}$, denoted $\hat{\boldsymbol{\beta}}$:

\begin{align} X^T (R^{-1} - R^{-1} Z (Z^T R^{-1} Z + G^{-1})^{-1} Z^T R^{-1}) X \hat{\boldsymbol{\beta}} &= X^T (R^{-1} - R^{-1} Z (Z^T R^{-1} Z + G^{-1})^{-1} Z^T R^{-1}) \boldsymbol{y} \\ X^T (Z G Z^T + R)^{-1} X \hat{\boldsymbol{\beta}} &= X^T (Z G Z^T + R)^{-1} \boldsymbol{y} \\ X^T V^{-1} X \hat{\boldsymbol{\beta}} &= X^T V^{-1} \boldsymbol{y} \\ \hat{\boldsymbol{\beta}} &= (X^T V^{-1} X)^- X^T V^{-1} \boldsymbol{y} \end{align}

Similarly, we get the BLUP of $\boldsymbol{u}$, denoted $\hat{\boldsymbol{u}}$:

\begin{align} \hat{\boldsymbol{u}} &= (Z^T R^{-1} Z + G^{-1})^{-1} Z^T R^{-1} (\boldsymbol{y} - X \hat{\boldsymbol{\beta}}) \\ &= G Z^T V^{-1} (\boldsymbol{y} - X \hat{\boldsymbol{\beta}}) \end{align}

In practice, the unknown $\theta$ is replaced by its estimate $\hat{\theta}$: one then speaks of "empirical" BLUEs and BLUPs, denoted eBLUEs and eBLUPs.

The formulas above require the inverse of $V$, which is computationally heavy to obtain. Fortunately, simplifications can be derived based on the inverse of the coefficient matrix $C$.

Inverse of $C$

The coefficient matrix $C$ of the MMEs is a block matrix, i.e., it is partitioned into four submatrices $C_{11}$, $C_{12}$, $C_{21}$ and $C_{22}$:

\[ C := \begin{bmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{bmatrix} := \begin{bmatrix} X^T R^{-1} X & X^T R^{-1} Z \\ Z^T R^{-1} X & Z^T R^{-1} Z + G^{-1} \end{bmatrix} \]

with $C_{21} = C_{12}^T$.

As such, it can be inverted blockwise:

\[ C^{-1} := \begin{bmatrix} C^{11} & C^{12} \\ C^{21} & C^{22} \end{bmatrix} \]

Technically, as it is easier to invert matrices that are lower or upper triangular, an LDU decomposition is first applied to $C$. Such a decomposition involves the Schur complements of the diagonal blocks, which are then inverted. Below, $C / C_{ii}$ denotes the Schur complement of block $C_{ii}$ of matrix $C$.

For the block in the top-left corner:

\[ C^{11} = (C / C_{22})^{-1} \; \text{ with } \; C / C_{22} := C_{11} - C_{12} C_{22}^{-1} C_{21} \]

For the block in the bottom-right corner:

\[ C^{22} = (C / C_{11})^{-1} \; \text{ with } \; C / C_{11} := C_{22} - C_{21} C_{11}^{-1} C_{12} \]

Developing the first equation above gives:

\begin{align} C^{11} &= [ C_{11} - C_{12} C_{22}^{-1} C_{21} ]^{-1} \\ &= [ X^T R^{-1} X - X^T R^{-1} Z (Z^T R^{-1} Z + G^{-1})^{-1} Z^T R^{-1} X ]^{-1} \\ &= [ X^T ( R^{-1} - R^{-1} Z (Z^T R^{-1} Z + G^{-1})^{-1} Z^T R^{-1} ) X ]^{-1} \\ &= (X^T V^{-1} X)^{-1} \end{align}

Developing the second equation above gives:

\begin{align} C^{22} &= [ C_{22} - C_{21} C_{11}^{-1} C_{12} ]^{-1} \\ &= [ Z^T R^{-1} Z + G^{-1} - Z^T R^{-1} X (X^T R^{-1} X)^{-1} X^T R^{-1} Z ]^{-1} \end{align}
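These Schur-complement formulas can be checked numerically on a small, arbitrary example (all objects below are toy quantities used only for illustration):

set.seed(123)
p.toy <- 2; q.toy <- 4; n.toy <- 30
X.toy <- cbind(1, rnorm(n.toy))
Z.toy <- matrix(rbinom(n.toy * q.toy, size=1, prob=0.5), nrow=n.toy, ncol=q.toy)
Rinv.toy <- diag(n.toy) # assumed: R = Id_n
Ginv.toy <- diag(q.toy) # assumed: G = Id_q
C11 <- t(X.toy) %*% Rinv.toy %*% X.toy
C12 <- t(X.toy) %*% Rinv.toy %*% Z.toy
C21 <- t(C12)
C22 <- t(Z.toy) %*% Rinv.toy %*% Z.toy + Ginv.toy
C.toy <- rbind(cbind(C11, C12), cbind(C21, C22))
Cinv.toy <- solve(C.toy)
## compare the blocks of C^{-1} with the inverted Schur complements
max(abs(Cinv.toy[1:p.toy, 1:p.toy] -
        solve(C11 - C12 %*% solve(C22) %*% C21))) # ~ 0
max(abs(Cinv.toy[(p.toy+1):(p.toy+q.toy), (p.toy+1):(p.toy+q.toy)] -
        solve(C22 - C21 %*% solve(C11) %*% C12))) # ~ 0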

Moreover, McLean and Sanders (1988) showed the following:

Simplification of the BLUP

It can now be shown that:

\begin{align} V \, (R^{-1} - R^{-1} Z C_{22}^{-1} Z^T R^{-1}) &= (R + Z G Z^T) \, (R^{-1} - R^{-1} Z C_{22}^{-1} Z^T R^{-1}) \\ &= \text{Id} + Z G Z^T R^{-1} - Z C_{22}^{-1} Z^T R^{-1} - Z G Z^T R^{-1} Z C_{22}^{-1} Z^T R^{-1} \\ &= \text{Id} + Z G Z^T R^{-1} - Z (\text{Id} + G Z^T R^{-1} Z) C_{22}^{-1} Z^T R^{-1} \\ &= \text{Id} + Z G Z^T R^{-1} - Z G (G^{-1} + Z^T R^{-1} Z) C_{22}^{-1} Z^T R^{-1} \\ &= \text{Id} + Z G Z^T R^{-1} - Z G C_{22} C_{22}^{-1} Z^T R^{-1} \\ &= \text{Id} \end{align}

Hence:

\[ V^{-1} = R^{-1} - R^{-1} Z C_{22}^{-1} Z^T R^{-1} \]

Usually, $R^{-1}$ has a structure allowing it to be easily computed. Moreover, $C_{22} = Z^T R^{-1} Z + G^{-1}$, the lower-right block of $C$, is $q \times q$ with $q \ll n$, making it much easier to invert than $V$, which is $n \times n$.
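This identity (a case of the Woodbury matrix identity) can also be checked numerically on a small, arbitrary example, here with diagonal $G$ and $R$ chosen only for illustration:

set.seed(1859)
n.toy <- 20; q.toy <- 5
Z.toy <- matrix(rbinom(n.toy * q.toy, size=1, prob=0.5), nrow=n.toy, ncol=q.toy)
G.toy <- 0.8 * diag(q.toy) # assumed: diagonal G
R.toy <- 1.5 * diag(n.toy) # assumed: diagonal R
V.toy <- Z.toy %*% G.toy %*% t(Z.toy) + R.toy
Rinv.toy <- solve(R.toy)
C22inv.toy <- solve(t(Z.toy) %*% Rinv.toy %*% Z.toy + solve(G.toy)) # C_22^{-1}
Vinv.toy <- Rinv.toy - Rinv.toy %*% Z.toy %*% C22inv.toy %*% t(Z.toy) %*% Rinv.toy
max(abs(Vinv.toy - solve(V.toy))) # ~ 0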

As a result, the formula for the BLUP can be simplified:

\begin{align} \hat{\boldsymbol{u}} &= G Z^T V^{-1} (\boldsymbol{y} - X \hat{\boldsymbol{\beta}}) \\ &= G Z^T (R^{-1} - R^{-1} Z C_{22}^{-1} Z^T R^{-1}) (\boldsymbol{y} - X \hat{\boldsymbol{\beta}}) \\ &= G (Z^T R^{-1} - Z^T R^{-1} Z C_{22}^{-1} Z^T R^{-1}) (\boldsymbol{y} - X \hat{\boldsymbol{\beta}}) \\ &= G (\text{Id} - Z^T R^{-1} Z C_{22}^{-1}) Z^T R^{-1} (\boldsymbol{y} - X \hat{\boldsymbol{\beta}}) \\ &= G (C_{22} - Z^T R^{-1} Z) C_{22}^{-1} Z^T R^{-1} (\boldsymbol{y} - X \hat{\boldsymbol{\beta}}) \\ &= G (Z^T R^{-1} Z + G^{-1} - Z^T R^{-1} Z) C_{22}^{-1} Z^T R^{-1} (\boldsymbol{y} - X \hat{\boldsymbol{\beta}}) \\ &= (G Z^T R^{-1} Z + \text{Id} - G Z^T R^{-1} Z) C_{22}^{-1} Z^T R^{-1} (\boldsymbol{y} - X \hat{\boldsymbol{\beta}}) \\ &= C_{22}^{-1} Z^T R^{-1} (\boldsymbol{y} - X \hat{\boldsymbol{\beta}}) \end{align}

TODO: also simplify the formula for the BLUE of $\boldsymbol{\beta}$

Estimation and prediction variances

The following formulas were derived in appendix A of Henderson (1975), where he first defined $Q_1$ and $Q_2$ after inverting the MMEs:

$\begin{bmatrix} \hat{\boldsymbol{\beta}} \\ \hat{\boldsymbol{u}} \end{bmatrix} = \begin{bmatrix} C^{11} & C^{12} \\ C^{21} & C^{22} \end{bmatrix} \begin{bmatrix} X^T R^{-1} \boldsymbol{y} \\ Z^T R^{-1} \boldsymbol{y} \end{bmatrix} = \begin{bmatrix} Q_1^T \boldsymbol{y} \\ Q_2^T \boldsymbol{y} \end{bmatrix}$

so that:

\[ Q_1^T = C^{11} X^T R^{-1} + C^{12} Z^T R^{-1} \]

and:

\[ Q_2^T = C^{21} X^T R^{-1} + C^{22} Z^T R^{-1} \]

Useful identities

\begin{align} \begin{bmatrix} Q_1^T \\ Q_2^T \end{bmatrix} \begin{bmatrix} X & Z \end{bmatrix} &= \begin{bmatrix} C^{11} & C^{12} \\ C^{21} & C^{22} \end{bmatrix} \begin{bmatrix} X^T R^{-1} X & X^T R^{-1} Z \\ Z^T R^{-1} X & Z^T R^{-1} Z \end{bmatrix} \\ &= \begin{bmatrix} C^{11} & C^{12} \\ C^{21} & C^{22} \end{bmatrix} \left( \begin{bmatrix} X^T R^{-1} X & X^T R^{-1} Z \\ Z^T R^{-1} X & Z^T R^{-1} Z + G^{-1} \end{bmatrix} - \begin{bmatrix} 0 & 0 \\ 0 & G^{-1} \end{bmatrix} \right) \\ &= C^{-1} C - \begin{bmatrix} C^{11} & C^{12} \\ C^{21} & C^{22} \end{bmatrix} \begin{bmatrix} 0 & 0 \\ 0 & G^{-1} \end{bmatrix} \\ &= \begin{bmatrix} \text{Id} & 0 \\ 0 & \text{Id} \end{bmatrix} - \begin{bmatrix} 0 & C^{12} G^{-1} \\ 0 & C^{22} G^{-1} \end{bmatrix} \end{align}

so that:

\[ Q_1^T X = \text{Id} \; , \quad Q_1^T Z = - C^{12} G^{-1} \; , \quad Q_2^T X = 0 \; , \quad Q_2^T Z = \text{Id} - C^{22} G^{-1} \]

$Var(\hat{\boldsymbol{\beta}})$

When $A$ is a constant matrix and $x$ a random vector, it is known that $Var(A x) = A Var(x) A^T$.

Hence:

\begin{align} Var(\hat{\boldsymbol{\beta}}) &= Var((X^T V^{-1} X)^{-1} X^T V^{-1} \boldsymbol{y}) \\ &= (X^T V^{-1} X)^{-1} X^T V^{-1} \; V \; V^{-1} X (X^T V^{-1} X)^{-1} \\ &= (X^T V^{-1} X)^{-1} X^T V^{-1} X (X^T V^{-1} X)^{-1} \\ &= (X^T V^{-1} X)^{-1} \\ &= C^{11} \end{align}

$Var(\hat{\boldsymbol{u}})$

\begin{align} Var(\hat{\boldsymbol{u}}) &= Var(Q_2^T \boldsymbol{y}) \\ &= Q_2^T (Z G Z^T + R) Q_2 \\ &= Q_2^T Z G Z^T Q_2 + Q_2^T R Q_2 \\ &= (Id - C^{22} G^{-1}) G (Id - C^{22} G^{-1})^T + Q_2^T R (R^{-1} X C^{12} + R^{-1} Z C^{22,T}) \\ &= (Id - C^{22} G^{-1}) G (Id - G^{-1} C^{22}) + 0 + (Id - C^{22} G^{-1}) C^{22} \\ &= (G - C^{22})(Id - G^{-1} C^{22}) + (Id - C^{22} G^{-1}) C^{22} \\ &= (G - C^{22} - C^{22} + C^{22} G^{-1} C^{22}) + (C^{22} - C^{22} G^{-1} C^{22}) \\ &= G - C^{22} \end{align}

$Cov(\hat{\boldsymbol{u}}, \boldsymbol{u})$

\begin{align} Cov(\hat{\boldsymbol{u}}, \boldsymbol{u}) &= Cov(Q_2^T \boldsymbol{y}, \boldsymbol{u}) \\ &= Cov(Q_2^T Z \boldsymbol{u}, \boldsymbol{u}) \\ &= Q_2^T Z G \\ &= (Id - C^{22} G^{-1}) G \\ &= G - C^{22} \\ &= Var(\hat{\boldsymbol{u}}) \end{align}

$Var(\hat{\boldsymbol{u}} - \boldsymbol{u})$

\begin{align} Var(\hat{\boldsymbol{u}} - \boldsymbol{u}) &= Var(\hat{\boldsymbol{u}}) - 2 Cov(\hat{\boldsymbol{u}}, \boldsymbol{u}) + Var(\boldsymbol{u}) \\ &= Var(\boldsymbol{u}) - Var(\hat{\boldsymbol{u}}) \\ &= G - (G - C^{22}) \\ &= C^{22} \end{align}

Compared to $Var(\hat{\boldsymbol{u}})$, $Var(\hat{\boldsymbol{u}} - \boldsymbol{u})$ takes into account the variance of $\boldsymbol{u}$.

The prediction error is the difference between the predictor, $\hat{\boldsymbol{u}}$, and the predictand, $\boldsymbol{u}$. Hence, $Var(\hat{\boldsymbol{u}} - \boldsymbol{u})$ is known as the prediction error variance (PEV), and its square root is known as the standard error of prediction (SEP).
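Once the MMEs have been set up, $Var(\hat{\boldsymbol{\beta}}) = C^{11}$ and the PEV $= C^{22}$ can thus be read off the inverse of the coefficient matrix. Below is a minimal sketch; getMmeVariances is a hypothetical helper (not part of the package), and its sigma2 argument handles the case where the left-hand side is assembled as $\sigma^2 C$, as in the makeMmeLhs function of the implementation below (where $R = \sigma^2 \text{Id}_n$ and $G = \sigma^2_u A$):

getMmeVariances <- function(lhs, p, q, sigma2=1){
  ## hypothetical helper: extract C^{11} and C^{22} from the inverse of the
  ## MME left-hand side; if lhs = sigma2 * C, then C^{-1} = sigma2 * solve(lhs)
  stopifnot(is.matrix(lhs), nrow(lhs) == p + q)
  Cinv <- sigma2 * solve(lhs)
  C11 <- Cinv[1:p, 1:p, drop=FALSE]                 # Var(beta.hat)
  C22 <- Cinv[(p+1):(p+q), (p+1):(p+q), drop=FALSE] # PEV = Var(u.hat - u)
  list(var.beta.hat=C11, pev=C22, sep=sqrt(diag(C22)))
}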

Implementation

Version 1

Make the MME elements

makeMmeElements <- function(y, X, Z){
  if(is.vector(y))
    y <- matrix(y, nrow=length(y), ncol=1)
  stopifnot(is.matrix(y),
            is.matrix(X),
            is.matrix(Z),
            ncol(y) == 1,
            nrow(y) == nrow(X),
            nrow(X) == nrow(Z))

  ## for the left-hand side
  tX.X <- crossprod(X, X)
  tZ.X <- crossprod(Z, X)
  tX.Z <- crossprod(X, Z)
  tZ.Z <- crossprod(Z, Z)

  ## for the right-hand side
  tX.y <- crossprod(X, y)
  tZ.y <- crossprod(Z, y)

  return(list(tX.X=tX.X, tZ.X=tZ.X, tX.Z=tX.Z, tZ.Z=tZ.Z,
              tX.y=tX.y, tZ.y=tZ.y))
}

Make the MME right-hand side

makeMmeRhs <- function(tX.y, tZ.y){
  stopifnot(is.matrix(tX.y),
            ncol(tX.y) == 1,
            is.matrix(tZ.y),
            ncol(tZ.y) == 1)

  rhs <- c(tX.y, tZ.y)

  return(rhs)
}

Make the MME left-hand side

makeMmeLhs <- function(tX.X, tZ.X, tX.Z, tZ.Z, lambda, Ainv){
  stopifnot(is.matrix(tX.X),
            is.matrix(tZ.X),
            is.matrix(tX.Z),
            is.matrix(tZ.Z),
            nrow(tX.X) == nrow(tX.Z),
            nrow(tZ.Z) == nrow(tZ.X),
            is.numeric(lambda),
            length(lambda) == 1,
            is.matrix(Ainv))

  p <- nrow(tX.X)
  q <- nrow(tZ.Z)
  lhs <- matrix(data=NA, nrow=p+q, ncol=p+q)

  lhs[1:p, 1:p] <- tX.X
  lhs[(p+1):(p+q), 1:p] <- tZ.X
  lhs[1:p, (p+1):(p+q)] <- tX.Z
  lhs[(p+1):(p+q), (p+1):(p+q)] <- tZ.Z + lambda * Ainv

  return(lhs)
}

Solve

solveMme_v1 <- function(y, X, Z, sigma.u2, Ainv, sigma2){
  if(is.vector(y))
    y <- matrix(y, nrow=length(y), ncol=1)
  stopifnot(is.matrix(y),
            is.matrix(X),
            is.matrix(Z),
            is.matrix(Ainv),
            ncol(y) == 1,
            nrow(y) == nrow(X),
            nrow(X) == nrow(Z),
            nrow(Ainv) == ncol(Z))

  elems <- makeMmeElements(y=y, X=X, Z=Z)

  rhs <- makeMmeRhs(tX.y=elems$tX.y, tZ.y=elems$tZ.y)

  lhs <- makeMmeLhs(tX.X=elems$tX.X, tZ.X=elems$tZ.X,
                    tX.Z=elems$tX.Z, tZ.Z=elems$tZ.Z,
                    lambda=sigma2 / sigma.u2, Ainv=Ainv)

  sol <- solve(lhs, rhs) # faster than solve(lhs) %*% rhs

  return(as.vector(sol)) # c(BLUE of beta, BLUP of u)
}

Version 2

solveMme_v2 <- function(y, X, Z, sigma.u2, Ginv, Rinv){
  if(is.vector(y))
    y <- matrix(y, nrow=length(y), ncol=1)
  stopifnot(is.matrix(y),
            is.matrix(X),
            is.matrix(Z),
            is.matrix(Ginv),
            is.matrix(Rinv),
            ncol(y) == 1,
            nrow(y) == nrow(X),
            nrow(X) == nrow(Z),
            nrow(Ginv) == ncol(Ginv),
            nrow(Ginv) == ncol(Z),
            nrow(Rinv) == ncol(Rinv),
            nrow(Rinv) == nrow(y))

  tX <- t(X)
  tZ <- t(Z)

  ## W = C_22^{-1} = (Z' R^{-1} Z + G^{-1})^{-1}, a q x q matrix
  W <- solve(tZ %*% Rinv %*% Z + Ginv)

  ## naive alternative (not run): build the n x n matrix V explicitly and invert it
  if(FALSE){
    V <- Z %*% solve(Ginv) %*% tZ + solve(Rinv)
    Vinv <- solve(V)
  }
  ## cheaper: V^{-1} = R^{-1} - R^{-1} Z C_22^{-1} Z' R^{-1}
  Vinv <- Rinv - Rinv %*% Z %*% W %*% tZ %*% Rinv
  BLUE_beta <- solve(tX %*% Vinv %*% X, tX %*% Vinv %*% y)

  ## BLUP: u.hat = C_22^{-1} Z' R^{-1} (y - X beta.hat)
  BLUP_u <- W %*% tZ %*% Rinv %*% (y - X %*% BLUE_beta)

  out <- c(BLUE_beta[,1], BLUP_u[,1])
  return(out)
}

Example

suppressWarnings(suppressPackageStartupMessages(library(rutilstimflutre)))

Simulate data

set.seed(12345)

SNP genotypes

nbGenos <- 300
if(FALSE){
  nbChroms <- 10
  Ne <- 10^4
  chromLen <- 10^5
  mutRate <- 10^(-8)
  recRate <- 10^(-8)
  nbPops <- 10
  migRate <- 50 # between 0.5 and 100
  genomes <- simulCoalescent(nb.inds=nbGenos,
                             nb.reps=nbChroms,
                             pop.mut.rate=4 * Ne * mutRate * chromLen,
                             pop.recomb.rate=4 * Ne * recRate * chromLen,
                             chrom.len=chromLen,
                             nb.pops=nbPops,
                             mig.rate=migRate)
  M <- genomes$genos
} else{
  nbSnps <- 2000
  nbPops <- 3
  divBtwPops <- diag(nbPops)
  divBtwPops[upper.tri(divBtwPops)] <- 0.95
  divBtwPops[lower.tri(divBtwPops)] <- divBtwPops[upper.tri(divBtwPops)]
  M <- simulGenosDoseStruct(nb.genos=rep(nbGenos / nbPops, nbPops),
                            nb.snps=nbSnps,
                            div.pops=divBtwPops)
}
dim(M)
M[1:3, 1:5]
nbSnps <- ncol(M)
levGenos <- rownames(M)

Allele frequencies

afs <- estimSnpAf(M)
plotHistAllelFreq(afs=afs)
mafs <- estimSnpMaf(M)
plotHistMinAllelFreq(mafs=mafs)

Genetic relationships

A_noia <- estimGenRel(M, afs, relationships="additive", method="noia")

hist(diag(A_noia), breaks="FD", main="Inbreeding coefficients")
abline(v=mean(diag(A_noia)), lwd=2)

hist(A_noia[upper.tri(A_noia)], main="Additive genetic relationships")
abline(v=mean(A_noia[upper.tri(A_noia)]), lwd=2)

imageWithScale(A_noia, main="Additive genetic relationships")
outPca <- pca(M)
plotPca(outPca$rot.dat, outPca$prop.vars)

Phenotypes

Each genotype is phenotyped a given number of times.

nbBlocks <- 4
levBlocks <- LETTERS[1:nbBlocks]
nbDat <- nbGenos * nbBlocks
dat <- data.frame(block=rep(levBlocks, each=nbGenos),
                  geno=levGenos,
                  stringsAsFactors=TRUE)
str(dat)
head(dat)

Design effects

X <- model.matrix(~ 1 + block, data=dat)
mu <- 50
effBlocks <- rnorm(n=nbBlocks - 1, mean=0, sd=10)
beta <- c(mu, effBlocks)
nbFixEffs <- length(beta)

SNP substitution effects

sigma_alpha2 <- 10^(-3)
alpha <- rnorm(n=ncol(M), mean=0, sd=sqrt(sigma_alpha2))
names(alpha) <- colnames(M)

Genotypic values

tmp <- matrix(rep(1, nbGenos)) %*% (2 * afs)
Z_u <- M - tmp
u <- Z_u %*% alpha
Z <- model.matrix(~ 0 + geno, data=dat)
## image(t(Z)[, nrow(Z):1])
nbRndEffs <- length(u)
(sigma_u2 <- sigma_alpha2 * 2 * sum(afs * (1 - afs)))

Errors

h2 <- 0.7
(sigma2 <- ((1 - h2) / h2) * sigma_u2)
epsilon <- rnorm(n=nbDat, mean=0, sd=sqrt(sigma2))

Phenotypic values

y <- X %*% beta + Z %*% u + epsilon

Compute BLUEs and BLUPs

(n <- nrow(y))
(p <- ncol(X))
(q <- ncol(Z))

Version 1

system.time(
    out <- solveMme_v1(y=y, X=X, Z=Z, sigma.u2=sigma_u2, Ainv=solve(A_noia), sigma2=sigma2))
(checkFixEffs <- cbind(truth=beta,
                       BLUE=out[1:p]))
checkRndEffs <- cbind(truth=u[,1],
                      BLUP=out[(p+1):(p+q)])
cor(checkRndEffs[,"truth"], checkRndEffs[,"BLUP"])

Version 2

system.time(
    out <- solveMme_v2(y=y, X=X, Z=Z, sigma.u2=sigma_u2,
                       Ginv=solve(sigma_u2 * A_noia), # G = sigma_u2 * A
                       Rinv=diag(nrow(y)) / sigma2))  # R = sigma2 * Id
(checkFixEffs <- cbind(truth=beta,
                       BLUE=out[1:p]))
checkRndEffs <- cbind(truth=u[,1],
                      BLUP=out[(p+1):(p+q)])
cor(checkRndEffs[,"truth"], checkRndEffs[,"BLUP"])
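Cross-check with the direct formulas

As a sanity check, the BLUE and BLUP can also be computed with the direct, $V$-based formulas derived above, here assuming $G = \sigma^2_u A$ and $R = \sigma^2 \text{Id}_n$ (this requires inverting the $n \times n$ matrix $V$, which is still feasible at this size):

## assumes the columns of Z are in the same order as the rows/columns of A_noia,
## as in the MME-based versions above
V <- sigma_u2 * Z %*% A_noia %*% t(Z) + sigma2 * diag(nbDat)
Vinv <- solve(V)
BLUE_beta_direct <- solve(t(X) %*% Vinv %*% X, t(X) %*% Vinv %*% y)
BLUP_u_direct <- sigma_u2 * A_noia %*% t(Z) %*% Vinv %*% (y - X %*% BLUE_beta_direct)
cbind(truth=beta, BLUE=BLUE_beta_direct[,1])
cor(u[,1], BLUP_u_direct[,1])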

References


