knitr::opts_chunk$set(echo = FALSE, results = "asis")
knitr::knit_hooks$set(hook_convert_odg = rmdhelp::hook_convert_odg)

Why

Sire Model

\begin{equation} y = X\beta + Z_ss + e \notag \end{equation}

with $var(e) = R$, $var(s) = A_s \sigma_s^2$ and $var(y) = Z_sA_sZ_s^T \sigma_s^2 + R$

$\rightarrow$ estimate $\sigma_s^2$ and $\sigma_e^2$ from data
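
As a minimal sketch of this variance structure, the following hypothetical example builds $var(y)$ for three observations on two unrelated sires ($A_s = I$, $R = I\sigma_e^2$); all values and object names are made up for illustration.

### # hypothetical incidence matrix: observations 1 and 3 from sire 1, observation 2 from sire 2
mat_zs_hyp <- matrix(c(1, 0,
                       0, 1,
                       1, 0), nrow = 3, byrow = TRUE)
mat_as_hyp <- diag(2)          ### # unrelated sires: A_s = I
sigma_s2_hyp <- 0.25           ### # hypothetical sire variance
sigma_e2_hyp <- 1.0            ### # hypothetical residual variance
mat_r_hyp <- diag(3) * sigma_e2_hyp
### # var(y) = Z_s A_s Z_s^T sigma_s^2 + R
mat_var_y_hyp <- mat_zs_hyp %*% mat_as_hyp %*% t(mat_zs_hyp) * sigma_s2_hyp + mat_r_hyp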

Analysis of Variance (ANOVA)

\tiny

\begin{tabular}{lll}
\hline
Source  &  Degrees of Freedom ($df$)  &  Sums of Squares ($SSQ$) \\
\hline
Overall ($\mu$)  &  $Rank(X)=1$  &  $y^TX(X^TX)^{-1}X^Ty = F$ \\
Sires ($s$)  &  $Rank(Z_s) - Rank(X) = q - 1$  &  $y^TZ_s(Z_s^TZ_s)^{-1}Z_s^Ty - y^TX(X^TX)^{-1}X^Ty = S$ \\
Residual ($e$)  &  $n - Rank(Z_s) = n - q$  &  $y^Ty - y^TZ_s(Z_s^TZ_s)^{-1}Z_s^Ty = R$ \\
\hline
Total  &  $n$  &  $y^Ty$ \\
\hline
\end{tabular}
\normalsize

Sums of Squares

$$F = y^TX(X^TX)^{-1}X^Ty = {1\over n} \left[\sum_{i=1}^n y_i \right]^2$$

$$S= y^TZ_s(Z_s^TZ_s)^{-1}Z_s^Ty - y^TX(X^TX)^{-1}X^Ty = \sum_{i=1}^{q} {1 \over n_i} \left[\sum_{j=1}^{n_i} y_{ij}\right]^2 - F $$

$$R = y^Ty - y^TZ_s(Z_s^TZ_s)^{-1}Z_s^Ty = \sum_{i=1}^n y_i^2 - S - F$$
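
The scalar short-cuts can be checked against the quadratic forms with a minimal sketch on a hypothetical data vector of five observations on two sires; all objects carrying the suffix `_toy` are made up for this illustration.

### # hypothetical toy data: five observations on two sires (q = 2)
y_toy <- c(4.5, 3.9, 5.1, 4.2, 4.8)
grp_toy <- c(1, 1, 2, 2, 2)
mat_x_toy <- matrix(1, nrow = 5, ncol = 1)
mat_z_toy <- matrix(c(1, 0,
                      1, 0,
                      0, 1,
                      0, 1,
                      0, 1), nrow = 5, byrow = TRUE)
### # quadratic forms
ssqf_toy <- t(y_toy) %*% mat_x_toy %*% solve(crossprod(mat_x_toy)) %*% t(mat_x_toy) %*% y_toy
ssqs_toy <- t(y_toy) %*% mat_z_toy %*% solve(crossprod(mat_z_toy)) %*% t(mat_z_toy) %*% y_toy - ssqf_toy
ssqr_toy <- sum(y_toy^2) - ssqs_toy - ssqf_toy
### # scalar short-cuts give the same values
ssqf_sc <- sum(y_toy)^2 / length(y_toy)
ssqs_sc <- sum(tapply(y_toy, grp_toy, sum)^2 / tapply(y_toy, grp_toy, length)) - ssqf_sc
ssqr_sc <- sum(y_toy^2) - ssqs_sc - ssqf_sc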

Estimates

$$E(R) = (n-q) \sigma_e^2$$

$$E(S) = (q-1) \sigma_e^2 + tr(Z_s^TMZ_s)\sigma_s^2$$ where $M = I - X(X^TX)^{-1}X^T$ and $q$ is the number of sires.

$\rightarrow$ $\widehat{\sigma_e^2} = \frac{R}{n-q}$ and $\widehat{\sigma_s^2} = \frac{S - (q-1)\widehat{\sigma_e^2}}{tr(Z_s^TMZ_s)}$

Numerical Example

tbl_num_ex_chp12 <- tibble::tibble( Animal = c(4, 5, 6, 7),
                                        Sire   = c(2, 1, 3, 2),
                                        WWG    =  c(2.9, 4.0, 3.5, 3.5) )

knitr::kable(tbl_num_ex_chp12,
             booktabs  = TRUE,
             longtable = TRUE,
             caption   = "Small Example Dataset for Variance Components Estimation Using a Sire Model")

$$y_{ij} = \mu + s_i + e_{ij}$$

where $y_{ij}$ is observation $j$ of sire $i$.

Design Matrices

n_nr_obs_p02 <- nrow(tbl_num_ex_chp12)
### # design matrix X
mat_x_p02 <- matrix(1, nrow = n_nr_obs_p02, ncol = 1)
### # design matrix Z
mat_z_p02 <- matrix(c(0, 1, 0,
                      1, 0, 0,
                      0, 0, 1,
                      0, 1, 0), nrow = n_nr_obs_p02, byrow = TRUE)
n_nr_sire <- ncol(mat_z_p02)
### # Observations
mat_obs <- matrix(tbl_num_ex_chp12$WWG, ncol = 1)

cat("$$\n")
cat(paste(rmdhelp::bmatrix(pmat = mat_x_p02, ps_name = 'X'), collapse = '\n'), '\n')
cat("\\text{, }")
cat(paste(rmdhelp::bmatrix(pmat = mat_z_p02, ps_name = 'Z_s'), collapse = '\n'), '\n')
cat("$$\n")

ANOVA

### # compute F
ytx <- crossprod(mat_obs,mat_x_p02)
xtx <- crossprod(mat_x_p02)
ssqf <- ytx %*% solve(xtx) %*% t(ytx)
### # compute S
ytz <- crossprod(mat_obs, mat_z_p02)
ztz <- crossprod(mat_z_p02)
ssqs <- ytz %*% solve(ztz) %*% t(ytz) - ssqf
### # compute R
yty <- crossprod(mat_obs)
ssqr <- yty - ssqs - ssqf

An analysis of variance can be constructed as

\begin{center}
\begin{tabular}{lll}
\hline
Source  &  Degrees of Freedom ($df$)  &  Sums of Squares ($SSQ$) \\
\hline
Overall ($\mu$)  &  $Rank(X)=1$  &  $F = `r ssqf`$ \\
Sires ($s$)  &  $Rank(Z_s) - Rank(X) = q - 1$  &  $S = `r ssqs`$ \\
Residual ($e$)  &  $n - Rank(Z_s) = n - q$  &  $R = `r ssqr`$ \\
\hline
\end{tabular}
\end{center}

Estimates

mat_m <- diag(n_nr_obs_p02) - mat_x_p02 %*% solve(xtx) %*% t(mat_x_p02)
ztmz <- t(mat_z_p02) %*% mat_m %*% mat_z_p02
tr_ztmz <- sum(diag(ztmz))
cat(paste(rmdhelp::bmatrix(pmat = mat_m, ps_name = 'M', ps_env = '$$'), collapse = '\n'), '\n')
cat(paste(rmdhelp::bmatrix(pmat = ztmz, ps_name = 'Z_s^TMZ_s', ps_env = '$$'), collapse = '\n'), '\n')

Results

### # residual variance estimate: R / (n - q)
hat_sigmae2 <- ssqr / (n_nr_obs_p02 - n_nr_sire)
### # sire variance estimate: (S - (q-1) * sigma_e^2) / tr(Z_s^T M Z_s)
hat_sigmas2 <- (ssqs - (n_nr_sire-1) * hat_sigmae2) / tr_ztmz

$$\widehat{\sigma_e^2} = \frac{R}{n-q} = `r hat_sigmae2`$$
$$\widehat{\sigma_s^2} = \frac{S - (q-1)\widehat{\sigma_e^2}}{tr(Z_s^TMZ_s)} = \frac{`r ssqs` - (`r n_nr_sire` - 1) \cdot `r hat_sigmae2`}{`r tr_ztmz`} = `r hat_sigmas2`$$

ANOVA in R

tbl_num_ex_chp12$Sire <- as.factor(tbl_num_ex_chp12$Sire)
aov_result <- aov(WWG ~ Sire, data = tbl_num_ex_chp12)
summary(aov_result)
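
The same estimates can be recovered from the `aov` output with a minimal sketch, re-using `tr_ztmz` and `n_nr_sire` from above; with $n - q = 1$ residual degree of freedom the residual mean square equals $R$.

### # extract the anova table; row 1 is the Sire line, row 2 the Residuals line
tbl_aov <- summary(aov_result)[[1]]
hat_sigmae2_aov <- tbl_aov[2, "Mean Sq"]
### # sire variance from the same moment equation as above
hat_sigmas2_aov <- (tbl_aov[1, "Sum Sq"] - (n_nr_sire - 1) * hat_sigmae2_aov) / tr_ztmz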

Likelihood

$$L(\mu, \Sigma) = f(y | \mu, \Sigma)$$

with

$$ f_Y(y | \mu, \Sigma) = \frac{1}{\sqrt{(2\pi)^n \det(\Sigma)}} \exp\left\{-{1\over 2}(y - \mu)^T \Sigma^{-1}(y - \mu) \right\} $$
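
A minimal sketch of this density on the log scale for the sire model of the numerical example, assuming unrelated sires ($A_s = I$) and $R = I\sigma_e^2$, so that $\Sigma = Z_sZ_s^T\sigma_s^2 + I\sigma_e^2$; the function name `log_lik_sire` and the trial values in the call are made up for illustration.

### # log of the multivariate normal density for given mu and variance components
log_lik_sire <- function(mu, sigmas2, sigmae2, y, X, Z) {
  n <- length(y)
  mat_sigma <- Z %*% t(Z) * sigmas2 + diag(n) * sigmae2
  vec_dev <- y - X %*% mu
  as.numeric(-0.5 * (n * log(2 * pi) + determinant(mat_sigma)$modulus +
                       t(vec_dev) %*% solve(mat_sigma) %*% vec_dev))
}
### # evaluate at arbitrary trial values for the variance components
log_lik_sire(mu = mean(tbl_num_ex_chp12$WWG), sigmas2 = 0.1, sigmae2 = 0.2,
             y = as.vector(mat_obs), X = mat_x_p02, Z = mat_z_p02)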

Maximum Likelihood

$$\hat{\Sigma} = argmax_{\Sigma} L(\mu, \Sigma)$$
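
A minimal sketch of this maximisation with `optim()`, re-using `log_lik_sire` from above; the variance components are parametrised on the log scale to keep them positive, and the starting values are arbitrary.

### # negative log-likelihood as a function of (mu, log sigma_s^2, log sigma_e^2)
neg_log_lik <- function(par) {
  -log_lik_sire(mu = par[1], sigmas2 = exp(par[2]), sigmae2 = exp(par[3]),
                y = as.vector(mat_obs), X = mat_x_p02, Z = mat_z_p02)
}
ml_fit <- optim(par = c(mean(tbl_num_ex_chp12$WWG), 0, 0), fn = neg_log_lik)
### # ML estimates of sigma_s^2 and sigma_e^2
exp(ml_fit$par[2:3])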

Bayesian Approach

\begin{align}
f(\Sigma | y) & = \frac{f(\Sigma, y)}{f(y)} \notag \\
              & = \frac{f(y | \Sigma)f(\Sigma)}{f(y)} \notag \\
              & \propto f(y | \Sigma)f(\Sigma) \notag
\end{align}

where $f(\Sigma)$ is the prior distribution and $f(y|\Sigma)$ is the likelihood.

Bayesian Estimates

$$\widehat{\sigma_s^2}_{Bayes} = {1 \over N}\sum_{t=1}^N \left(\sigma_s^2\right)^{(t)}$$

where $(\sigma_s^2)^{(t)}$ denotes the $t$-th of $N$ samples drawn from the posterior distribution $f(\sigma_s^2 | y)$, e.g. with the Gibbs sampler sketched below.
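
A minimal sketch of a single-site Gibbs sampler producing such samples for the sire model of the numerical example; unrelated sires are assumed, and the scaled inverse chi-square prior hyper-parameters `nu_prior` and `scale_prior` are made-up values chosen only for illustration.

set.seed(321)
vec_y <- as.vector(mat_obs)
n_obs <- length(vec_y)
n_sire <- ncol(mat_z_p02)
nu_prior <- 4                ### # hypothetical prior degrees of freedom
scale_prior <- 0.1           ### # hypothetical prior scale
n_iter <- 10000
### # starting values
mu <- mean(vec_y)
vec_s <- rep(0, n_sire)
sigmas2 <- sigmae2 <- 1
vec_sample_sigmas2 <- vec_sample_sigmae2 <- numeric(n_iter)
for (iter in 1:n_iter) {
  ### # sample the general mean from its full conditional
  mu <- rnorm(1, mean = mean(vec_y - mat_z_p02 %*% vec_s), sd = sqrt(sigmae2 / n_obs))
  ### # sample each sire effect from its full conditional
  lambda <- sigmae2 / sigmas2
  for (i in 1:n_sire) {
    n_i <- sum(mat_z_p02[, i])
    vec_resid <- vec_y - mu - mat_z_p02[, -i, drop = FALSE] %*% vec_s[-i]
    vec_s[i] <- rnorm(1, mean = sum(mat_z_p02[, i] * vec_resid) / (n_i + lambda),
                      sd = sqrt(sigmae2 / (n_i + lambda)))
  }
  ### # sample the variance components from their scaled inverse chi-square conditionals
  vec_e <- vec_y - mu - as.vector(mat_z_p02 %*% vec_s)
  sigmae2 <- (sum(vec_e^2) + nu_prior * scale_prior) / rchisq(1, df = n_obs + nu_prior)
  sigmas2 <- (sum(vec_s^2) + nu_prior * scale_prior) / rchisq(1, df = n_sire + nu_prior)
  vec_sample_sigmas2[iter] <- sigmas2
  vec_sample_sigmae2[iter] <- sigmae2
}
### # Bayesian estimates as posterior means over the samples
mean(vec_sample_sigmas2)
mean(vec_sample_sigmae2)

In practice the early part of the chain would be discarded as burn-in before taking these means.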


