knitr::opts_chunk$set(echo = TRUE, results = 'markup')
In single-step genomic evaluations using breeding-value based models, the genomic relationship matrix is required as one part of the mixed model equations. In this notebook, we use the data set from exercise 12 to show how the genomic relationship matrix is computed.
s_data_path <- "https://charlotte-ngs.github.io/lbgfs2021/data/geno_data.csv" tbl_geno_data <- readr::read_csv(file = s_data_path) tbl_geno_data
The genomic relationship matrix can be computed as shown on the slides
Here we obtain matrix $W$ from the data
W <- as.matrix(tbl_geno_data[,c(2,3)]) W
To compute te frequency of the positive allele it is better to change the genotype encoding from $-1$, $0$, $1$ to $0$, $1$ , $2$. This is done by adding $1$ to all entries of $W$.
W <- W + 1 W
The frequency of the positive alleles can be computed as the column mean divided by $2$. Hence
freq <- apply(W, 2, mean) / 2 freq
The matrix $S$ is computed as
S <- matrix(2*freq - 1, nrow = nrow(W), ncol = ncol(W), byrow = TRUE) S
The sum over the crossproducts is computed as
sumpq <- 2 * sum(freq * (1-freq)) sumpq
The genomic relationsip matrix is the
U = W - S - 1 U
G <- tcrossprod(U) / sumpq G
The same computation can also be done using the following function
computeMatGrm <- function(pmatData) { matData <- pmatData # check the coding, if matData is -1, 0, 1 coded, then add 1 to get to 0, 1, 2 coding if (min(matData) < 0) matData <- matData + 1 # Allele frequencies, column vector of P and sum of frequency products freq <- apply(matData, 2, mean) / 2 P <- 2 * (freq - 0.5) sumpq <- sum(freq*(1-freq)) # Changing the coding from (0,1,2) to (-1,0,1) and subtract matrix P Z <- matData - 1 - matrix(P, nrow = nrow(matData), ncol = ncol(matData), byrow = TRUE) # Z%*%Zt is replaced by tcrossprod(Z) return(tcrossprod(Z)/(2*sumpq)) }
Once the function is defined, the matrix $G$ can be computed as
G <- computeMatGrm(pmatData = as.matrix(tbl_geno_data[,c(2,3)])) G
In most cases the computed genomic relationship matrix is singular and can therefore not be inverted. The solution to that is to add $0.1$ times the numeratore relationship matrix $A$ to the matrix $G$. If there is no information about the pedigree available as in this dataset, then we can set the numerator relationship matrix equal to the identity matrix.
# the following results in an error solve(G)
Gstar <- G + 0.1 * diag(1, nrow = nrow(G)) Gstar
solve(Gstar)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.