Home

/

GitHub

/

charlotte-ngs/lbgfs2021

/

In charlotte-ngs/lbgfs2021: Livestock Breeding and Genomics (ETHZ) - Fall Semester 2021

knitr::opts_chunk$set(echo = TRUE, results = 'markup')

In single-step genomic evaluations using breeding-value based models, the genomic relationship matrix is required as one part of the mixed model equations. In this notebook, we use the data set from exercise 12 to show how the genomic relationship matrix is computed.

s_data_path <- "https://charlotte-ngs.github.io/lbgfs2021/data/geno_data.csv"
tbl_geno_data <- readr::read_csv(file = s_data_path)
tbl_geno_data

The genomic relationship matrix can be computed as shown on the slides

How To Compute $G$

Read matrix $W$
For each column $j$ of $W$ compute frequency $p_j$
Compute matrix $S$ and $\sum_{j=1}^k(1-2p_j(1-p_j))$ from $p_j$
Compute $U$ from $W$ and $S$
Compute $G$

Step 1: Matrix $W$

Here we obtain matrix $W$ from the data

W <- as.matrix(tbl_geno_data[,c(2,3)])
W

Step 2: Frequency

To compute te frequency of the positive allele it is better to change the genotype encoding from $-1$, $0$, $1$ to $0$, $1$ , $2$. This is done by adding $1$ to all entries of $W$.

W <- W + 1
W

The frequency of the positive alleles can be computed as the column mean divided by $2$. Hence

freq <- apply(W, 2, mean) / 2
freq

The matrix $S$ is computed as

S <- matrix(2*freq - 1, nrow = nrow(W), ncol = ncol(W), byrow = TRUE)
S

The sum over the crossproducts is computed as

sumpq <- 2 * sum(freq * (1-freq))
sumpq

The genomic relationsip matrix is the

U = W - S - 1
U

G <- tcrossprod(U) / sumpq
G

The same computation can also be done using the following function

computeMatGrm <- function(pmatData) {
  matData <- pmatData
  # check the coding, if matData is -1, 0, 1 coded, then add 1 to get to 0, 1, 2 coding
  if (min(matData) < 0) matData <- matData + 1
  # Allele frequencies, column vector of P and sum of frequency products
  freq <- apply(matData, 2, mean) / 2
  P <- 2 * (freq - 0.5)
  sumpq <- sum(freq*(1-freq))
  # Changing the coding from (0,1,2) to (-1,0,1) and subtract matrix P
  Z <- matData - 1 - matrix(P, nrow = nrow(matData), 
                             ncol = ncol(matData), 
                             byrow = TRUE)
  # Z%*%Zt is replaced by tcrossprod(Z)
  return(tcrossprod(Z)/(2*sumpq))
}

Once the function is defined, the matrix $G$ can be computed as

G <- computeMatGrm(pmatData = as.matrix(tbl_geno_data[,c(2,3)]))
G

In most cases the computed genomic relationship matrix is singular and can therefore not be inverted. The solution to that is to add $0.1$ times the numeratore relationship matrix $A$ to the matrix $G$. If there is no information about the pedigree available as in this dataset, then we can set the numerator relationship matrix equal to the identity matrix.

# the following results in an error
solve(G)

Gstar <- G + 0.1 * diag(1, nrow = nrow(G))
Gstar

solve(Gstar)

charlotte-ngs/lbgfs2021 documentation built on Dec. 19, 2021, 3:01 p.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

charlotte-ngs/lbgfs2021
Livestock Breeding and Genomics (ETHZ) - Fall Semester 2021

In charlotte-ngs/lbgfs2021: Livestock Breeding and Genomics (ETHZ) - Fall Semester 2021

How To Compute $G$

Step 1: Matrix $W$

Step 2: Frequency

R Package Documentation

Browse R Packages

We want your feedback!

charlotte-ngs/lbgfs2021 Livestock Breeding and Genomics (ETHZ) - Fall Semester 2021

In charlotte-ngs/lbgfs2021: Livestock Breeding and Genomics (ETHZ) - Fall Semester 2021

How To Compute $G$

Step 1: Matrix $W$

Step 2: Frequency

R Package Documentation

Browse R Packages

We want your feedback!

charlotte-ngs/lbgfs2021
Livestock Breeding and Genomics (ETHZ) - Fall Semester 2021