knitr::opts_chunk$set(echo = FALSE, results = 'asis')
nHomDiff <- 15
p_orig <- p <- 0.08
p_incr <- 1.2 * p
q <- 1-p
a <- nHomDiff/2
d <- -1.5
alpha <- a + (q-p)*d
alpha_orig <- alpha
We are considering a quantitative trait that depends on a given bi-allelic locus $G$. The frequency of the favorable allele is $`r p`$. Suppose that the genotype frequencies are in Hardy-Weinberg equilibrium. The difference between the genotypic values of the two homozygous genotypes is $`r nHomDiff`$. The heterozygous genotype has a value of $`r d`$.
a) Compute the breeding values and the dominance deviations for the three genotypes.
b) Selection for the favorable allele has increased its frequency to `r p_incr`. How does this increased allele frequency change the breeding values?
Hint: Have a look at the summary table of all values in the course notes.
According to the summary table of all values in the course notes, the breeding values depend on the term $\alpha$. Therefore, we start by computing $\alpha$ first.
$$\alpha = a + (q-p)d$$
Based on the problem description, we know that $a = `r a`$, $p = `r p`$, $q = 1-p = `r q`$ and $d = `r d`$. Therefore
$$\alpha = `r a` + (`r q` - `r p`) \cdot (`r d`) = `r alpha`$$
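The value of $\alpha$ can be cross-checked with a few lines of R. This is a minimal sketch that only restates the numbers from the problem description; the variable names are chosen for illustration and are not part of the course material.

```r
# Values taken from the problem description
n_hom_diff <- 15        # difference between the two homozygous genotypes
a <- n_hom_diff / 2     # genotypic value of G1G1
d <- -1.5               # genotypic value of the heterozygote G1G2
p <- 0.08               # frequency of the favorable allele
q <- 1 - p              # frequency of the other allele

# Allele substitution effect: alpha = a + (q - p) * d
alpha <- a + (q - p) * d
print(alpha)
```

Because $d$ is negative and $q > p$, the term $(q-p)d$ is negative, so $\alpha$ is smaller than $a$.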
a) The summary table of all values then looks as follows.
tbl_genotypic_decomp <- tibble::tibble(
  Genotype = c("$G_1G_1$", "$G_1G_2$", "$G_2G_2$"),
  `Genotypic Value` = c(paste0("$a = ", a, "$"),
                        paste0("$d = ", d, "$"),
                        paste0("$-a = ", -a, "$")),
  `Breeding Value` = c(paste0("$2q\\alpha = ", 2*q*alpha, "$"),
                       paste0("$(q-p)\\alpha = ", (q-p) * alpha, "$"),
                       paste0("$-2p\\alpha = ", -2*p*alpha, "$")),
  `Dominance Deviation` = c(paste0("$-2q^2d = ", -2*q^2*d, "$"),
                            paste0("$2pqd = ", 2*p*q*d, "$"),
                            paste0("$-2p^2d = ", -2*p^2*d, "$")))
knitr::kable(tbl_genotypic_decomp, booktabs = TRUE, longtable = TRUE)
p <- p_incr
q <- 1-p
alpha <- a + (q-p)*d
alpha_incr <- alpha
b) The allele frequencies change to $p = `r p`$ and $q = `r q`$, so the value of $\alpha$ changes to $\alpha = `r alpha`$. This has consequences for the whole summary table.
tbl_genotypic_decomp <- tibble::tibble(
  Genotype = c("$G_1G_1$", "$G_1G_2$", "$G_2G_2$"),
  `Genotypic Value` = c(paste0("$a = ", a, "$"),
                        paste0("$d = ", d, "$"),
                        paste0("$-a = ", -a, "$")),
  `Breeding Value` = c(paste0("$2q\\alpha = ", 2*q*alpha, "$"),
                       paste0("$(q-p)\\alpha = ", (q-p) * alpha, "$"),
                       paste0("$-2p\\alpha = ", -2*p*alpha, "$")),
  `Dominance Deviation` = c(paste0("$-2q^2d = ", -2*q^2*d, "$"),
                            paste0("$2pqd = ", 2*p*q*d, "$"),
                            paste0("$-2p^2d = ", -2*p^2*d, "$")))
knitr::kable(tbl_genotypic_decomp, booktabs = TRUE, longtable = TRUE)
Because the allele frequency $p$ increased from `r p_orig` to `r p_incr`, the value of $\alpha$ increased. The breeding values nevertheless decreased, because the negative effect of the larger $p$ on the coefficients $2q$, $(q-p)$ and $-2p$ outweighs the positive effect of the larger $\alpha$.
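The comparison between the two allele frequencies can be reproduced with a small helper function. This is a sketch under the assumptions of the exercise ($a = 7.5$, $d = -1.5$, Hardy-Weinberg equilibrium); the function name `breeding_values()` is hypothetical and only introduced for this illustration.

```r
# Hypothetical helper: breeding values of the three genotypes for a given
# frequency p of the favorable allele
breeding_values <- function(p, a, d) {
  q <- 1 - p
  alpha <- a + (q - p) * d
  c(G1G1 = 2 * q * alpha,
    G1G2 = (q - p) * alpha,
    G2G2 = -2 * p * alpha)
}

a <- 7.5
d <- -1.5
bv_orig <- breeding_values(p = 0.08, a = a, d = d)
bv_incr <- breeding_values(p = 1.2 * 0.08, a = a, d = d)

# Change in each breeding value caused by the increased allele frequency
print(bv_incr - bv_orig)
```

All three differences are negative, which matches the statement that the breeding values decrease although $\alpha$ increases.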
What is the meaning of the term allele substitution and how big is its effect in 1a) and 1b)?
The effect of allele substitution appears as the difference between the breeding values of two genotypes where one of these genotypes has one more favorable allele than the other. For a single bi-allelic locus there are two differences that fulfill this requirement, namely $BV_{12} - BV_{22}$ and $BV_{11} - BV_{12}$. Both differences give the same result, which corresponds to $\alpha = a + (q-p)d$.
The allele substitution effect ($\alpha$) in 1a) is `r alpha_orig`; in 1b) the value is `r alpha_incr`.
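That both differences of breeding values equal $\alpha$ can be checked numerically. The following sketch uses the values of 1a) and the definition $\alpha = a + (q-p)d$ from the summary table; the variable names are illustrative.

```r
a <- 7.5
d <- -1.5
p <- 0.08
q <- 1 - p
alpha <- a + (q - p) * d

# Breeding values of the three genotypes
bv11 <- 2 * q * alpha
bv12 <- (q - p) * alpha
bv22 <- -2 * p * alpha

# Each difference spans exactly one substitution of G2 by G1
diff_11_12 <- bv11 - bv12   # (2q - (q - p)) * alpha = (q + p) * alpha = alpha
diff_12_22 <- bv12 - bv22   # ((q - p) + 2p) * alpha = (q + p) * alpha = alpha

print(c(alpha = alpha, diff_11_12 = diff_11_12, diff_12_22 = diff_12_22))
```

Both differences reduce to $(p+q)\alpha$, and since $p + q = 1$ they equal $\alpha$ for any allele frequency.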
siris_url <- "https://charlotte-ngs.github.io/lbgfs2020/misc/iris_ex03.csv"
You can download a file in csv-format from the course website. The URL is `r siris_url`. Read the data from that csv-file into R using the function `read.csv2()`. Test the consequences of specifying the option `stringsAsFactors=TRUE`. The function `read_csv2()` from the `readr` package is an alternative way to import data from a .csv-file. The result is a little different: while the function `read.csv2()` returns an ordinary 'data.frame', the function `read_csv2()` returns a 'tibble', which is a more modern version of a 'data.frame'.
Hints:

- Get help on the function by typing `?read.csv2` at the R-console.
- Assign the result of `read.csv2()` to a variable.
- Call `str()` on the result of `read.csv2()` to see the difference between the two results of reading the data.
- Use `read_csv2()` to import the data and inspect the difference between a 'data.frame' and a 'tibble'.

bOnline <- FALSE
sDataFn <- "iris_ex03.csv"
# create a local copy of the data if working offline and the file is missing
if (!bOnline && !file.exists(sDataFn)) write.csv2(iris, file = sDataFn, row.names = FALSE)
dfIris1 <- read.csv2(file = "https://charlotte-ngs.github.io/lbgfs2020/misc/iris_ex03.csv")
str(dfIris1)
dfIris2 <- read.csv2(file = "https://charlotte-ngs.github.io/lbgfs2020/misc/iris_ex03.csv", stringsAsFactors = TRUE)
str(dfIris2)
dfIris1 <- read.csv2(file = "iris_ex03.csv")
str(dfIris1)
dfIris2 <- read.csv2(file = "iris_ex03.csv", stringsAsFactors = TRUE)
str(dfIris2)
s_iris_file <- "iris_ex03.csv"
if (bOnline) s_iris_file <- "https://charlotte-ngs.github.io/lbgfs2020/misc/iris_ex03.csv"
tblIris1 <- readr::read_csv2(file = s_iris_file)
str(tblIris1)
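The effect of `stringsAsFactors` can also be tried without network access. The sketch below assumes that the course file has the same layout as the built-in `iris` data set written with `write.csv2()`; the temporary file and the variable names are illustrative.

```r
# Write the built-in iris data to a temporary csv2 file
# (semicolon as field separator, comma as decimal mark)
tmp_file <- tempfile(fileext = ".csv")
write.csv2(iris, file = tmp_file, row.names = FALSE)

# Read the same file once with and once without conversion to factors
df_chr <- read.csv2(file = tmp_file, stringsAsFactors = FALSE)
df_fct <- read.csv2(file = tmp_file, stringsAsFactors = TRUE)

# The text column Species is a character vector in one case, a factor in the other
print(class(df_chr$Species))
print(class(df_fct$Species))
print(levels(df_fct$Species))

# Remove the temporary file again
unlink(tmp_file)
```

Since R 4.0.0 the default of `stringsAsFactors` in `read.csv2()` is `FALSE`, so on a current R installation a call without the option behaves like the first variant.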
Plot the values in the columns `Sepal.Length` and `Petal.Length` of the Iris data set. The plot should look like the following figure.
bIsSolution <- TRUE
if (!bIsSolution) plot(dfIris2$Sepal.Length, dfIris2$Petal.Length)
The above plot was produced using the standard plotting function of the base-R system. The R-package `ggplot2` provides an interesting alternative to the basic plotting function. A plot with `ggplot2` looks as follows.
bIsSolution <- TRUE
if (!bIsSolution) ggplot2::qplot(`Sepal.Length`, `Petal.Length`, data = dfIris2)
The plot with the base-R plotting function is produced with the following command.
plot(dfIris2$Sepal.Length, dfIris2$Petal.Length)
\pagebreak
The plot using `ggplot2` functionality is created with the following statement.
ggplot2::qplot(`Sepal.Length`, `Petal.Length`, data = tblIris1)
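`qplot()` has been deprecated since ggplot2 version 3.4.0. An equivalent scatter plot can be built with `ggplot()` and `geom_point()`. The following sketch uses the built-in `iris` data instead of the imported tibble so that it runs on its own, and it only builds the plot if the package `ggplot2` is installed; the object name `p_iris` is illustrative.

```r
p_iris <- NULL
if (requireNamespace("ggplot2", quietly = TRUE)) {
  # Map Sepal.Length to the x-axis and Petal.Length to the y-axis,
  # then draw one point per observation
  p_iris <- ggplot2::ggplot(iris,
                            ggplot2::aes(x = Sepal.Length, y = Petal.Length)) +
    ggplot2::geom_point()
}
```

Printing the object, for example with `print(p_iris)`, draws the plot on the active graphics device.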