The pbsim
package includes a suite of functions for simulating plant breeding programs. The motivation behind this package was to develop a set of tools that would make it simple to simulate plent breeding programs in order to test various methods or procedures. The package is on the simpler side of population genetics simulation tools, however it is nonetheless useful for quick experimentation.
The first step of using the package is to load it using the library()
function:
library(pbsim)
The first step is to make a geneome object. This is essentially a list
of class genome
. The object contains information such as number and length ofchromosomes, marker positions, and genetic architecture.
Simulate a genome with three chromosomes, each with 505 SNPs.
n.mar <- rep(505, 3) len <- c(120, 130, 140) genome <- sim_genome(len = len, n.mar = n.mar)
The output is a genome
object. A generic summary
function can be used to summarize the genome
object.
summary(genome)
Genomes can also be simulated to
The genetic model of a quantitative trait can be specified using the sim_gen_model()
function. The function is fairly flexible, allowing for custom QTL models including genome location, additive effects, and dominance effects. Alternatively, the location and effects of QTL can be drawn from random distributions.
For instance, you can customize the QTL model by providing a matrix of that information.
chromosome <- c(1, 1, 2, 2, 3, 3) pos <- as.numeric(sapply(X = genome$len, FUN = runif, n = 2, min = 0)) a <- c(1, 0.25, 0.5, 0.25, 0.25, 0.5) d <- 0 qtl_model <- cbind(chromosome, pos, a, d) genome <- sim_gen_model(genome, qtl_model)
The returned value from sim_gen_model()
is a genome
object with information on the genetic model. You can used the same summary()
function to view a summary of the genetic model
summary(genome)
The genetic model can be randomly generated by passing an empty matrix. Here we simulate $L = 15$ QTL.
We pass the add.dist = "geometric"
argument to generate additive allelic effects of QTL from a geometric series. As suggested by @Lande1990, the additive effect of QTL $k$ in $k = 1, 2, ..., L$ is $a^k$, where $a = \frac{1 - L}{1 + L}$.
When generating QTL randomly, we have the option of drawing SNPs in the genome to become QTL or generating new SNPs to become QTL. This is determined by the de.novo
argument.
A Note on the
de.novo
argument: Ifde.novo = FALSE
, an additional argument (max.qtl
) must also be passed. Often in simulations, one is interested in different genetic architectures (i.e. different numbers of QTL). When using SNPs already in the genome to simulate QTL, this will result in different numbers of markers leftover. For example, if you are testing $L = 50$ QTL and $L = 100$ QTL, drawing from SNPs already in the genome, with $m = 500$ SNPs total, you will have $m - L = 450$ markers leftover in the first case and $m - L = 400$ markers leftover in the second case. If your downstream simulation involves, say, QTL mapping or genomic prediction, this uneven number of markers could confound the results. To avoid this, themax.qtl
arguement sets $max.qtl - L$ QTL to have zero additive or dominance effects, resulting in the same number of markers leftover.
# Simulate 15 QTL drawn from SNPs already in the genome n_qtl <- 15 # Empty matrix qtl_model <- matrix(nrow = n_qtl, ncol = 4) # Set a total of max.qtl = 20 QTL in the genome genome <- sim_gen_model(genome = genome, qtl.model = qtl_model, add.dist = "geometric", de.novo = FALSE, max.qtl = 20) summary(genome) # Simulate 15 QTL that are newly generated in the genome genome <- sim_gen_model(genome = genome, qtl.model = qtl_model, add.dist = "geometric", de.novo = TRUE) summary(genome)
Notice that the number of markers is reduced in the first case, but not in the second.
Alternatively, the argument add.dist = "normal"
can be passed to draw additive effects from a normal distribution.
By default, dominance is assumed absent. However, dominance effects can be specified in the same way that additive effects were specified above. Additionally dominance effects can be generated from a normal distribution by passing the dom.dist = "normal"
argument
n_qtl <- 15 # Empty matrix qtl_model <- matrix(nrow = n_qtl, ncol = 4) genome <- sim_gen_model(genome = genome, qtl.model = qtl_model, add.dist = "geometric", dom.dist = "normal", de.novo = TRUE)
The genetic architecture of multiple traits can be simulated by
ped <- sim_pedigree(n.ind = 30, n.bcgen = 0, n.selfgen = 2, n.par = 2)
# First generate founders for a bi-parental population founder_geno <- sim_founders(genome) fam <- sim_family(genome, ped, founder_geno)
# First create a crossing block
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.