Nothing
#' Genetic group pedigree and data simulation
#'
#' Simulates a pedigree and phenotype for a focal population receiving
#' immigrants. Genetic and environmental differences can be specified between
#' the focal and immigrant populations. Further, these differences can have
#' temporal trends.
#'
#' Offspring total additive genetic values \code{u} are the average of their
#' parents \code{u} plus a Mendelian sampling deviation drawn from a normal
#' distribution with mean of 0 and variance equal to \eqn{0.5V_{A} (1 -
#' f_{sd})}{0.5V[A] (1 - f[sd])} where \eqn{V_A}{V[A]} is \code{VAf} and
#' \eqn{f_{sd}}{f[sd]} is the average of the parents' coefficient of inbreeding
#' \emph{f} (p. 447 Verrier et al. 1993). Each \sQuote{immigrant} (individual
#' with unknown parents in generations >1) is given a total additive genetic
#' effect that is drawn from a normal distribution with mean of \code{mui} and
#' variance equal to \code{VAi}. Residual deviations are sampled for
#' \sQuote{focal} and \sQuote{immigrant} populations separately, using normal
#' distributions with means of \code{murf} and \code{muri}, respectively, and
#' variances of \code{VRf} and \code{VRi}, respectively. Phenotypes are the sum
#' of total additive genetic effects and residual deviations plus an overall
#' mean \code{mup}.
#'
#' Trends in total additive genetic effects and/or residual deviations can be
#' specified for both the focal and immigrant populations. Trends in total
#' additive genetic effects occurring in the immigrants, in the residual
#' deviations occurring in the focal population, and in the residual deviations
#' occurring in the immigrants are produced by altering the mean each
#' generation for the separate distribution from which these effects are each
#' drawn. The change in mean over a generation is specified in units of
#' standard deviations of the respective distributions (e.g., square roots of
#' \code{VAi}, \code{VRf}, and \code{VRi}) and is set with \code{d_bvi},
#' \code{d_rf}, or \code{d_ri}, respectively.
#'
#' Trends in total additive genetic effects for the focal population are
#' produced by selecting individuals to be parents of the next generation
#' according to their \emph{predicted} total additive genetic effects.
#' Individuals are assigned probabilities of being selected as a parent of the
#' next generation depending on how closely their predicted total additive
#' genetic effect matches an optimum value. Probabilities are assigned:
#' \deqn{exp((\frac{-1}{2\sigma_{x}}) (x - \theta)^{2})}{exp((-1/(2 * sd(x))) *
#' (x - theta)^2)} where \code{x} is the vector of predicted total additive
#' genetic effects (\code{u}), \eqn{\sigma_{x}}{sd(x)} is the standard
#' deviation of \code{x}, and \eqn{\theta}{theta} is the optimum value.
#' Sampling is conducted with replacement based on these probabilities.
#'
#' The parameter \code{d_bvf} specifies how much the optimal total additive
#' genetic effect changes per generation. The optimal total additive genetic
#' effect in a given generation is calculated as: \code{muf + d_bvf
#' *}\code{sqrt(VAf) * (i-2)}. Individuals with predicted total additive
#' genetic effects closest to this optimum have a higher probability of being
#' randomly sampled to be parents of the next generation. This represents
#' selection directly on predicted total additive genetic effects.
#'
#' Total additive genetic effects are predicted for the first generation of
#' focal individuals and all immigrants using equation 1.3 in Mrode (2005,
#' p.3): \eqn{h^{2} * (phenotype_{i} - mean population phenotype)}. The
#' heritability is either \code{VAf} / (\code{VAf + VRf}) or \code{VAi} /
#' (\code{VAi + VRi}). Total additive genetic effects are predicted for all
#' other individuals using equation 1.9 in Mrode (2005, p. 10) - or as the
#' average of each individual's parents' predicted total additive genetic
#' effects.
#'
#' @param K Integer number of individuals per generation, or the focal
#' population carrying capacity
#' @param pairs Integer number of mating pairs created by sampling with
#' replacement from adults of a given generation
#' @param noff Integer number of offspring each pair contributes to the next
#' generation
#' @param g Integer number of (non-overlapping) generations to simulate
#' @param nimm Integer number of immigrants added to the population each
#' generation of migration
#' @param nimmG Sequence of integers for the generations in which immigrants
#' arrive in the focal population
#' @param VAf Numeric value for the expected additive genetic variance in the
#' first generation of the focal population - the founders
#' @param VAi Numeric value for the expected additive genetic variance in each
#' generation of immigrants
#' @param VRf Numeric value for the expected residual variance in the focal
#' population
#' @param VRi Numeric value for the expected residual variance in each
#' generation of the immigrants
#' @param mup Numeric value for the expected mean phenotypic value in the first
#' generation of the focal population - the founders
#' @param muf Numeric value for the expected mean breeding value in the first
#' generation of the focal population - the founders
#' @param mui Numeric value for the expected mean breeding value for the
#' immigrants
#' @param murf Numeric value for the expected mean residual (environmental)
#' deviation in the first generation of the focal population - the founders
#' @param muri Numeric value for the expected mean residual (environmental)
#' deviation for the immigrants
#' @param d_bvf Numeric value for the expected change between generations in
#' the mean breeding value of the focal population. Sets the rate of genetic
#' selection occurring across generations
#' @param d_bvi Numeric value for the expected change between generations in
#' the mean breeding value of the immigrant population each generation
#' @param d_rf Numeric value for the expected change between generations in the
#' mean residual (environmental) deviation of the focal population each
#' generation
#' @param d_ri Numeric value for the expected change between generations in the
#' mean residual (environmental) deviation of the immigrant population each
#' generation
#'
#' @return A \code{data.frame} with columns corresponding to:
#' \describe{
#' \item{id }{Integer for each individual's unique identifying code}
#' \item{dam }{Integer indicating each individual's dam}
#' \item{sire }{Integer indicating each individual's sire}
#' \item{parAvgU }{Numeric value for the average of each individual's dam
#' and sire additive genetic effects}
#' \item{mendel }{Numeric value for each individual's Mendelian sampling
#' deviate from the mid-parental total additive genetic value}
#' \item{u }{Numeric value of each individual's total additive genetic
#' effect}
#' \item{r }{Numeric value of each individual's residual (environmental)
#' deviation}
#' \item{p }{Numeric value of each individual's phenotypic value}
#' \item{pred.u }{Numeric value of each individual's predicted total
#' additive genetic effect}
#' \item{is }{Integer of either \code{0} if an individual was born in the
#' focal population or \code{1} if they were born in an immigrant
#' population}
#' \item{gen }{Integer value of the generation in which each individual was
#' born}
#' }
#' @author \email{matthewwolak@@gmail.com}
#' @seealso \code{\link{ggTutorial}}
#' @references Verrier, V., J.J. Colleau, and J.L. Foulley. 1993. Long-term
#' effects of selection based on the animal model BLUP in a finite population.
#' Theoretical and Applied Genetics. 87:446-454.
#'
#' Mrode, R.A. 2005. Linear Models for the Prediction of Animal Breeding
#' Values, 2nd ed. Cambridge, MA: CABI Publishing.
#' @examples
#'
#' \dontrun{
#' # The dataset 'ggTutorial' was simulated as:
#' set.seed(102) # seed used to simulate ggTutorial
#' ggTutorial <- simGG(K = 400, pairs = 200, noff = 4, g = 15,
#' nimm = 40,
#' muf = 0, mui = 3)
#' }
#'
#' # Use genetic group methods to approximate the breeding values for ggTutorial
#' ## First, construct a pedigree with genetic groups
#' ggPed <- ggTutorial[, c("id", "dam", "sire", "is", "gen")]
#' naPar <- which(is.na(ggPed[, 2]))
#' ggPed$GG <- rep("NA", nrow(ggPed))
#' # 'focal' population genetic group = "foc0" and 'immigrant' = "g1"
#' # obtained by pasting "foc" & "g" with immigrant status "0" or "1", respectively
#' ggPed$GG[naPar] <- as.character(ggPed$is[naPar])
#' ggPed$GG[ggPed$GG == "0"] <- paste0("foc", ggPed$GG[ggPed$GG == "0"])
#' ggPed$GG[ggPed$GG == "1"] <- paste0("g", ggPed$GG[ggPed$GG == "1"])
#' ggPed[naPar, 2:3] <- ggPed[naPar, "GG"]
#'
#' ## Now create the Q matrix
#' Q <- ggcontrib(ggPed[, 1:3], ggroups = c("foc0", "g1"))
#'
#' ## obtain the true values of the genetic group means
#' foc0_mean <- mean(ggTutorial$u[which(ggTutorial$gen == 1 & ggTutorial$is == 0)])
#' g1_mean <- mean(ggTutorial$u[which(ggTutorial$is == 1)])
#' g_exp <- matrix(c(foc0_mean, g1_mean), ncol = 1)
#'
#' ## breeding values (a) are:
#' ### tot. add. gen. effects (u) minus genetic group effects for each individual (Qg):
#' a <- ggTutorial$u - Q %*% g_exp
#'
#'
#' @export
simGG <- function(K, pairs, noff, g,
nimm = 2, nimmG = seq(2, g-1, 1),
VAf = 1, VAi = 1, VRf = 1, VRi = 1,
mup = 20, muf = 0, mui = 0, murf = 0, muri = 0,
d_bvf = 0, d_bvi = 0, d_rf = 0, d_ri = 0){
if(pairs*2 > K) stop("pairs must be less than half of K")
if(nimmG[1] == 1) stop("immigrants cannot arrive in the first generation")
N <- pairs*noff
da <- array(NA, dim = c(K, 16, g))
dimnames(da) <- list(NULL,
c("id", "dam", "sire", "parAvgG", "parAvgA_f", "parAvgA_i", "mendel_f", "mendel_i", "a_f", "a_i", "r", "p", "pred.u", "is", "gen", "q_imm"), seq(g))
da[, "id", ] <- seq(K*g)
# Assume last nimm rows in each generation are the immigrants:
## 1=immigrant & 0=NOT immigrant
## initially only migrants have non-zero immigrant genetic group contribution (`q_imm`)
da[, c("is", "q_imm"), ] <- 0
da[(K-nimm+1):K, c("is", "q_imm"), nimmG] <- 1
da[, "gen", ] <- rep(seq(g), each = K)
# All individuals need these values=0 initially for math later down
#XXX DELETEME da[, c("a_f", "a_i"), ] <- 0
# Assign genetic group effects to individuals with phantom parents
da[, "parAvgG", 1] <- muf
da[(K-nimm+1):K, "parAvgG", nimmG] <- mui
# Create standard normals for generation 1
## No immigrants in generation 1
da[, "a_f", 1] <- rnorm(K, 0, sqrt(VAf))
da[, "r", 1] <- rnorm(K, murf, sqrt(VRf))
# Calculate phenotypes for generation 1
da[, "p", 1] <- mup + rowSums(da[, c("parAvgG", "a_f", "r"), 1])
# Calculate predicted total additive genetic effects for generation 1
## equation 1.3 in Mrode (2005, p. 3)
da[, "pred.u", 1] <- (VAf / (VAf + VRf)) * (da[, "p", 1] - mean(da[, "p", 1]))
################# Loop through generations ##############################
# Mating
for(i in 2:g){
if(i %in% nimmG) Knimm <- K-nimm else Knimm <- K
###### Determine parents of next generation ###########
# Parents ranked: how close pred.u is to the next generation's mean u
## Define function to do this
prFun <- function(x, theta = 0){
exp((-1 / (2*sd(x))) * (x - theta)^2)
}
### End function definition
# Arbitrarily choose sexes (for assignment as female=0 or male=1 parent)
sexvec <- vector("integer", length = K)
sexvec[sample.int(K, size = K*0.5, replace = FALSE)] <- 1
if(d_bvf == 0){
prin0 <- prin1 <- NULL
} else{
prin0 <- prFun(da[sexvec == 0, "pred.u", i-1],
muf + d_bvf*sqrt(VAf)*(i-2)) # no selection in first generation hence i-2
prin1 <- prFun(da[sexvec == 1, "pred.u", i-1],
muf + d_bvf*sqrt(VAf)*(i-2)) # no selection in first generation hence i-2
}
# Sampling WITH replacement to assign parents
pool0 <- sample(x = da[sexvec == 0, "id", i-1], size = pairs, replace = TRUE,
prob = prin0)
pool1 <- sample(x = da[sexvec == 1, "id", i-1], size = pairs, replace = TRUE,
prob = prin1)
iOff <- matrix(rep(c(pool0, pool1), each = noff), ncol = 2)
da[1:Knimm, c("dam", "sire"), i] <- iOff[sort(sample(seq(nrow(iOff)),
size = Knimm, replace = FALSE)), ]
###### Inheritance of genetic values ###########
# Average of parent genetic group effects (g)
## Should be equivalent to Q[i, ] %*% g OR summation of q_ij * g_j
da[1:Knimm, "parAvgG", i] <- rowMeans(matrix(da[match(da[1:Knimm, c("dam", "sire"), i],
da[, "id", i-1]), "parAvgG", i-1], ncol = 2, byrow = FALSE))
# Average of parent immigrant genetic group contributions (`q_imm`)
da[1:Knimm, "q_imm", i] <- rowMeans(matrix(da[match(da[1:Knimm, c("dam", "sire"), i],
da[, "id", i-1]), "q_imm", i-1], ncol = 2, byrow = FALSE))
# Average of parent founder breeding values (a_f)
### parAvgA_f = 0.5 * (a_f_dam + a_f_sire)
### Based on T (not group-specific T_j) matrix; see eqn. 12, Muff et al. 2019
da[1:Knimm, "parAvgA_f", i] <- .rowMeans(matrix(da[match(da[1:Knimm, c("dam", "sire"), i],
da[, "id", i-1]), "a_f", i-1], ncol = 2, byrow = FALSE),
m = Knimm, n = 2, na.rm = TRUE)
# Average of parent immigrant breeding values (a_i)
### parAvgA_i = 0.5 * (a_i_dam + a_i_sire)
### Based on T (not group-specific T_j) matrix; see eqn. 12, Muff et al. 2019
da[1:Knimm, "parAvgA_i", i] <- .rowMeans(matrix(da[match(da[1:Knimm, c("dam", "sire"), i],
da[, "id", i-1]), "a_i", i-1], ncol = 2, byrow = FALSE),
m = Knimm, n = 2, na.rm = TRUE)
# Assign Mendelian sampling variation
# Within-family additive genetic variance
## p. 447, second eqn. in Verrier, Colleau, & Foulley. 1993. Theor. Appl. Genetics
# Genetic group specific VAs - mean group-specific sampling variances
## Splitting breeding values/Mendelian sampling deviations
#### Muff et al. 2019 (eqn 10) gives proportion of group VA for distribution of
##### group-specific Mendelian sampling deviations
## Note, immigration not allowed in i=1
if(i == 2){ #<-- all ancestors from same genetic group: don't split BVs
da[1:Knimm, "mendel_f", i] <- rnorm(Knimm, 0, sqrt(0.5 * VAf))
} else{
tmpPed <- prunePed(data.frame(apply(da[, 1:3, 1:i], MARGIN = 2,
FUN = function(x){x})),
as.character(unique(c(da[1:Knimm, "id", i]))))
igen_dii <- makeDiiF(tmpPed)$D@x[match(as.character(c(da[1:Knimm, "id", i])),
tmpPed[, 1])]
## Split Mendelian sampling deviations (approximation: Muff et al. eqn 10)
### Introduces q_j^2, because T (not T_j) matrix used for parental average breeding values
#### Thus, d_ii below are actually from D_tilda_j (Muff et al. 2019 eqn. 12)
###TODO use group-specific inbreeding coefficients (Muff et al. eqn 11)
da[1:Knimm, "mendel_f", i] <- rnorm(Knimm, 0,
## q_i,founder^2 * 1 - q_i,founder * (1 - d_ii)
sqrt((1-da[1:Knimm, "q_imm", i])^2 * (1 - (1-da[1:Knimm, "q_imm", i]) * (1 - igen_dii)) * VAf))
### check: assign mendel_f=0 for q_imm==1 (i.e., 0 founder alleles)
da[1:Knimm, "mendel_f", i][which(da[1:Knimm, "q_imm", i] == 1)] <- 0
da[1:Knimm, "mendel_i", i] <- rnorm(Knimm, 0,
## q_i,founder^2 * 1 - q_i,immigrant * (1 - d_ii)
sqrt(da[1:Knimm, "q_imm", i]^2 * (1 - da[1:Knimm, "q_imm", i] * (1 - igen_dii)) * VAi))
### check: assign mendel_i=0 for q_imm==0 (i.e., 0 immigrant alleles)
da[1:Knimm, "mendel_i", i][which(da[1:Knimm, "q_imm", i] == 0)] <- 0
}
# breeding values: ai_j = 0.5(as_j + ad_j) + mi_j
da[1:Knimm, "a_f", i] <- .rowSums(da[1:Knimm, c("parAvgA_f", "mendel_f"), i],
m = Knimm, n = 2, na.rm = TRUE)
da[1:Knimm, "a_i", i] <- .rowSums(da[1:Knimm, c("parAvgA_i", "mendel_i"), i],
m = Knimm, n = 2, na.rm = TRUE)
# Calculate predicted total additive genetic effects (predicted u)
## average of parents' predicted total additive genetic effects
## equation 1.9 in Mrode (2005, p. 10)
da[1:Knimm, "pred.u", i] <- rowMeans(matrix(da[match(da[1:Knimm, c("dam", "sire"), i],
da[, "id", i-1]), "pred.u", i-1], ncol = 2, byrow = FALSE))
# Residual deviations
da[1:Knimm, "r", i] <- rnorm(Knimm, murf + d_rf*sqrt(VRf)*(i-1), sqrt(VRf))
#############
# Immigrants: assumed outbred
if(i %in% nimmG){
# no trend in first generation (hence i-2)
da[(Knimm+1):K, "a_i", i] <- rnorm(nimm, d_bvi*sqrt(VAi)*(i-2), sqrt(VAi))
# no trend in first generation (hence i-2)
da[(Knimm+1):K, "r", i] <- rnorm(nimm, muri + d_ri*sqrt(VRi)*(i-2), sqrt(VRi))
}
#############
# All individuals in generation 'i'
# Calculate phenotypes
da[, "p", i] <- mup + .rowSums(da[, c("parAvgG", "a_f", "a_i", "r"), i],
m = K, n = 4, na.rm = TRUE)
# Calculate predicted total additive genetic effects for immigrants
## equation 1.3 in Mrode (2005, p. 3)
if(i %in% nimmG){
da[(Knimm+1):K, "pred.u", i] <- (VAi / (VAi + VRi)) * (da[(Knimm+1):K, "p", i] - mean(da[(Knimm+1):K, "p", i]))
}
} # End for i loop through generations
# create a data.frame out of the array
df <- data.frame(apply(da, MARGIN = 2, FUN = function(x){x}))
df
}
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.