knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
library(predCrossVar)

Intro

The predCrossVar package contains a complete set of functions for the prediction of additive and dominance genetic variances and co-variances among full-siblings based on parents, following [@lehermeier2017; @lehermeier2017a]. predCrossVar enables the prediction of genetic variance on a multi-trait selection index. Built for diploid organisms with phased, chromosome- or linkage-group ordered biallelic marker data, and a centimorgan-scale genetic map.

Predicting genetic variance

Summarize formula for predicting genetic variance in crosses. Thanks to those whose work I have learned from and built upon.

Notice: Package dev. status

At this early development stage, package is unforgiving. There are not many arguments with defaults. Input formats are pretty specific, but well defined.

Especially in the case of many markers and using parallel processing, memory consumption can be potentially very large and needs to be considered. Hopefully there is sig. room for improvement in this area.

Example data

I simulated 100 diploid individuals with 2 chromosomes, 10 QTL each, using the AlphaSimR library. Two correlated traits with additive and dominance effects. Main use is to exemplify the input formats and usage of predCrossVar functions.

Inputs

To predict cross variance, 3 types of input are required:

  1. Phased Parental Haplotypes
  2. Genetic Map (or pairwise-recombination frequencies)
  3. Marker Effects

We start with the included example simulation data. 100 individuals, 2 chromosomes with 10 QTL/chr.

Genetic map

The usual starting place for recombination frequency is a centimorgan-scale genetic map.

data(genmap)
str(genmap)
# utility function to compute matrix of pairwise recombination frequencies. 
recombFreqMat<-genmap2recombfreq(genmap,nChr = 2)
dim(recombFreqMat)

In practice, these matrices get big as they have dimension $N_{snp} \times N_{snp}$. It is recommended to precompute and store the recombFreqMat.

In predicting genetic variance in the $F_1$ cross, the recombFreqMat the matrix $1-(2 \times recombFreqMat)$ is computed. To avoid having to do that separately for each family being predicted, I designed the prediction functions to require $1-(2 \times recombFreqMat)$ as input, to save a great deal of computation. Therefore pre-compute it and store that on disk. Probably this can be improved on!

recombFreqMat<-1-(2*recombFreqMat)
recombFreqMat[1:5,1:5]

Phased parental haplotypes

The example dataset haplos contains 2 chrom. with 10 loci each (columns) for 100 individuals (200 rows).

data("haploMat")
dim(haploMat)
haploMat[1:5,1:10]

Marker effects

The example data come in the form of an list of row-matrices, which is useful for multivariate analyses we will get into later. There is also an example DomEffectsList available to load later.

data("AddEffectsList")
str(AddEffectsList)

Single-trait predictions

The first set of functions we will use are simplest. They only handle single traits. No covariance predictions.

Predict one cross

To start, predict a single cross between parents "1" and "2"

# For the univariate case, input effects should be a column matrix.
addEffectsT1<-t(AddEffectsList$Trait1)
addEffectsT2<-t(AddEffectsList$Trait2)
dim(addEffectsT1)
# Trait 1
predCrossVarA(sireID = "1", damID = "2", 
              addEffects = addEffectsT1,
              haploMat = haploMat,
              recombFreqMat = recombFreqMat)
# Trait 2
predCrossVarA(sireID = "1", damID = "2", 
              addEffects = addEffectsT2,
              haploMat = haploMat,
              recombFreqMat = recombFreqMat)

Predict several crosses

Multiple families can be predicted, with an option for parallel processing across families with the runCrossVarPredsA() (additive-only) and runCrossVarPredsAD() (additive-plus-dominance) functions.

For example, we will pick three parents of interest.

There is a helper function crosses2predict() to make a list of all pairwise matings given an input vecot of parent IDs. Includes self-crosses and the upper-triangle of a mating matrix (i.e. no reciprocal crosses).

ped2predict<-crosses2predict(parents = c("1","2","11"))
ped2predict
# Trait 1 - serial (DEFAULT, ncores = 1)
runCrossVarPredsA(ped = ped2predict,
                  addEffects = addEffectsT1,
                  haploMat = haploMat,
                  recombFreqMat = recombFreqMat)

Multi-core across families

# Trait 1 - parallel (ncores = 6)
runCrossVarPredsA(ped = ped2predict,
                  addEffects = addEffectsT1,
                  haploMat = haploMat,
                  recombFreqMat = recombFreqMat,
                  ncores = 6)
# Trait 2 - parallel (ncores = 6)
runCrossVarPredsA(ped = ped2predict,
                  addEffects = addEffectsT2,
                  haploMat = haploMat,
                  recombFreqMat = recombFreqMat,
                  ncores = 6)

Additive-plus-dominance

data("DomEffectsList")
domEffectsT1<-t(DomEffectsList$Trait1)
domEffectsT2<-t(DomEffectsList$Trait2)
# Trait 1
runCrossVarPredsAD(ped = ped2predict,
                  addEffects = addEffectsT1,
                  domEffects = domEffectsT1,
                  haploMat = haploMat,
                  recombFreqMat = recombFreqMat,
                  ncores = 6)
# Trait 2
runCrossVarPredsAD(ped = ped2predict,
                  addEffects = addEffectsT1,
                  domEffects = domEffectsT1,
                  haploMat = haploMat,
                  recombFreqMat = recombFreqMat,
                  ncores = 6)

Multiple-trait predictions

(including covariances)

predOneCrossVarA() -> predCrossVarsA() -> runMtCrossVarPredsA()

runMtCrossVarPredsA() and runMtCrossVarPredsAD() should supersede in function all the others.

NOTES:

mtpred<-runMtCrossVarPredsAD(CrossesToPredict = ped2predict,
                             AddEffectList = AddEffectsList,
                             DomEffectList = DomEffectsList,
                             haploMat = haploMat,
                             recombFreqMat = recombFreqMat, 
                             ncores = 6)

The results are a nested list structure. Let's unpack them...

library(tidyverse); library(magrittr)
mtpred

The list element "varcovars"

mtpred$varcovars %>% 
  tidyr::unnest_wider(varcomps)

Unpack varcovars to get at the variance and covariance predictions for additive and dominances.

Variances are in column VPM. Output is in long-form.

mtpred$varcovars %>% 
  tidyr::unnest_wider(varcomps) %>% 
  dplyr::select(-totcomputetime) %>% 
  tidyr::unnest(predictedfamvars) %>% 
  tidyr::unnest(predVars) %>% head

Cut out the extra columns and spread the additive-vs-dominance across columns, just to tidy up

mtpred$varcovars %>% 
  tidyr::unnest_wider(varcomps) %>% 
  dplyr::select(-totcomputetime) %>% 
  tidyr::unnest(predictedfamvars) %>% 
  tidyr::unnest(predVars) %>% 
  dplyr::select(-PMV,-Nsegsnps,-totcomputetime) %>% 
  spread(VarComp,VPM) %>% rmarkdown::paged_table()

Note that variance results Trait1==Trait2 should match the single-trait results and co-variance predictions are now included.

Lastly, let's quickly verify that the additive-only function works and that the runMt___ functions won't errory if we give just one cross / trait.

mtpred_A<-runMtCrossVarPredsA(CrossesToPredict = ped2predict,
                              AlleleSubEffectList = AddEffectsList,
                              haploMat = haploMat,
                              recombFreqMat = recombFreqMat, 
                              ncores = 6)
mtpred_A$varcovars %>% 
  tidyr::unnest_wider(varcomps) %>% 
  dplyr::select(-totcomputetime) %>% 
  tidyr::unnest(predictedfamvars) %>% 
  tidyr::unnest(predVars) %>% 
  rmarkdown::paged_table()
st_addefflist<-AddEffectsList
st_addefflist$Trait2<-NULL
stpred_sc<-runMtCrossVarPredsA(CrossesToPredict = ped2predict[1,],
                               AlleleSubEffectList = st_addefflist,
                               haploMat = haploMat,
                               recombFreqMat = recombFreqMat,
                               ncores = 6)
stpred_sc$varcovars %>% 
  tidyr::unnest_wider(varcomps) %>% 
  dplyr::select(-totcomputetime) %>% 
  tidyr::unnest(predictedfamvars) %>% 
  tidyr::unnest(predVars) %>% 
  rmarkdown::paged_table()

A note on marker effects input

data("AddEffectsList")
str(AddEffectsList)

References



wolfemd/predCrossVar documentation built on Dec. 21, 2020, 10:14 a.m.