runCrossVal | R Documentation |
Assess the accuracy of predictions for previously unobserved genotypes (individuals) based on the available training data. Runs k-fold cross-validation for potentially multiple traits and optionally computes prediction accuracy on a user-specified selection index. Three models are enabled: additive-only ("A"), additive-plus-dominance ("AD") and a directional-dominance model that incorporates a genome-wide homozygosity effect ("DirDom"). The union of all genotypes scored for all traits is split into k folds a user-specified number of times. Each train-test pair is then predicted for each trait, and accuracies are computed.
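The repeated k-fold partitioning described above can be sketched in base R. This is an illustration only, not the package's internal code: `make_folds` is a hypothetical helper, and the way folds are balanced here (shuffle, then deal IDs round-robin) is an assumption.

```r
# Hypothetical helper (not part of the package): split genotype IDs into
# nfolds folds, repeated nrepeats times, giving one entry per train-test pair.
make_folds <- function(gids, nrepeats, nfolds, seed = NULL) {
  # Setting the seed makes the random fold assignment reproducible
  if (!is.null(seed)) set.seed(seed)
  out <- list()
  for (r in seq_len(nrepeats)) {
    # Shuffle the IDs, then deal them into nfolds groups of near-equal size
    shuffled <- sample(gids)
    fold_id <- rep(seq_len(nfolds), length.out = length(shuffled))
    for (f in seq_len(nfolds)) {
      out[[length(out) + 1]] <- list(
        rep   = r,
        fold  = f,
        test  = shuffled[fold_id == f],
        train = shuffled[fold_id != f]
      )
    }
  }
  out
}

gids  <- paste0("G", 1:20)
folds <- make_folds(gids, nrepeats = 2, nfolds = 5, seed = 42)
length(folds)  # one element per repeat-fold combination
```

Each genotype appears in exactly one test set per repeat, so every individual is predicted once per repeat while never contributing to its own training set.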
runCrossVal( blups, modelType, selInd, SIwts = NULL, grms, dosages = NULL, nrepeats, nfolds, ncores = 1, nBLASthreads = NULL, gid = "GID", seed = NULL, ... )
blups |
nested data.frame with list-column "TrainingData" containing BLUPs. Each element of the "TrainingData" list is a data.frame with de-regressed BLUPs, BLUPs and weights (WT) for training and test. |
modelType |
string, one of "A", "AD" or "DirDom". modelType="A": additive-only, predicts GEBVs. modelType="AD": the "classic" add-dom model, GEBVs + GEDDs = GETGVs. modelType="DirDom": the "genotypic" add-dom model with proportion homozygous fit as a fixed effect to estimate a genome-wide inbreeding effect; obtains add-dom effects, computes allele substitution effects (α = a + d(q-p)) and incorporates them into the GEBV and GETGV. "DirDom" requires dosages. |
selInd |
logical, TRUE/FALSE, selection index accuracy estimates,
requires input weights via |
SIwts |
required if |
grms |
list of GRMs where each element is named either A, D, or AD. Matrices supplied must match those required by the A, AD and ADE models. For ADE, grms=list(A=A,D=D). |
dosages |
dosage matrix. Required only for modelType=="DirDom". Assumes SNPs coded 0, 1, 2. Numeric matrix with Nind rows x Nsnp columns, with rownames and colnames giving the individual and SNP IDs. |
nrepeats |
number of repeats |
nfolds |
number of folds |
ncores |
number of cores, parallelizes across repeat-folds |
nBLASthreads |
number of cores for each worker to use for multi-thread BLAS |
gid |
string, variable name used for genotype IDs in e.g. |
seed |
numeric, use seed to achieve reproducible train-test folds. |
... |
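As a rough illustration of the inputs described above, the sketch below builds a toy dosage matrix in the documented layout, computes the proportion of homozygous loci per individual, and evaluates the allele substitution effect α = a + d(q-p). All values are simulated; the effect vectors `a` and `d`, the choice of p as the frequency of the counted allele, and the identity matrices standing in for real GRMs are assumptions made for the example only.

```r
set.seed(1)
n_ind <- 6
n_snp <- 4

# Toy dosage matrix: individuals in rows, SNPs in columns, coded 0/1/2
dosages <- matrix(
  sample(0:2, n_ind * n_snp, replace = TRUE),
  nrow = n_ind, ncol = n_snp,
  dimnames = list(paste0("G", 1:n_ind), paste0("snp", 1:n_snp))
)

# Proportion of homozygous loci per individual (dosage 0 or 2)
f_hom <- rowMeans(dosages != 1)

# Allele frequencies; p is taken as the frequency of the counted allele
# (an assumption for this sketch), q = 1 - p
p <- colMeans(dosages) / 2
q <- 1 - p

# Made-up additive (a) and dominance (d) effects, purely illustrative
a <- c(0.5, -0.2, 0.1, 0.3)
d <- c(0.1, 0.0, -0.05, 0.2)

# Allele substitution effect per SNP: alpha = a + d(q - p)
alpha <- a + d * (q - p)

# Placeholder relationship matrices (identity), only to show the list naming
A <- diag(n_ind)
D <- diag(n_ind)
dimnames(A) <- dimnames(D) <- list(rownames(dosages), rownames(dosages))
grms <- list(A = A, D = D)
```

In a real analysis the GRMs would be computed from marker data rather than set to the identity; the point here is only the naming of the list elements and the row/column labelling by genotype ID.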
Returns tidy results in a tibble with accuracy estimates for each rep-fold in a list-column "accuracyEstOut".
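To make the idea of a selection index accuracy concrete, here is a minimal sketch under stated assumptions: the index is a weighted sum of trait values, and accuracy is taken as the correlation between predicted and observed index values in a test fold. The trait names, the weights, and the use of a named weight vector are hypothetical; the function's own accuracy calculation may differ.

```r
set.seed(7)
gids   <- paste0("G", 1:10)
traits <- c("yield", "drymatter")        # hypothetical trait names

# Hypothetical index weights, named by trait
SIwts <- c(yield = 2, drymatter = 1)

# Simulated predicted and observed values for a test fold (genotypes x traits)
pred <- matrix(rnorm(20), nrow = 10, dimnames = list(gids, traits))
obs  <- pred + matrix(rnorm(20, sd = 0.5), nrow = 10)

# Index value per genotype: weighted sum across traits,
# with columns ordered to match the names of the weights
si_pred <- drop(pred[, names(SIwts)] %*% SIwts)
si_obs  <- drop(obs[, names(SIwts)] %*% SIwts)

# Accuracy as the correlation between predicted and observed index values
acc <- cor(si_pred, si_obs)
```

Indexing the trait columns by `names(SIwts)` guards against weights and traits being supplied in different orders.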
Other CrossVal:
runParentWiseCrossVal()