View source: R/association_functions.R
run_genomic_prediction | R Documentation |
Run genomic prediction given a single response variable (usually a phenotype)
using the BGLR function. Unlike other snpR functions,
this returns the resulting model directly, so overwrite with caution.
run_genomic_prediction(
x,
facets = NULL,
response,
iterations,
burn_in,
thin,
model = "BayesB",
interpolate = "bernoulli",
ncp = NULL,
ncp.max = 5,
par = FALSE,
verbose = FALSE,
...
)
x |
snpRdata object |
facets |
character, default NULL. Categorical metadata variables by
which to break up analysis. See |
response |
character. Name of the column containing the response variable of interest. Must match a column name in sample metadata. |
iterations |
numeric. Number of iterations to run the MCMC chain for. |
burn_in |
numeric. Number of burn in iterations to run prior to the MCMC chain. |
thin |
numeric. Number of iterations to discard between each recorded data point. |
model |
character, default "BayesB". Prediction model to use, see
description for the ETA argument in |
interpolate |
character, default "bernoulli". Interpolation method for missing data. Options:
. |
ncp |
numeric or NULL, default NULL. Used only if |
ncp.max |
numeric, default 5. Used only if |
par |
numeric or FALSE, default FALSE. If a number, specifies the number of processing cores to use across facet levels. Not used if there is only one facet level. |
verbose |
Logical, default FALSE. If TRUE, some progress updates will be printed to the console. |
... |
additional arguments passed to |
This function is provided as a wrapper to plug snpRdata objects into the
BGLR function in order to easily run genomic prediction
on a simple model where a single, sample-specific metadata variable is
provided as the response variable. To do so, this function formats the data
into a transposed "sn" format, as described in format_snps,
using the bernoulli method to interpolate missing genotypes. Several
different prediction models are available; see the documentation for the ETA
argument in BGLR for details. Defaults to the "BayesB"
model, which assumes a "spike-slab" prior for allele effects on phenotype
where most markers have a very small effect size and a few can have a much
larger effect.
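To make the interpolation step concrete, here is a minimal base R sketch of one plausible reading of bernoulli interpolation: each missing genotype is drawn as the sum of two Bernoulli trials using the allele frequency observed at that locus. The helper name interpolate_bernoulli and the samples-in-rows layout are assumptions made for illustration; the implementation in format_snps may differ in detail.

```r
# Hypothetical helper, not part of snpR: fill missing genotypes locus by locus.
# Assumed layout: rows are samples, columns are loci, and entries count copies
# of one allele (0, 1, or 2), with NA for missing genotypes.
interpolate_bernoulli <- function(geno) {
  for (j in seq_len(ncol(geno))) {
    missing <- is.na(geno[, j])
    if (!any(missing)) next
    # Allele frequency estimated from the observed genotypes at this locus.
    # If every genotype is missing, fall back to 0 (an arbitrary choice here).
    p <- if (all(missing)) 0 else mean(geno[!missing, j]) / 2
    # A diploid genotype is the sum of two Bernoulli(p) draws.
    geno[missing, j] <- rbinom(sum(missing), size = 2, prob = p)
  }
  geno
}

set.seed(42)
geno <- matrix(rbinom(60, size = 2, prob = 0.3), nrow = 10, ncol = 6)
geno[sample(length(geno), 8)] <- NA   # knock out a few genotypes
filled <- interpolate_bernoulli(geno)
```

Because the missing values are drawn at random, repeated runs without a fixed seed will produce slightly different input matrices for the prediction model.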
Unlike most snpR functions, this function does not support facets, since each
run can be very slow. Instead, an individual facet and facet level of
interest should be selected with subset_snpR_data. See examples.
See documentation for BGLR for more details and for a
full list of references.
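For orientation, a direct call to BGLR with a single "BayesB" marker term might look like the sketch below. The genotypes and phenotype are simulated, the chain settings are arbitrary, and the call is skipped if the BGLR package is not installed. How run_genomic_prediction maps its own iterations, burn_in, and thin arguments onto this call is not shown here and should not be inferred from this sketch.

```r
set.seed(1)
n <- 50    # number of samples (illustrative)
p <- 200   # number of markers (illustrative)

# Simulated genotype matrix: samples in rows, markers in columns.
X <- matrix(rbinom(n * p, size = 2, prob = 0.3), nrow = n)
# Dummy phenotype with no true genetic signal.
y <- rnorm(n)

fit <- NULL
if (requireNamespace("BGLR", quietly = TRUE)) {
  fit <- BGLR::BGLR(
    y = y,
    ETA = list(list(X = X, model = "BayesB")),  # one marker term, BayesB prior
    nIter = 1000, burnIn = 100, thin = 10,
    saveAt = file.path(tempdir(), "gp_sketch_"), # keep chain files out of the working directory
    verbose = FALSE
  )
  marker_effects <- fit$ETA[[1]]$b  # estimated marker effects
  fitted_values <- fit$yHat         # fitted values for each sample
}
```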
A list containing two parts:
x: The provided snpRdata object with effect sizes merged in.
models: Other model results, a list containing:
model: The model output from BGLR. See BGLR.
h2: Estimated heritability of the response variable.
predictions: A data.frame containing the provided phenotypes and the predicted Breeding Values (BVs) for those phenotypes.
William Hemstrom
Pérez, P., and de los Campos, G. (2014). Genetics.
# run and plot a basic prediction
## add some dummy phenotypic data.
dat <- stickSNPs
sample.meta(dat) <- cbind(weight = rnorm(ncol(stickSNPs)),
sample.meta(stickSNPs))
## run prediction
gp <- run_genomic_prediction(dat, response = "weight", iterations = 1000,
burn_in = 100, thin = 10)
## plot dummy phenotypes against their predicted Breeding Values.
# given that weight was randomly assigned, any apparent fit is overfitting!
with(gp$models$.base_.base$predictions, plot(phenotype, predicted_BV))
## fetch estimated loci effects
get.snpR.stats(gp$x, stats = "genomic_prediction")
## Not run:
# with facets, not run
gp <- run_genomic_prediction(gp$x, facets = "pop", response = "weight",
iterations = 1000, burn_in = 100, thin = 10)
get.snpR.stats(gp$x, facets = "pop", stats = "genomic_prediction")
## End(Not run)