gs_bayesian: Bayesian Cross Validation for Genomic Selection

View source: R/gs_bayesian.R

gs_bayesian    R Documentation

Bayesian Cross Validation for Genomic Selection

Description

Performs cross validation for genomic selection using Bayesian models.

Usage

gs_bayesian(
  Pheno,
  Geno,
  traits,
  folds,
  model = "BGBLUP",
  predictors = c("Env", "Line", "EnvxLine"),
  is_multitrait = FALSE,
  iterations_number = 1500,
  burn_in = 500,
  thinning = 5,
  seed = NULL,
  verbose = TRUE
)

Arguments

Pheno

(data.frame) The phenotypic data. Env and Line columns are required.

Geno

(matrix) The genotypic data. It must be a square matrix with the row and column names set to the line names in Pheno.

traits

(character) The names of the columns in Pheno to be used as traits.

folds

(list) A list of folds. Each fold is a named list with two entries: training, a vector of indices for the training set, and testing, a vector of indices for the testing set. This is the default format returned by the cv_* functions of the SKM library.

model

(character) (case insensitive) The model to be used. It supports the same models as bayesian_model, that is, "BGBLUP", "BRR", "Bayes_Lasso", "Bayes_A", "Bayes_B" or "Bayes_C". In multivariate analysis only "BGBLUP" and "BRR" can be used. "BGBLUP" by default.

predictors

(character) (case insensitive) The predictors to be used in the model. Provide at least one of the following options: "Env" for the environment effect, "Line" for the line effect and "EnvxLine" for the interaction between environment and line. c("Env", "Line", "EnvxLine") by default.

is_multitrait

(logical(1)) Should a multitrait analysis be performed? FALSE by default.

iterations_number

(numeric(1)) Number of iterations to fit the model. 1500 by default.

burn_in

(numeric(1)) Number of iterations to discard as burn-in at the beginning of model fitting. 500 by default.

thinning

(numeric(1)) Thinning interval, that is, the number of iterations between the samples kept when fitting the model. 5 by default.

seed

(numeric(1)) A value to be used as internal seed for reproducible results. NULL by default.

verbose

(logical(1)) Should the progress information be printed? TRUE by default.
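The input requirements described above can be sketched with base R alone. The toy data, the make_folds helper and the sanity checks below are hypothetical illustrations and not part of SKM; in practice the folds would normally come from one of the cv_* functions.

```r
# Toy phenotypic data: Env and Line columns are required by gs_bayesian().
set.seed(1)
lines <- paste0("L", 1:4)
Pheno <- data.frame(
  Env = rep(c("E1", "E2"), each = 4),
  Line = rep(lines, times = 2),
  Y = rnorm(8)
)

# Toy genotypic data: a square matrix whose row and column names are the
# line names found in Pheno (here simply an identity matrix).
Geno <- diag(length(lines))
dimnames(Geno) <- list(lines, lines)

# Hypothetical helper (not part of SKM): builds k folds in the documented
# format, a list of named lists with `training` and `testing` indices.
make_folds <- function(n, k) {
  fold_id <- sample(rep(seq_len(k), length.out = n))
  lapply(seq_len(k), function(i) {
    list(training = which(fold_id != i), testing = which(fold_id == i))
  })
}

folds <- make_folds(nrow(Pheno), k = 4)

# Basic sanity checks on the inputs before fitting any model.
stopifnot(
  all(c("Env", "Line") %in% colnames(Pheno)),
  nrow(Geno) == ncol(Geno),
  identical(rownames(Geno), colnames(Geno)),
  all(Pheno$Line %in% rownames(Geno))
)
```

Each element of folds can then be inspected with folds[[1]]$training and folds[[1]]$testing; every row index of Pheno appears in exactly one testing set.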

Value

A GSFastBayesian object with the following attributes:

  • Pheno: (data.frame) The phenotypic data.

  • Geno: (matrix) The genotypic data.

  • traits: (character) The traits' names.

  • is_multitrait: (logical(1)) Whether a multitrait analysis was performed.

  • Predictions: (data.frame) The predictions of cross validation. This data.frame contains the Trait, Fold, Line, Env, Predicted and Observed columns.

  • execution_time: (difftime) The execution time taken for the analysis.

  • folds: (list) The folds used in the analysis.

  • model: (BayesianModel) The model fitted.

  • model_name: (character(1)) The name of the model.

  • iterations_number: (numeric(1)) Number of iterations to fit the model.

  • burn_in: (numeric(1)) Number of iterations discarded as burn-in at the beginning of model fitting.

  • thinning: (numeric(1)) Thinning interval used when fitting the model.
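Because Predictions holds the Predicted and Observed values for every Trait and Fold, a per-fold accuracy summary is straightforward to compute. The following is a minimal sketch in base R; the data.frame is a synthetic stand-in for results$Predictions (the values are made up, only the column names come from the description above), and the choice of Pearson correlation and mean squared error is an illustrative assumption, not something prescribed by SKM.

```r
# Synthetic stand-in for results$Predictions (values are made up).
set.seed(42)
Predictions <- data.frame(
  Trait = "Y",
  Fold = rep(1:2, each = 5),
  Line = paste0("L", 1:10),
  Env = "E1",
  Observed = rnorm(10)
)
Predictions$Predicted <- Predictions$Observed + rnorm(10, sd = 0.5)

# Pearson correlation and mean squared error within each fold.
by_fold <- split(Predictions, Predictions$Fold)
summary_by_fold <- do.call(rbind, lapply(by_fold, function(fold_data) {
  data.frame(
    Fold = fold_data$Fold[1],
    Pearson = cor(fold_data$Predicted, fold_data$Observed),
    MSE = mean((fold_data$Predicted - fold_data$Observed)^2)
  )
}))

print(summary_by_fold)
```

With several traits, the same idea applies after splitting by both Trait and Fold.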

See Also

Other gs_models: gs_fast_bayesian()

Examples

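library(SKM)

# Example dataset with the phenotypic (Pheno) and genotypic (Geno) data
data(Maize)

# 5-fold cross validation over the rows of Pheno
folds <- cv_kfold(nrow(Maize$Pheno), k = 5)

results <- gs_bayesian(
  Maize$Pheno,
  Maize$Geno,

  traits = "Y",
  folds = folds,

  is_multitrait = FALSE,

  # Very small values so the example runs quickly
  iterations_number = 10,
  burn_in = 5,
  thinning = 5,

  seed = NULL,
  verbose = TRUE
)

print(results)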
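
The example above uses very small sampler settings so that it runs quickly. Under the usual convention that the first burn_in iterations are discarded and one of every thinning remaining iterations is kept (an assumption here; the exact bookkeeping inside SKM may differ slightly), the number of retained samples can be approximated as follows. retained_samples is a hypothetical helper, not part of SKM.

```r
# Hypothetical helper (not part of SKM): approximate number of posterior
# samples kept after discarding burn-in and applying thinning.
retained_samples <- function(iterations_number, burn_in, thinning) {
  stopifnot(burn_in < iterations_number, thinning >= 1)
  floor((iterations_number - burn_in) / thinning)
}

retained_samples(10, 5, 5)      # settings used in the example
retained_samples(1500, 500, 5)  # default values of gs_bayesian()
```

Under this assumption the example keeps roughly 1 sample, while the defaults keep roughly 200, so the example settings are only suitable for checking that the code runs.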

brandon-mosqueda/SKM documentation built on Feb. 8, 2025, 5:24 p.m.