gs_bayesian: Bayesian Cross Validation for Genomic Selection
In brandon-mosqueda/SKM: Sparse Kernels Methods

Description

This function performs cross validation for genomic selection using Bayesian models, with the training and testing sets given in `folds`.

Usage

gs_bayesian(
  Pheno,
  Geno,
  traits,
  folds,
  model = "BGBLUP",
  predictors = c("Env", "Line", "EnvxLine"),
  is_multitrait = FALSE,
  iterations_number = 1500,
  burn_in = 500,
  thinning = 5,
  seed = NULL,
  verbose = TRUE
)

Arguments

`Pheno`	(`data.frame`) The phenotypic data. `Env` and `Line` columns are required.
`Geno`	(`matrix`) The genotypic data. It must be a square matrix with the row and column names set to the line names in `Pheno`.
`traits`	(`character`) The names of the columns in `Pheno` to be used as traits.
`folds`	(`list`) A list of folds. Each fold is a named list with two entries: `training`, a vector of indices for the training set, and `testing`, a vector of indices for the testing set. Note that this is the default format of the `cv_*` functions of the SKM library.
`model`	(`character`) (case insensitive) The model to be used. It supports the same models as `bayesian_model`, that is, `"BGBLUP"`, `"BRR"`, `"Bayes_Lasso"`, `"Bayes_A"`, `"Bayes_B"` or `"Bayes_C"`. In multivariate analysis only `"BGBLUP"` or `"BRR"` can be used. `"BGBLUP"` by default.
`predictors`	(`character`) (case insensitive) The predictors to be used in the model. At least one of the following options must be given: `"Env"` for the environment effect, `"Line"` for the line effect and `"EnvxLine"` for the interaction between environment and line. `c("Env", "Line", "EnvxLine")` by default.
`is_multitrait`	(`logical(1)`) Should a multi-trait analysis be performed? `FALSE` by default.
`iterations_number`	(`numeric(1)`) Number of iterations to fit the model. 1500 by default.
`burn_in`	(`numeric(1)`) Number of initial iterations to discard as burn-in when fitting the model. 500 by default.
`thinning`	(`numeric(1)`) Thinning interval used when collecting samples from the fitted model. 5 by default.
`seed`	(`numeric(1)`) A value to be used as internal seed for reproducible results. `NULL` by default.
`verbose`	(`logical(1)`) Should the progress information be printed? `TRUE` by default.
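
The format requirements on `folds` and `Geno` described above can be checked before calling `gs_bayesian`. The following is a minimal sketch in base R on toy data. `make_folds` and `check_geno` are hypothetical helper names, not part of SKM, and the random k-fold assignment is only one possible way to build folds in the documented format.

```r
# Toy phenotypic data: 6 lines observed in 2 environments (illustrative only).
set.seed(1)
lines <- paste0("L", 1:6)
Pheno <- data.frame(
  Env = rep(c("E1", "E2"), each = 6),
  Line = rep(lines, times = 2),
  Y = rnorm(12)
)

# Toy square matrix with row and column names set to the line names.
Geno <- diag(6)
dimnames(Geno) <- list(lines, lines)

# Hypothetical helper: build k folds in the documented format, that is, a list
# where each fold is a named list with `training` and `testing` index vectors.
make_folds <- function(n, k) {
  assignment <- sample(rep(seq_len(k), length.out = n))
  lapply(seq_len(k), function(i) {
    list(
      training = which(assignment != i),
      testing = which(assignment == i)
    )
  })
}

# Hypothetical helper: check the documented requirements on Geno.
check_geno <- function(Geno, Pheno) {
  is.matrix(Geno) &&
    nrow(Geno) == ncol(Geno) &&
    identical(rownames(Geno), colnames(Geno)) &&
    all(unique(as.character(Pheno$Line)) %in% rownames(Geno))
}

folds <- make_folds(nrow(Pheno), k = 3)
geno_ok <- check_geno(Geno, Pheno)
```

With real data, the `cv_*` functions of SKM already return folds in this format, so a helper like `make_folds` is only needed for custom partitions.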

Value

A GSFastBayesian object with the following attributes:

Pheno: (data.frame) The phenotypic data.
Geno: (matrix) The genotypic data.
traits: (character) The traits' names.
is_multitrait: (logical(1)) Whether a multi-trait analysis was performed.
Predictions: (data.frame) The predictions of cross validation. This data.frame contains the Trait, Fold, Line, Env, Predicted and Observed columns.
execution_time: (difftime) The execution time taken for the analysis.
folds: (list) The folds used in the analysis.
model: (BayesianModel) The model fitted.
model_name: (character(1)) The name of the model.
iterations_number: (numeric(1)) Number of iterations to fit the model.
burn_in: (numeric(1)) Number of initial iterations discarded as burn-in.
thinning: (numeric(1)) Thinning interval used when collecting samples.
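
Because `Predictions` holds `Predicted` and `Observed` values per `Trait` and `Fold`, prediction accuracy can be summarised with ordinary data.frame operations. The sketch below uses a small mock data.frame with the documented columns rather than a real `gs_bayesian` result. The metrics chosen (Pearson correlation and mean squared error) are assumptions for illustration and are not necessarily the ones SKM reports.

```r
# Mock predictions with the documented columns. The values are made up for
# illustration; a real run would use results$Predictions instead.
predictions <- data.frame(
  Trait = "Y",
  Fold = rep(1:2, each = 4),
  Line = paste0("L", 1:8),
  Env = rep(c("E1", "E2"), times = 4),
  Predicted = c(1.0, 2.0, 3.0, 4.0, 2.0, 1.0, 4.0, 3.0),
  Observed = c(1.5, 2.5, 2.5, 4.5, 1.0, 2.0, 3.0, 4.0)
)

# Split by trait and fold, then compute one row of metrics per group.
groups <- split(
  predictions,
  list(predictions$Trait, predictions$Fold),
  drop = TRUE
)
summary_by_fold <- do.call(rbind, lapply(groups, function(g) {
  data.frame(
    Trait = g$Trait[1],
    Fold = g$Fold[1],
    Pearson = cor(g$Predicted, g$Observed),
    MSE = mean((g$Predicted - g$Observed)^2)
  )
}))
rownames(summary_by_fold) <- NULL

print(summary_by_fold)
```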

Examples

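library(SKM)

# Load the Maize example dataset
data(Maize)

# Build 5 cross validation folds over the rows of the phenotypic data
folds <- cv_kfold(nrow(Maize$Pheno), k = 5)

# Very few iterations are used here only to keep the example fast
results <- gs_bayesian(
  Maize$Pheno,
  Maize$Geno,
  traits = "Y",
  folds = folds,
  is_multitrait = FALSE,
  iterations_number = 10,
  burn_in = 5,
  thinning = 5,
  seed = NULL,
  verbose = TRUE
)

print(results)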
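
The MCMC settings in the example above are deliberately small so that it runs quickly. As a rough sketch, and assuming the common convention that one sample is kept every `thinning` iterations once the first `burn_in` iterations have been discarded (the exact rule used internally is not stated on this page), the number of retained samples can be estimated as follows. `retained_samples` is a hypothetical helper, not an SKM function.

```r
# Hypothetical helper: approximate number of posterior samples kept,
# assuming burn-in iterations are dropped and the rest are thinned.
retained_samples <- function(iterations_number, burn_in, thinning) {
  stopifnot(burn_in < iterations_number, thinning >= 1)
  floor((iterations_number - burn_in) / thinning)
}

default_samples <- retained_samples(1500, 500, 5)  # default arguments
example_samples <- retained_samples(10, 5, 5)      # settings of the example

print(c(default = default_samples, example = example_samples))
```

Under this assumption the defaults keep about 200 samples, while the example settings keep only one, which is enough to exercise the code but not to draw reliable inferences.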

brandon-mosqueda/SKM documentation built on Feb. 8, 2025, 5:24 p.m.