mixed_model: Fit a Mixed Model (lme4GS)
In brandon-mosqueda/SKM: Sparse Kernels Methods

mixed_model

R Documentation

Fit a Mixed Model (lme4GS)

Description

mixed_model() is a wrapper of the lme4GS::lmerUvcov() function to fit models for Genomic Selection. It only supports univariate models with a numeric response variable.

Usage

mixed_model(
  x,
  y,
  testing_indices = NULL,
  validate_params = TRUE,
  seed = NULL,
  verbose = TRUE
)

Arguments

`x`	(`list`) The predictor (independent) variable(s). A `list` of nested `list`s is expected, where each inner `list` is named and represents a predictor effect. Each inner `list` must have a field `x` with the square `matrix` of predictor variables.
`y`	(`numeric`) The response (dependent) variable. As this function only supports univariate analysis, a numeric vector is always expected. `y` can contain missing values (`NA`), which mark observations to be used as the testing set, together with the indices provided in the `testing_indices` parameter.
`testing_indices`	(`numeric`) The indices of the records to be used as the testing set, along with all records that contain missing values in `y`. `NULL` by default.
`validate_params`	(`logical(1)`) Should the parameters be validated? Setting this parameter to `FALSE` is not recommended because, if something fails, an uninformative error will be thrown. `TRUE` by default.
`seed`	(`numeric(1)`) A value to be used as internal seed for reproducible results. `NULL` by default.
`verbose`	(`logical(1)`) Should the progress information be printed? `TRUE` by default.
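
The expected shape of `x` can be illustrated with a small sketch in base R. The toy phenotype table, the marker matrix and every object name below are hypothetical, chosen only to show a named list of inner lists that each hold a square matrix with one row and one column per record; it does not call mixed_model().

```r
# Hypothetical toy data: 6 records from 3 lines evaluated in 2 environments
set.seed(1)
pheno <- data.frame(
  Line = factor(rep(c("L1", "L2", "L3"), times = 2)),
  Env = factor(rep(c("E1", "E2"), each = 3))
)

# Hypothetical marker matrix (3 lines x 10 markers) and a simple
# line-by-line relationship matrix derived from it (3 x 3)
markers <- matrix(rnorm(30), nrow = 3, dimnames = list(levels(pheno$Line), NULL))
geno <- tcrossprod(markers) / ncol(markers)

# Expand both effects to the record level, so each matrix is 6 x 6
ZLine <- model.matrix(~ 0 + Line, data = pheno)
KLine <- ZLine %*% geno %*% t(ZLine)
ZEnv <- model.matrix(~ 0 + Env, data = pheno)
KEnv <- ZEnv %*% t(ZEnv) / ncol(ZEnv)

# Named list of inner lists, each with a field `x` holding a square matrix
x <- list(
  Env = list(x = KEnv),
  Line = list(x = KLine)
)

# Every predictor matrix should be square, with one row per record
sapply(x, function(effect) nrow(effect$x) == ncol(effect$x))
```

Each matrix is built at the record level (not the line level) so that its dimensions match the length of `y`.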

Details

This function works similarly to the bayesian_model function. Unlike other models, to fit a mixed model and make predictions you have to provide the whole data set (training and testing records together) and the indices of the records to be used as the testing set (testing_indices). All records with NA values in y are also considered part of the testing set. After fitting the model, the predicted values can be obtained with the predict function, which takes no parameter other than the model; see the Examples section below for more information.
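
The rule for which records end up in the testing set can be sketched in base R. This is only an illustration of the behavior described above, assuming a toy response vector; it is not the package's internal code, and the names `testing` and `training` are hypothetical.

```r
# Hypothetical response with two missing values (records 2 and 5)
y <- c(5.1, NA, 4.8, 6.0, NA, 5.5)

# Indices explicitly requested as testing records
testing_indices <- c(1, 4)

# Testing set: the requested indices plus every record with NA in y
testing <- sort(union(testing_indices, which(is.na(y))))

# Training set: all remaining records
training <- setdiff(seq_along(y), testing)

testing   # 1 2 4 5
training  # 3 6
```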

Value

An object of class "MixedModel" that inherits from classes "Model" and "R6" with the fields:

fitted_model: The object returned by lme4GS::lmerUvcov() with the fitted model.
x: The final list used to fit the model.
y: The final vector or matrix used to fit the model.
execution_time: A difftime object with the total time taken to tune and fit the model.
removed_rows: A numeric vector with the records' indices (in the provided position) that were deleted and not taken into account in tuning or training.
removed_x_cols: A numeric vector with the columns' indices (in the provided positions) that were deleted and not taken into account in tuning or training.
...: Some other parameters for internal use.

Examples

library(SKM)

data(Maize)

# Data preparation of G
Line <- model.matrix(~ 0 + Line, data = Maize$Pheno)
LineGeno <- Line %*% Maize$Geno %*% t(Line)
Env <- model.matrix(~ 0 + Env, data = Maize$Pheno)
KEnv <- Env %*% t(Env) / ncol(Env)

# Define the model predictors
X <- list(
  Env = list(x = KEnv),
  LinexGeno = list(x = LineGeno)
)
y <- Maize$Pheno$Y

# Set seed for reproducible results
set.seed(2022)
folds <- cv_kfold(records_number = nrow(LineGeno), k = 5)

Predictions <- data.frame()

# Model training and predictions
for (i in seq_along(folds)) {
  cat("*** Fold:", i, "***\n")
  fold <- folds[[i]]

  # Model training
  model <- mixed_model(
    x = X,
    y = y,
    testing_indices = fold$testing
  )

  # Prediction of testing set
  predictions <- predict(model)

  # Predictions for the i-th fold
  FoldPredictions <- data.frame(
    Fold = i,
    Line = Maize$Pheno$Line[fold$testing],
    Env = Maize$Pheno$Env[fold$testing],
    Observed = y[fold$testing],
    Predicted = predictions$predicted
  )
  Predictions <- rbind(Predictions, FoldPredictions)
}

head(Predictions)
# Compute the summary of all predictions
summaries <- gs_summaries(Predictions)

# Summaries by Line
head(summaries$line)

# Summaries by Environment
summaries$env

# Summaries by Fold
summaries$fold
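
For readers who want to inspect per-fold accuracy by hand, the following base R sketch computes the mean squared error and Pearson correlation for each fold. The `toy_predictions` data frame and its values are hypothetical stand-ins for the `Predictions` object built above, and the metrics shown are not claimed to match what gs_summaries() reports.

```r
# Hypothetical predictions table with the same Fold/Observed/Predicted columns
toy_predictions <- data.frame(
  Fold = rep(1:2, each = 3),
  Observed = c(1, 2, 3, 2, 4, 6),
  Predicted = c(1.5, 2.5, 2.5, 2, 5, 5)
)

# Split by fold and compute one row of metrics per fold
by_fold <- do.call(
  rbind,
  lapply(split(toy_predictions, toy_predictions$Fold), function(fold) {
    data.frame(
      Fold = fold$Fold[1],
      MSE = mean((fold$Observed - fold$Predicted)^2),
      Pearson = cor(fold$Observed, fold$Predicted)
    )
  })
)

by_fold
```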

brandon-mosqueda/SKM documentation built on Feb. 8, 2025, 5:24 p.m.