mixed_model: Fit a Mixed Model (lme4GS)


mixed_model R Documentation

Fit a Mixed Model (lme4GS)

Description

mixed_model() is a wrapper of the lme4GS::lmerUvcov() function to fit models for Genomic Selection. It only supports univariate models with a numeric response variable.

Usage

mixed_model(
  x,
  y,
  testing_indices = NULL,
  validate_params = TRUE,
  seed = NULL,
  verbose = TRUE
)

Arguments

x

(list) The predictor (independent) variables. A list of named inner lists is expected, where each inner list represents one predictor effect. Each inner list must have a field x containing the square matrix of predictor variables for that effect.
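As an illustration of this structure, the following minimal sketch builds an x list with a single effect from simulated data. The marker matrix, the effect name Geno and the linear kernel are assumptions made for this example only; they are not taken from the package.

```r
# Simulated data (hypothetical): 6 records and 10 markers
set.seed(1)
n <- 6
markers <- matrix(rnorm(n * 10), nrow = n)

# Linear kernel: an n x n square matrix with one row and column per record
K <- tcrossprod(scale(markers)) / ncol(markers)

# Each named inner list is one predictor effect and holds its matrix in `x`
x <- list(
  Geno = list(x = K)
)

# The matrix of every effect must be square
stopifnot(nrow(x$Geno$x) == ncol(x$Geno$x))
```

More effects can be added as further named inner lists, as the Env and LinexGeno effects in the Examples section do.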

y

(numeric) The response (dependent) variable. Because this function only supports univariate analysis, a numeric vector is always expected. y can contain missing values (NA); those observations are used as the testing set, together with the indices provided in the testing_indices parameter.

testing_indices

(numeric) The indices of the records to be used as the testing set, in addition to all records with missing values in y. NULL by default.
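The rule described for y and testing_indices can be sketched in base R as follows. This only illustrates the documented behaviour with made-up values; it is not the package's internal code.

```r
# Hypothetical response vector: records 2 and 5 are missing
y <- c(5.1, NA, 4.8, 6.0, NA, 5.5)

# Indices explicitly requested as testing records
testing_indices <- c(1, 2)

# Effective testing set: requested indices plus every record with NA in y
testing_set <- sort(union(testing_indices, which(is.na(y))))
testing_set
# 1 2 5
```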

validate_params

(logical(1)) Should the parameters be validated? Setting this parameter to FALSE is not recommended because, if something fails, an uninformative error will be thrown. TRUE by default.

seed

(numeric(1)) A value to be used as internal seed for reproducible results. NULL by default.

verbose

(logical(1)) Should the progress information be printed? TRUE by default.

Details

This function works similarly to the bayesian_model function. Unlike other models, to fit a mixed model and make predictions you have to provide the whole data (for training and testing) together with the indices of the records to be used as the testing set (testing_indices). All records with NA values in y are also considered part of the testing set. After fitting the model, the predicted values can be obtained with the predict function, which takes no parameter other than the model; see the Examples section below for more information.
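The workflow described above can be wrapped in a small helper for a single hold-out split. This is a minimal sketch: fit_and_predict is a hypothetical function that is not part of SKM, and it assumes that y contains no NA values, so that the predictions correspond exactly to testing_indices.

```r
# Hypothetical helper (not part of SKM): fits the model on the whole data and
# returns observed and predicted values for the testing records.
# Assumes y has no NA values, so the testing set equals testing_indices.
fit_and_predict <- function(x, y, testing_indices) {
  # The whole data is provided; testing records are identified by index
  model <- SKM::mixed_model(
    x = x,
    y = y,
    testing_indices = testing_indices,
    verbose = FALSE
  )

  # predict() needs no parameter other than the fitted model
  predictions <- predict(model)

  data.frame(
    Index = testing_indices,
    Observed = y[testing_indices],
    Predicted = predictions$predicted
  )
}
```

With X and y prepared as in the Examples section, a call such as fit_and_predict(X, y, testing_indices = 1:20) would return one row per testing record.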

Value

An object of class "MixedModel" that inherits from classes "Model" and "R6" with the fields:

  • fitted_model: The fitted model object returned by lme4GS::lmerUvcov().

  • x: The final list used to fit the model.

  • y: The final vector or matrix used to fit the model.

  • execution_time: A difftime object with the total time taken to tune and fit the model.

  • removed_rows: A numeric vector with the indices of the records (in the provided positions) that were deleted and not taken into account in tuning or training.

  • removed_x_cols: A numeric vector with the indices of the columns (in the provided positions) that were deleted and not taken into account in tuning or training.

  • ...: Some other parameters for internal use.

See Also

predict.MixedModel()

Other models: bayesian_model(), deep_learning(), generalized_boosted_machine(), generalized_linear_model(), partial_least_squares(), random_forest(), support_vector_machine()

Examples

library(SKM)

data(Maize)

# Data preparation of G
Line <- model.matrix(~ 0 + Line, data = Maize$Pheno)
LineGeno <- Line %*% Maize$Geno %*% t(Line)
Env <- model.matrix(~ 0 + Env, data = Maize$Pheno)
KEnv <- Env %*% t(Env) / ncol(Env)

# Specify the model predictors
X <- list(
  Env = list(x = KEnv),
  LinexGeno = list(x = LineGeno)
)
y <- Maize$Pheno$Y

# Set seed for reproducible results
set.seed(2022)
folds <- cv_kfold(records_number = nrow(LineGeno), k = 5)

Predictions <- data.frame()

# Model training and predictions
for (i in seq_along(folds)) {
  cat("*** Fold:", i, "***\n")
  fold <- folds[[i]]

  # Model training
  model <- mixed_model(
    x = X,
    y = y,
    testing_indices = fold$testing
  )

  # Prediction of testing set
  predictions <- predict(model)

  # Predictions for the i-th fold
  FoldPredictions <- data.frame(
    Fold = i,
    Line = Maize$Pheno$Line[fold$testing],
    Env = Maize$Pheno$Env[fold$testing],
    Observed = y[fold$testing],
    Predicted = predictions$predicted
  )
  Predictions <- rbind(Predictions, FoldPredictions)
}

head(Predictions)
# Compute the summary of all predictions
summaries <- gs_summaries(Predictions)

# Summaries by Line
head(summaries$line)

# Summaries by Environment
summaries$env

# Summaries by Fold
summaries$fold

brandon-mosqueda/SKM documentation built on Feb. 8, 2025, 5:24 p.m.