predict.MixedModel: Predict Mixed model

View source: R/mixed_model_model.R

predict.MixedModel    R Documentation

Predict Mixed model

Description

Obtains the predictions using a fitted model object of class MixedModel.

Usage

## S3 method for class 'MixedModel'
predict(model, indices = NULL, format = "list")

Arguments

model

(Model) A fitted model object.

indices

(numeric) A numeric vector with the indices of the elements, among those used to fit the model, for which you want predictions. NULL by default, which uses the indices specified in testing_indices when the model was fitted, or those elements with NA values.

format

(character(1)) The expected format of the predictions. The available options are "list" and "data.frame". "data.frame" is more useful with multivariate models when you want the predicted values in a tabular structure. See the Value section below for more information. "list" by default.

Value

When format is "list"

For univariate models, a named list is returned with the element "predicted", which contains the predicted values. For categorical variables, the returned list also includes the element "probabilities", a data.frame with the predicted probabilities of each class.

For multivariate models, a named list is returned with one named element for each response variable in the fitted model. Each element of this list contains an inner list in the same format as described for the univariate case, so for categorical variables a data.frame with the predicted probabilities is included too.
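
To make the two list layouts concrete, the sketch below builds them by hand in base R. It does not call predict(); the values, the class labels ("high", "low") and the response names (Yield, Quality) are invented for illustration and only mimic the shapes described above.

```r
# Hypothetical values that only mimic the documented list shapes;
# they are not real output from predict().

# Univariate, numeric response: a single "predicted" element.
numeric_predictions <- list(predicted = c(5.1, 4.8, 6.3))

# Univariate, categorical response: "predicted" plus "probabilities".
categorical_predictions <- list(
  predicted = factor(c("low", "high", "low"), levels = c("high", "low")),
  probabilities = data.frame(
    high = c(0.2, 0.9, 0.4),
    low = c(0.8, 0.1, 0.6)
  )
)

# Multivariate: one named element per response, each an inner list
# in the univariate format.
multivariate_predictions <- list(
  Yield = numeric_predictions,
  Quality = categorical_predictions
)

# Accessing the pieces.
multivariate_predictions$Yield$predicted
multivariate_predictions$Quality$probabilities$high
```

With this layout, the predicted values of a given response are always reached as `result$<response>$predicted`, regardless of the response type.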

When format is "data.frame"

For univariate models, a data.frame is returned with the column predicted, which contains the predicted values. For categorical variables, a column for each class with the predicted probability of that class is included too.

For multivariate models, a data.frame is returned with one column per response variable, each holding the predicted values of that response.
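
The tabular layouts can be sketched the same way. The data below are invented, and `list_to_data_frame()` is a hypothetical helper that is not part of SKM. It only illustrates how a multivariate "list" result relates to the "data.frame" layout, under the assumption that each response contributes its "predicted" vector.

```r
# Hypothetical values that only mimic the documented data.frame shapes.

# Univariate, categorical response: "predicted" plus one column per class.
univariate_df <- data.frame(
  predicted = c("low", "high", "low"),
  high = c(0.2, 0.9, 0.4),
  low = c(0.8, 0.1, 0.6)
)

# Multivariate: one column of predicted values per response variable.
multivariate_df <- data.frame(
  Yield = c(5.1, 4.8, 6.3),
  Height = c(210, 195, 230)
)

# Illustrative helper (not part of SKM): collapse a multivariate
# "list" result into the tabular layout by keeping each "predicted".
list_to_data_frame <- function(predictions) {
  as.data.frame(lapply(predictions, function(response) response$predicted))
}

as_list <- list(
  Yield = list(predicted = c(5.1, 4.8, 6.3)),
  Height = list(predicted = c(210, 195, 230))
)
flattened <- list_to_data_frame(as_list)
```

Here `flattened` has the same columns as `multivariate_df`, which is why the "data.frame" format is convenient when several responses need to be compared or bound row-wise across folds.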

Examples

# Load the package that provides mixed_model(), cv_kfold() and the Maize data
library(SKM)

data(Maize)

# Data preparation of G
Line <- model.matrix(~ 0 + Line, data = Maize$Pheno)
LineGeno <- Line %*% Maize$Geno %*% t(Line)
Env <- model.matrix(~ 0 + Env, data = Maize$Pheno)
KEnv <- Env %*% t(Env) / ncol(Env)

# Identify the model
X <- list(
  Env = list(x = KEnv),
  LinexGeno = list(x = LineGeno)
)
y <- Maize$Pheno$Y

# Set seed for reproducible results
set.seed(2022)
folds <- cv_kfold(records_number = nrow(LineGeno), k = 5)

Predictions <- data.frame()

# Model training and predictions
for (i in seq_along(folds)) {
  cat("*** Fold:", i, "***\n")
  fold <- folds[[i]]

  # Model training
  model <- mixed_model(
    x = X,
    y = y,
    testing_indices = fold$testing
  )

  # Prediction of testing set
  predictions <- predict(model)

  # Predictions for the i-th fold
  FoldPredictions <- data.frame(
    Fold = i,
    Line = Maize$Pheno$Line[fold$testing],
    Env = Maize$Pheno$Env[fold$testing],
    Observed = y[fold$testing],
    Predicted = predictions$predicted
  )
  Predictions <- rbind(Predictions, FoldPredictions)
}

head(Predictions)
# Compute the summary of all predictions
summaries <- gs_summaries(Predictions)

# Summaries by Line
head(summaries$line)

# Summaries by Environment
summaries$env

# Summaries by Fold
summaries$fold
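
For readers who want to inspect per-fold accuracy by hand, the following base R sketch computes two common metrics from a table shaped like Predictions. It is not the implementation of gs_summaries(); the choice of mean squared error and Pearson correlation is an assumption, and the small data set is made up.

```r
# Made-up predictions table with the same Fold/Observed/Predicted
# columns built in the example above.
Predictions <- data.frame(
  Fold = c(1, 1, 1, 2, 2, 2),
  Observed = c(5.0, 6.0, 7.0, 4.0, 5.0, 6.0),
  Predicted = c(5.5, 6.5, 7.5, 4.0, 5.0, 6.0)
)

# Split the table by fold and compute one row of metrics per fold.
by_fold <- do.call(
  rbind,
  lapply(split(Predictions, Predictions$Fold), function(fold) {
    data.frame(
      Fold = fold$Fold[1],
      MSE = mean((fold$Observed - fold$Predicted)^2),
      Pearson = cor(fold$Observed, fold$Predicted)
    )
  })
)

by_fold
```

The same split-and-summarise pattern works for grouping by Line or Env when those columns are present.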

brandon-mosqueda/SKM documentation built on Feb. 8, 2025, 5:24 p.m.