predict.MixedModel: Predict Mixed model
In brandon-mosqueda/SKM: Sparse Kernels Methods

predict.MixedModel

R Documentation

Predict Mixed model

Description

Obtains predictions from a fitted model object of class `MixedModel`.

Usage

## S3 method for class 'MixedModel'
predict(model, indices = NULL, format = "list")

Arguments

`model`	(`Model`) A fitted model object.
`indices`	(`numeric`) A numeric vector with the indices of the elements used to fit the model for which you want predictions. `NULL` by default, which uses the indices specified in `testing_indices` when the model was fitted, or those elements with `NA` values.
`format`	(`character(1)`) The expected format of the predictions. The available options are `"list"` and `"data.frame"`. `"data.frame"` is more useful with multivariate models if you want the predicted values in a tabular structure. See the Value section below for more information. `"list"` by default.

Value

When `format` is `"list"`

For univariate models, a named list is returned with the element "predicted", which contains the predicted values. For categorical variables, the returned list also includes the element "probabilities", a data.frame with the predicted probabilities of each class.

For multivariate models, a named list is returned with a named element for each response variable in the fitted model. Each element of this list contains an inner list in the same format as described for the univariate case, so for categorical variables a data.frame with the predicted probabilities is included too.
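To make the list layout concrete, the following sketch builds lists by hand in base R that mimic the structure described above. They are not output from `predict()`: the response names `Yield` and `Disease`, the class labels, and all values are invented for illustration.

```r
# Hand-built stand-ins for the "list" format; all names and values are made up.

# Univariate, categorical response: "predicted" plus "probabilities"
univariate <- list(
  predicted = factor(c("low", "high", "high"), levels = c("low", "high")),
  probabilities = data.frame(
    low = c(0.7, 0.2, 0.4),
    high = c(0.3, 0.8, 0.6)
  )
)

# Multivariate: one named element per response variable, each holding an
# inner list in the univariate layout
multivariate <- list(
  Yield = list(predicted = c(5.1, 4.8, 6.0)),
  Disease = univariate
)

# Accessing the pieces
univariate$predicted
univariate$probabilities$high
multivariate$Yield$predicted
multivariate$Disease$probabilities
```

With this layout, the predictions for a single response are reached by name first and by element second, for example `multivariate$Disease$predicted`.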

When `format` is `"data.frame"`

For univariate models, a data.frame is returned with the column `predicted`, which contains the predicted values. For categorical variables, a column for each class with the predicted probability of that class is included too.

For multivariate models, a data.frame is returned with one column per response variable, holding the predicted values of that response.
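The tabular layout can be pictured with the following base R sketch. The data frames are built by hand to mimic the description above and are not produced by `predict()`; the response names `Yield` and `Disease`, the class labels, and all values are hypothetical.

```r
# Hand-built stand-ins for the "data.frame" format; names and values are made up.

# Univariate, categorical response: "predicted" plus one column per class
univariate_df <- data.frame(
  predicted = c("low", "high", "high"),
  low = c(0.7, 0.2, 0.4),
  high = c(0.3, 0.8, 0.6)
)

# Multivariate: one column per response variable
multivariate_df <- data.frame(
  Yield = c(5.1, 4.8, 6.0),
  Disease = c("low", "high", "high")
)

# In this toy example the predicted class is the one with the highest probability
class_columns <- c("low", "high")
most_likely <- class_columns[
  max.col(univariate_df[, class_columns], ties.method = "first")
]

univariate_df
multivariate_df
most_likely
```

A practical consequence of the tabular form is that predictions can be combined with identifier columns through `cbind()` or `merge()` without first unpacking a nested list.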

Examples

library(SKM)

data(Maize)

# Data preparation of G
Line <- model.matrix(~ 0 + Line, data = Maize$Pheno)
LineGeno <- Line %*% Maize$Geno %*% t(Line)
Env <- model.matrix(~ 0 + Env, data = Maize$Pheno)
KEnv <- Env %*% t(Env) / ncol(Env)

# Specify the model predictors
X <- list(
  Env = list(x = KEnv),
  LinexGeno = list(x = LineGeno)
)
y <- Maize$Pheno$Y

# Set seed for reproducible results
set.seed(2022)
folds <- cv_kfold(records_number = nrow(LineGeno), k = 5)

Predictions <- data.frame()

# Model training and predictions
for (i in seq_along(folds)) {
  cat("*** Fold:", i, "***\n")
  fold <- folds[[i]]

  # Model training
  model <- mixed_model(
    x = X,
    y = y,
    testing_indices = fold$testing
  )

  # Prediction of testing set
  predictions <- predict(model)

  # Predictions for the i-th fold
  FoldPredictions <- data.frame(
    Fold = i,
    Line = Maize$Pheno$Line[fold$testing],
    Env = Maize$Pheno$Env[fold$testing],
    Observed = y[fold$testing],
    Predicted = predictions$predicted
  )
  Predictions <- rbind(Predictions, FoldPredictions)
}

head(Predictions)

# Compute the summary of all predictions
summaries <- gs_summaries(Predictions)

# Summaries by Line
head(summaries$line)

# Summaries by Environment
summaries$env

# Summaries by Fold
summaries$fold
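As a rough cross-check on per-fold results, an error metric can also be computed by hand in base R. The sketch below uses a small invented data frame, `Toy`, with the same `Fold`, `Observed` and `Predicted` columns as `Predictions`. It is independent of `gs_summaries()` and makes no claim about which metrics that function reports.

```r
# Toy stand-in for the Predictions data frame built in the loop above;
# the values are invented for illustration.
Toy <- data.frame(
  Fold = c(1, 1, 2, 2),
  Observed = c(5.0, 6.0, 4.0, 7.0),
  Predicted = c(5.5, 5.5, 4.0, 6.0)
)

# Root mean squared error within each fold
rmse_by_fold <- sapply(
  split(Toy, Toy$Fold),
  function(fold_data) {
    sqrt(mean((fold_data$Observed - fold_data$Predicted)^2))
  }
)

rmse_by_fold
```

Replacing `Toy` with `Predictions` applies the same calculation to the cross-validation output, provided the response is numeric.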

brandon-mosqueda/SKM documentation built on Feb. 8, 2025, 5:24 p.m.