kFoldCrossValidation: Perform K-Fold Cross-Validation for corHMM Models

View source: R/corHMMDredge.R

kFoldCrossValidation    R Documentation

Perform K-Fold Cross-Validation for corHMM Models

Description

This function performs k-fold cross-validation on a fitted corHMM model by dividing the data into k subsets (folds) of approximately equal size. If a vector of lambda regularization values is provided, model performance is evaluated for each value. The function returns the cross-validation results and can optionally save the trained model for each fold.

Usage

kFoldCrossValidation(corhmm_obj, k, lambdas = NULL, return_model = TRUE,
                     save_model_dir = NULL, model_name = NULL)

Arguments

corhmm_obj

A corHMM object that contains a fitted model.

k

An integer specifying the number of folds to divide the data into for cross-validation.

lambdas

A numeric vector of lambda regularization values to evaluate during cross-validation. If NULL, the lambda value from corhmm_obj will be used. Defaults to NULL.

return_model

A logical value indicating whether to return the trained models for each fold. Defaults to TRUE.

save_model_dir

A character string specifying the directory to save the trained models for each fold. If NULL, models will not be saved. Defaults to NULL.

model_name

A character string specifying the base name for saved model files. If NULL, the default name "corhmm.obj" is used. Defaults to NULL.

Details

The function splits the data into k folds. For each fold, that fold is held out as the test set and a separate corHMM model is fitted to the remaining k - 1 folds. Model performance is then evaluated on the test set using a divergence-based score (Jensen-Shannon divergence): the states of the tips removed for that fold are estimated under the newly fitted model, and the score is computed from those estimates.
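The two ingredients described above, fold assignment and a Jensen-Shannon score, can be sketched in base R. This is only an illustration under stated assumptions, not the corHMM implementation: the helper names make_folds and js_divergence are hypothetical, folds are assigned at random, and the "predicted" probabilities are made-up values standing in for the estimates a refitted model would produce for a held-out tip.

```r
# Illustrative sketch only; not the code used inside corHMM.

# Randomly assign n_tips tip indices to k folds of near-equal size.
# (make_folds is a hypothetical helper, not a corHMM function.)
make_folds <- function(n_tips, k) {
  stopifnot(k >= 2, k <= n_tips)
  fold_id <- sample(rep(seq_len(k), length.out = n_tips))
  split(seq_len(n_tips), fold_id)
}

# Jensen-Shannon divergence (log base 2) between two discrete
# distributions; 0 means identical, 1 means non-overlapping support.
js_divergence <- function(p, q) {
  stopifnot(length(p) == length(q))
  p <- p / sum(p)
  q <- q / sum(q)
  m <- (p + q) / 2
  kl <- function(a, b) {
    keep <- a > 0  # terms with a == 0 contribute 0 by convention
    sum(a[keep] * log2(a[keep] / b[keep]))
  }
  0.5 * kl(p, m) + 0.5 * kl(q, m)
}

set.seed(1)
folds <- make_folds(n_tips = 23, k = 5)
lengths(folds)  # fold sizes differ by at most one tip

observed  <- c(1, 0, 0)        # held-out tip observed in state 1
predicted <- c(0.7, 0.2, 0.1)  # made-up estimated state probabilities
js_divergence(observed, predicted)
```

With a score of this kind, smaller values mean the estimated state probabilities sit closer to the observed state of the held-out tip.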

The function supports evaluating models across different lambda regularization values. If lambdas is provided, a model is trained and evaluated for each lambda value in each fold. The results, including the cross-validation scores and (if return_model = TRUE) the models, are returned as a list.
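As a rough illustration of how per-fold scores might be compared across lambda values, the sketch below averages a score matrix by row and picks the lambda with the smallest mean. The matrix layout, the variable names, and every number in it are assumptions made up for this example; they are not output from kFoldCrossValidation, and the sketch assumes that a lower divergence-based score is better.

```r
# Hypothetical per-fold scores: one row per lambda, one column per fold.
# All values are invented for illustration, not results of any analysis.
lambdas <- c(0, 0.25, 0.5, 0.75, 1)
fold_scores <- matrix(
  c(0.30, 0.28, 0.35, 0.31, 0.29,
    0.24, 0.26, 0.27, 0.25, 0.23,
    0.20, 0.22, 0.21, 0.19, 0.23,
    0.22, 0.25, 0.24, 0.21, 0.23,
    0.27, 0.29, 0.26, 0.28, 0.30),
  nrow = length(lambdas), byrow = TRUE,
  dimnames = list(paste0("lambda_", lambdas), paste0("fold_", 1:5))
)

# Average score across folds for each lambda.
average_score <- rowMeans(fold_scores)

# Assuming lower divergence is better, keep the lambda with the
# smallest average score.
best_lambda <- lambdas[which.min(average_score)]
best_lambda
```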

Value

A list of cross-validation results, including the following components:

models

A list of the trained models for each fold (if return_model = TRUE).

scores

A numeric vector of the cross-validation scores for each fold.

averageScore

The average cross-validation score across all folds.

Author(s)

James D. Boyko

Examples


## Load the example data and resolve polytomies in the tree
#data(primates)
#phy <- multi2di(primates[[1]])
#data <- primates[[2]]
## Fit a set of regularized models and keep the one with the lowest dAIC
#dredge_fits <- corHMMDredge(phy = phy, data = data,
#  max.rate.cat = 1, pen.type = "l1",
#  root.p = "maddfitz", lambda = 1, nstarts = 10, n.cores = 10)
#model_table <- getModelTable(dredge_fits)
#dredge_model <- dredge_fits[[which.min(model_table$dAIC)]]
## Cross-validate the selected model over five lambda values
#k_fold_res <- kFoldCrossValidation(dredge_model,
#  k = 5, lambdas = c(0, 0.25, 0.5, 0.75, 1))
#cv_table <- getCVTable(k_fold_res)


thej022214/corHMM documentation built on April 13, 2025, 9:37 a.m.