cross_val_score: Cross-Validation for Model Objects

cross_val_score    R Documentation

Description

Perform k-fold cross-validation with consistent scoring metrics across different model types. The scoring metric is automatically selected based on the detected task type.
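As a rough illustration of what k-fold cross-validation with task-based metric selection involves, the base-R sketch below assigns rows to folds, fits on the remaining rows, and scores the held-out fold. It is not the package's implementation: the helper names `pick_metric` and `kfold_sketch`, the use of `lm()` as the model, and the factor-versus-numeric rule are assumptions made for illustration only.

```r
# Hypothetical helper: pick a default metric from the type of y.
# Assumption: factor -> accuracy, numeric -> RMSE.
pick_metric <- function(y) {
  if (is.factor(y)) {
    function(true, pred) mean(true == pred)
  } else {
    function(true, pred) sqrt(mean((true - pred)^2))
  }
}

# Hypothetical helper: k-fold CV for a linear model on a numeric target.
kfold_sketch <- function(X, y, cv = 5, seed = 123) {
  set.seed(seed)
  n <- nrow(X)
  # Randomly assign each row to one of `cv` folds of near-equal size
  fold_id <- sample(rep(seq_len(cv), length.out = n))
  metric <- pick_metric(y)
  df <- data.frame(y = y, X)
  scores <- numeric(cv)
  for (k in seq_len(cv)) {
    test_idx <- which(fold_id == k)
    # Fit on all rows outside fold k, then predict the held-out rows
    fit <- lm(y ~ ., data = df[-test_idx, , drop = FALSE])
    pred <- predict(fit, newdata = df[test_idx, , drop = FALSE])
    scores[k] <- metric(df$y[test_idx], pred)
  }
  scores
}

set.seed(1)
X <- matrix(rnorm(200), ncol = 4)
y <- 2 * X[, 1] - 1.5 * X[, 2] + rnorm(50)
scores <- kfold_sketch(X, y, cv = 5)
print(scores)        # one RMSE per fold
print(mean(scores))  # average RMSE across folds
```

Fixing the seed inside the helper makes the fold assignment, and therefore the scores, repeatable across calls.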

Usage

cross_val_score(
  model,
  X,
  y,
  cv = 5,
  scoring = NULL,
  show_progress = TRUE,
  verbose = TRUE,
  cl = NULL,
  seed = 123,
  fit_params = NULL,
  predict_params = NULL
)

Arguments

model

A Model object

X

Feature matrix or data.frame

y

Target vector. Its type determines the task: a numeric vector is treated as regression, a factor as classification.

cv

Number of cross-validation folds (default: 5)

scoring

Scoring metric: "rmse", "mae", "accuracy", "f1", or a custom function with signature function(true, pred) returning a scalar. Default: auto-detected based on task type.

show_progress

Whether to show a progress bar in sequential mode (default: TRUE)

verbose

Logical flag enabling verbose messages in parallel mode (default: TRUE)

cl

Optional number of clusters for parallel processing (default: NULL). When cl is used for parallel execution, custom scoring functions must be self-contained (no dependencies on the calling environment).

seed

Random seed for reproducibility (default: 123)

fit_params

A list of additional arguments passed to model$fit()

predict_params

A list of additional arguments passed to model$predict()
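A custom scoring function only needs to accept the true and predicted values and return a single number. The sketch below defines a mean absolute percentage error scorer and contrasts it with a version that relies on a variable from the calling environment, which is the pattern to avoid when cl is set. The names `mape`, `mape_fragile`, `eps`, and `eps_global` are illustrative and not part of the package.

```r
# Self-contained scorer: everything it needs is defined inside the body.
mape <- function(true, pred) {
  eps <- 1e-8  # guard against division by zero
  mean(abs((true - pred) / pmax(abs(true), eps))) * 100
}

# Fragile scorer: `eps_global` lives in the calling environment and may not
# be available when the function is evaluated on parallel workers.
eps_global <- 1e-8
mape_fragile <- function(true, pred) {
  mean(abs((true - pred) / pmax(abs(true), eps_global))) * 100
}

true <- c(100, 200, 400)
pred <- c(110, 180, 400)
print(mape(true, pred))  # a single numeric value, in percent
```

Such a function would then be supplied through the scoring argument, for example cross_val_score(mod, X, y, cv = 5, scoring = mape).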

Value

Numeric vector of cross-validation scores, one per fold

Examples

## Not run: 
library(glmnet)
X <- matrix(rnorm(100), ncol = 4)
y <- 2*X[,1] - 1.5*X[,2] + rnorm(25)  # numeric -> regression

mod <- Model$new(glmnet::glmnet)
(cv_scores <- cross_val_score(mod, X, y, cv = 5))  # auto-uses RMSE
mean(cv_scores)  # Average RMSE

# Ridge regression (alpha = 0) with a fixed lambda
cross_val_score(mod, X, y,
                fit_params     = list(alpha = 0, lambda = 0.1),
                predict_params = list(type = "response"))

# Elastic net (alpha = 0.5) with a fixed lambda
cross_val_score(mod, X, y,
                fit_params     = list(alpha = 0.5, lambda = 0.1),
                predict_params = list(type = "response"))

# Custom scoring: R-squared
r2 <- function(true, pred) {
  ss_res <- sum((true - pred)^2)
  ss_tot <- sum((true - mean(true))^2)
  1 - ss_res / ss_tot
}

(cv_scores4 <- cross_val_score(mod, X, y, cv = 5, scoring = r2))
mean(cv_scores4)  # Average R²

# Classification with accuracy scoring
data(iris)
X_class <- iris[, 1:4]
y_class <- iris$Species  # factor -> classification
mod2 <- Model$new(e1071::svm)
(cv_scores2 <- cross_val_score(mod2, X_class, y_class, cv = 5))  # auto-uses accuracy
mean(cv_scores2)  # Average accuracy

# Binary classification with F1 scoring and a polynomial kernel
iris_bin <- iris[iris$Species != "virginica", ]
X_bin <- iris_bin[, 1:4]
y_bin <- droplevels(iris_bin$Species)  # drop the unused "virginica" level
(cv_scores3 <- cross_val_score(mod2, X_bin, y_bin, cv = 3,
                               scoring = "f1",
                               fit_params = list(kernel = "polynomial")))
mean(cv_scores3)  # Average F1

## End(Not run)


unifiedml documentation built on May 5, 2026, 9:06 a.m.