mer_cvrisk: Cluster-sensitive Cross-Validation


mer_cvrisk    R Documentation

Cluster-sensitive Cross-Validation

Description

Cross-validated estimation of the empirical risk for hyper-parameter selection. Folds are created in a cluster-sensitive way, so that splitting the data into training and test sets respects the cluster structure.

Usage

mer_cvrisk(object, folds, no_of_folds, cores = 1)

Arguments

object

an object of class mermboost.

folds

a weight matrix with the number of rows equal to the number of observations. The number of columns corresponds to the number of cross-validation runs. It can be computed using the function cv.

no_of_folds

the number of folds. If given, the function creates the folds itself, taking the cluster structure into account.

cores

the number of cores; passed on to mclapply for parallel computing.

Details

The number of boosting iterations is a hyper-parameter of the boosting algorithms implemented in this package. This function computes honest, i.e., cross-validated, estimates of the empirical risk for different values of the stopping parameter mstop, which can be used to choose an appropriate number of boosting iterations.
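To illustrate the idea of choosing the stopping iteration from cross-validated risks, here is a minimal base R sketch. It does not use mermboost internals: the risk matrix is simulated toy data, and the names risk, avg_risk and opt_m are purely illustrative. It assumes one row per fold and one column per candidate number of iterations.

```r
# Toy risk matrix (simulated, not produced by mermboost):
# one row per fold, one column per boosting iteration 0..100.
set.seed(42)
m <- 0:100
risk <- t(replicate(10, (m - 40)^2 / 400 + 5 + rnorm(length(m), sd = 0.1)))
colnames(risk) <- m

# Average the out-of-sample risk over folds for each iteration ...
avg_risk <- colMeans(risk)

# ... and pick the iteration with the smallest average risk.
opt_m <- as.integer(names(which.min(avg_risk)))
```

The chosen iteration is simply the minimiser of the fold-averaged risk curve; whether the package computes it in exactly this way is an assumption of this sketch.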

This function uses the cluster identifier held in the mermboost object to split the data into cluster-sensitive folds if the argument no_of_folds is given. As this might lead to imbalanced splits, the 1/0 matrix of folds can instead be supplied manually via the folds argument.
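A manually built fold matrix could look like the following base R sketch. The helper make_cluster_folds is hypothetical and not part of mermboost, and the cluster vector is toy data. The sketch assumes the weight convention in which 1 marks a training observation and 0 a held-out observation; check this against the folds produced by the package before relying on it.

```r
# Hypothetical helper (not part of mermboost): build a 1/0 fold matrix in
# which all observations of a cluster share the same train/test status.
make_cluster_folds <- function(cluster, k) {
  ids <- unique(cluster)
  stopifnot(k >= 2, k <= length(ids))
  # Randomly assign each cluster (not each observation) to one of k folds.
  fold_of_id <- sample(rep_len(seq_len(k), length(ids)))
  names(fold_of_id) <- as.character(ids)
  # Look up the fold of each observation via its cluster.
  fold_of_obs <- fold_of_id[as.character(cluster)]
  # Column j: 0 for observations held out in fold j, 1 otherwise
  # (assumed convention: 1 = training, 0 = test).
  sapply(seq_len(k), function(j) as.integer(fold_of_obs != j))
}

set.seed(1)
# Toy cluster identifier: 27 clusters with 4 observations each.
cluster <- rep(paste0("S", 1:27), each = 4)
folds <- make_cluster_folds(cluster, k = 9)

# Number of held-out observations per fold, to check the balance.
colSums(folds == 0)
```

Because whole clusters are assigned to folds, no cluster is ever split between training and test data. A matrix of this shape, built from the model's actual cluster variable, is the kind of object that would be passed via the folds argument.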

Value

An object of class mer_cv, containing the k folds as a matrix, the corresponding estimates of the empirical risk, their average, and the resulting optimal stopping iteration. plot and mstop methods are available.

Examples

data(Orthodont)

mod <- mermboost(distance ~ bbs(age, knots = 4) + bols(Sex) + (1 | Subject),
                 data = Orthodont, family = gaussian,
                 control = boost_control(mstop = 100))

# let mermboost do the cluster-sensitive cross-validation for you
norm_cv <- mer_cvrisk(mod, no_of_folds = 10)
opt_m <- mstop(norm_cv)
plot(norm_cv)


mermboost documentation built on April 4, 2025, 1:41 a.m.