cv_random_strata: Stratified random cross validation folds generation

View source: R/utils.R

cv_random_strata    R Documentation

Stratified random cross validation folds generation

Description

Generates folds for stratified random cross validation, given the number of folds and the proportion of records to use for testing. In each fold, a sample of the specified testing proportion is drawn without replacement to form the testing set, and all remaining records form the training set. Sampling is stratified so that each category in data appears in the training and testing sets of every fold in approximately the same proportion as in data as a whole.
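The procedure described above can be sketched in base R. This is only an illustration of the idea, not the SKM implementation: the helper name sketch_random_strata is hypothetical, and rounding the per-category testing count with round() is an assumption that the package may handle differently.

```r
# Hypothetical helper, not part of SKM: stratified random folds in base R.
sketch_random_strata <- function(data, folds_number = 5, testing_proportion = 0.2) {
  # Group the record indices by category, ignoring unused factor levels
  indices_by_class <- split(seq_along(data), data, drop = TRUE)

  lapply(seq_len(folds_number), function(fold) {
    # Within each category, draw the testing share without replacement
    testing <- unlist(lapply(indices_by_class, function(class_indices) {
      # Assumption: the per-category count is rounded to the nearest integer
      n_testing <- round(length(class_indices) * testing_proportion)
      class_indices[sample.int(length(class_indices), n_testing)]
    }), use.names = FALSE)

    # Everything not drawn for testing goes to training
    list(
      training = setdiff(seq_along(data), testing),
      testing = sort(testing)
    )
  })
}

set.seed(1)
data <- c(rep("A", 10), rep("B", 20), rep("C", 30))
folds <- sketch_random_strata(data, folds_number = 5, testing_proportion = 0.2)
# Category counts in the testing set of fold 1
table(data[folds[[1]]$testing])
```

With 10, 20 and 30 records per category and a testing proportion of 0.2, every testing set in this sketch holds 2, 4 and 6 records respectively. Each fold is drawn independently, so the same record may appear in the testing sets of several folds.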

Usage

cv_random_strata(data, folds_number = 5, testing_proportion = 0.2)

Arguments

data

(vector) The categorical data considered to stratify the folds.

folds_number

(numeric(1)) The number of folds. 5 by default.

testing_proportion

(numeric(1)) The proportion of elements to be included in the testing set in each fold. 0.2 by default.

Value

A list with folds_number elements. Each element is a named list with two elements: training, which holds the indices of the records in the training set, and testing, which holds the indices of the records in the testing set. Within each fold, the training and testing sets are mutually exclusive and together cover every record.
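The partition property stated above can be checked with a few lines of base R. The helper is_valid_partition is hypothetical and not part of SKM. To stay self-contained, the sketch uses hand-built folds in the documented shape instead of calling cv_random_strata.

```r
# Hypothetical helper, not part of SKM: checks one fold against n records.
is_valid_partition <- function(fold, n) {
  # Mutually exclusive: no index appears in both sets
  no_overlap <- length(intersect(fold$training, fold$testing)) == 0
  # Exhaustive: the two sets together cover indices 1..n
  covers_all <- setequal(c(fold$training, fold$testing), seq_len(n))
  no_overlap && covers_all
}

# Hand-built fold over 10 records, in the shape described above
fold <- list(training = c(1, 2, 4, 5, 6, 8, 9, 10), testing = c(3, 7))
is_valid_partition(fold, 10)

# A fold that places record 3 in both sets fails the check
bad_fold <- list(training = 1:8, testing = c(3, 9, 10))
is_valid_partition(bad_fold, 10)
```

The same check could be applied to every fold of a real result with something like all(sapply(folds, is_valid_partition, n = length(data))).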

Examples

## Not run: 
# Generate example categorical data
data <- c(rep("A", 10), rep("B", 20), rep("C", 30))
folds <- cv_random_strata(data, 5, 0.2)
# Indices of training set in fold 1
folds[[1]]$training
# Indices of testing set in fold 1
folds[[1]]$testing
# Verify fold 1 is balanced in training
table(data[folds[[1]]$training])
# Verify fold 1 is balanced in testing
table(data[folds[[1]]$testing])
# Verify fold 2 is balanced in training
table(data[folds[[2]]$training])
# Verify fold 2 is balanced in testing
table(data[folds[[2]]$testing])

folds <- cv_random_strata(iris$Species, 10, 0.5)
# List with indices of training and testing of fold 1
folds[[1]]
# List with indices of training and testing of fold 2
folds[[2]]
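# List with indices of training and testing of fold 3
folds[[3]]
# ...
# List with indices of training and testing of fold 10 (the last fold)
folds[[10]]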

## End(Not run)


brandon-mosqueda/SKM documentation built on Feb. 8, 2025, 5:24 p.m.