cv_kfold_strata | R Documentation |
Generates folds for stratified k-fold cross validation: k mutually exclusive folds are generated and, in each round, training is done with k − 1 folds and testing with the remaining one, so every individual is part of a testing set exactly once. Given a categorical variable, this type of cross validation keeps the proportion of elements of each class approximately the same in every fold, so it is a good option when folds must be balanced across classes.
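To make the idea concrete, the following is a minimal base R sketch of one way stratified folds could be built. It is an illustration only and is not taken from the package source: the helper name `stratified_folds_sketch` and its round-robin assignment are assumptions, and the real `cv_kfold_strata` may assign records differently.

```r
# Hypothetical helper: NOT the package's implementation, only an illustration.
stratified_folds_sketch <- function(classes, k = 5) {
  n <- length(classes)
  fold_id <- integer(n)
  # Handle each class separately so every fold gets its share of each class
  for (idx in split(seq_len(n), classes)) {
    # Shuffle the positions of this class (sample.int is safe for a
    # class with a single record)
    shuffled <- idx[sample.int(length(idx))]
    # Deal the shuffled records to folds 1..k in round-robin order
    fold_id[shuffled] <- rep_len(seq_len(k), length(shuffled))
  }
  # Return one named list per fold, mirroring the documented return shape
  lapply(seq_len(k), function(i) {
    list(
      training = which(fold_id != i),
      testing = which(fold_id == i)
    )
  })
}

set.seed(1)
classes <- c(rep("A", 10), rep("B", 20), rep("C", 30))
folds <- stratified_folds_sketch(classes, k = 5)

# Class counts in the testing set of fold 1
table(classes[folds[[1]]$testing])
```

With 10, 20 and 30 records per class and k = 5, each testing set in this sketch holds 2, 4 and 6 records of classes A, B and C. When a class size is not a multiple of k, the round-robin step gives the extra records to the lowest-numbered folds, so fold sizes can differ by a few records.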
cv_kfold_strata(data, k = 5)
data |
( |
k |
( |
A list with k elements, where each element is a named list with the elements training, which includes the indices of the records in the training set, and testing, which includes the indices of the records in the testing set. The training and testing sets of each fold are exhaustive and mutually exclusive.
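The properties stated above can be checked on any object with this shape. The sketch below uses a small hand-made list of folds rather than the output of `cv_kfold_strata`, so it runs without the package; the helper name `is_valid_fold` is hypothetical.

```r
# Hypothetical helper: checks that one fold splits records 1..n into
# training and testing sets with no overlap and nothing left out
is_valid_fold <- function(fold, n) {
  no_overlap <- length(intersect(fold$training, fold$testing)) == 0
  covers_all <- setequal(c(fold$training, fold$testing), seq_len(n))
  no_overlap && covers_all
}

# Hand-made folds for 6 records and k = 3, in the documented shape
folds <- list(
  list(training = c(3, 4, 5, 6), testing = c(1, 2)),
  list(training = c(1, 2, 5, 6), testing = c(3, 4)),
  list(training = c(1, 2, 3, 4), testing = c(5, 6))
)

# TRUE when every fold is exhaustive and mutually exclusive
all_valid <- all(vapply(folds, is_valid_fold, logical(1), n = 6))

# TRUE when every record appears in exactly one testing set
tested_once <- all(table(unlist(lapply(folds, `[[`, "testing"))) == 1)
```

The same two checks should apply unchanged to the list returned by `cv_kfold_strata`, with `n` set to the length of the input variable.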
## Not run:
# Generates 5 folds with 12 elements (60 / 5) in each testing set
data <- c(rep("A", 10), rep("B", 20), rep("C", 30))
folds <- cv_kfold_strata(data, 5)
# Indices of training set in fold 1
folds[[1]]$training
# Indices of testing set in fold 1
folds[[1]]$testing
# Verify fold 1 is balanced in training
table(data[folds[[1]]$training])
# Verify fold 1 is balanced in testing
table(data[folds[[1]]$testing])
# Verify fold 2 is balanced in training
table(data[folds[[2]]$training])
# Verify fold 2 is balanced in testing
table(data[folds[[2]]$testing])
folds <- cv_kfold_strata(iris$Species, 30)
# List with indices of training and testing of fold 1
folds[[1]]
# List with indices of training and testing of fold 2
folds[[2]]
folds[[3]]
# ...
folds[[30]]
## End(Not run)