resample_cv: Generate data resamples using cross validation

resample_cv    R Documentation

Generate data resamples using cross validation

Description

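Split a data.frame into k cross-validation folds, optionally stratified by one or more columns and optionally repeated n times, and return the matching training and validation resamples.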

Usage

resample_cv(data, ..., k = 3, n = 1)

Arguments

data

data.frame, the data to resample.

...

unquoted names of the columns of data to stratify by. These are usually discrete variables.

k

integer, the number of cross-validation folds.

n

integer, the number of times to repeat the creation of k folds; n > 1 performs repeated cross-validation.
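
To make the roles of k, n and the stratification columns concrete, here is a minimal base R sketch of stratified, repeated fold assignment. It is not the implementation used by resample_cv(); the helper name assign_folds and its arguments are hypothetical and exist only for this illustration.

```r
# Hypothetical helper, NOT the joml implementation: assigns each row to one of
# k folds, spreading the rows of each stratum evenly across the folds.
assign_folds <- function(n_rows, k = 3, strata = NULL) {
  if (is.null(strata)) strata <- rep(1L, n_rows)
  folds <- integer(n_rows)
  for (idx in split(seq_len(n_rows), strata)) {
    # shuffle the rows of this stratum, then deal them out to folds 1..k in turn
    shuffled <- idx[sample.int(length(idx))]
    folds[shuffled] <- rep_len(seq_len(k), length(idx))
  }
  folds
}

set.seed(1)
# n = 2 repetitions of k = 4 folds, stratified by gear
reps <- lapply(1:2, function(r) {
  assign_folds(nrow(mtcars), k = 4, strata = mtcars$gear)
})

# For a given fold, the validation rows are those assigned to it;
# all other rows form the training set
fold_id    <- reps[[1]]
val_rows   <- which(fold_id == 1)
train_rows <- which(fold_id != 1)

# Within a stratum, the fold sizes differ by at most one row
table(fold_id[mtcars$gear == 4])
```

Each repetition reshuffles the rows before dealing them out, so repeated cross-validation yields n different partitions of the same data.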

Value

A tibble with columns

  • train : an object of class modelr::resample. It contains a pointer to data and the indexes of the rows that are in the training set. To extract the training set, use as.data.frame(); to extract the row indexes, use as.integer().

  • val : an object of class modelr::resample holding the validation set, i.e. the rows of the fold that is left out of the training set.

  • fold : integer, the fold index.

  • repet : integer, the repetition index.
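
The two extraction methods mentioned for train and val can be tried on a resample object built by hand. This sketch assumes the modelr package is installed; the row indexes are arbitrary and chosen only for illustration.

```r
# Assumes modelr is installed; the indexes c(1, 3, 5) are arbitrary.
if (requireNamespace("modelr", quietly = TRUE)) {
  r <- modelr::resample(mtcars, c(1L, 3L, 5L))
  idx   <- as.integer(r)       # row indexes stored in the resample object
  train <- as.data.frame(r)    # the corresponding rows of mtcars
  print(idx)
  print(dim(train))
}
```

The resample object only stores the indexes alongside a reference to the data, so the rows are copied only when as.data.frame() is called.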

Examples

resample_cv(mtcars, k=3)
resample_cv(mtcars, k=3, n=2)

# stratified cross-validation: compare unstratified and stratified folds
rs  <- resample_cv(mtcars, k=4)
rss <- resample_cv(mtcars, gear, k=4)
# count the rows with gear==4 in each training set
sapply(rs$train, function(x) {sum(as.data.frame(x)$gear==4)})
# = without stratification, the count varies from fold to fold
sapply(rss$train, function(x) {sum(as.data.frame(x)$gear==4)})
# = with stratification by gear, the count is stable across folds

jiho/joml documentation built on Dec. 6, 2023, 5:50 a.m.