cv_base: Resamples for K-fold cross-validation with stratification by...


View source: R/k_fold.R

Description

Creates resamples for K-fold cross-validation stratified by target variable.

Usage

cv_base(data, y, nfolds = 5L, probs = seq(0, 1, length.out = 11))

Arguments

data

data.table with target variable.

y

Target variable name (character).

nfolds

Number of folds (min 2, max 20).

probs

Numeric vector of probabilities for quantile binning with values in [0, 1] range.

Details

For a numeric target, quantile binning is used for stratification. For a character or categorical target, resampling is performed within categories. For a target with a very skewed distribution (e.g. financial data where 99% of the values are 0), probs can be a vector such as c(0, seq(0.99, 1, length.out = 10)).
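The quantile-binning idea described above can be illustrated with a minimal base R sketch. It is not the package's implementation: the simulated target y, the variables breaks, strata and fold_id, and the way folds are dealt out within each bin are all assumptions made for illustration.

```r
# Illustrative sketch only; cv_base() may work differently internally.
set.seed(42)

# Simulated skewed numeric target: 990 zeros and 10 positive values.
y <- c(rep(0, 990), rexp(10, rate = 0.1))

# Probabilities suggested in the text for very skewed targets.
probs <- c(0, seq(0.99, 1, length.out = 10))

# Quantile break points; unique() drops duplicates caused by ties.
breaks <- unique(quantile(y, probs = probs, names = FALSE))

# Assign each observation to a quantile bin (its stratum).
strata <- cut(y, breaks = breaks, include.lowest = TRUE)

# Within each stratum, spread observations over folds 1..nfolds at random.
nfolds <- 5L
fold_id <- ave(seq_along(y), strata, FUN = function(i) {
  ids <- rep_len(sample(nfolds), length(i))  # balanced fold labels
  ids[sample.int(length(i))]                 # shuffled within the stratum
})

# Cross-tabulate strata against folds to inspect the balance.
print(table(strata, fold_id))
```

With these probs, all zeros land in a single bin and are split evenly across folds, while the rare positive values are binned separately, so they are not left to chance placement among the zeros.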

Value

data.table with nfolds columns. Each column is an indicator variable in which 1 marks the observations that belong to the validation dataset for that fold (stratified by target).
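As a rough illustration of this output layout, the sketch below builds indicator columns from a made-up fold assignment and uses one column to pick training rows. It uses a base R data.frame rather than a data.table to stay self-contained. The column names fold_1, fold_2, ... are hypothetical, and the check that each observation is a validation row in exactly one fold assumes standard K-fold behaviour, which the documentation above does not state explicitly.

```r
# Hypothetical fold assignment for 10 observations and 3 folds.
fold_id <- c(1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L)
nfolds <- 3L

# One 0/1 column per fold: 1 marks validation rows, 0 marks training rows.
indicators <- as.data.frame(
  lapply(seq_len(nfolds), function(k) as.integer(fold_id == k))
)
names(indicators) <- paste0("fold_", seq_len(nfolds))  # assumed column names

# Under standard K-fold, each observation is validated exactly once.
stopifnot(all(rowSums(indicators) == 1))

# Training rows for the first fold are those with 0 in its column.
train_idx <- which(indicators$fold_1 == 0)
print(train_idx)
```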

Examples

library(data.table)  # provides as.data.table()
cv_base(as.data.table(iris), "Species")

statist-bhfz/resampleR documentation built on Sept. 2, 2019, 8:14 p.m.