resample_split: Generate train-val splits of the data

View source: R/resample_split.R

resample_split    R Documentation

Description

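Split a data.frame into a training set and a validation set, optionally stratified by one or more columns, and optionally repeat the split several times.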

Usage

resample_split(data, ..., p = 0.8, n = 1)

Arguments

data

data.frame, the data to resample.

...

unquoted names of the columns of data to stratify by. These are usually discrete variables.

p

numeric in [0,1], the proportion of observations to use for training.

n

integer, number of repetitions of the split.
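
To illustrate how p and the stratification columns interact, here is a minimal base R sketch of a stratified split. The helper stratified_train_idx() is hypothetical and is not part of joml; the rounding rule it uses is an assumption, and the package's actual implementation may differ.

```r
# Hypothetical helper, not part of joml: draw training row indices so that
# each stratum contributes (about) a proportion p of its rows.
stratified_train_idx <- function(data, strata, p = 0.8) {
  # group row indices by the value of the stratification column
  groups <- split(seq_len(nrow(data)), data[[strata]])
  # within each group, sample round(p * group size) rows (assumed rounding rule)
  picked <- lapply(groups, function(idx) {
    k <- round(p * length(idx))
    idx[sample.int(length(idx), k)]
  })
  sort(unlist(picked, use.names = FALSE))
}

set.seed(1)
train_idx <- stratified_train_idx(mtcars, "gear", p = 0.5)
train <- mtcars[train_idx, ]
val   <- mtcars[-train_idx, ]   # validation = rows not selected for training

# mtcars has 12 cars with gear == 4, so the training set gets 6 of them
sum(train$gear == 4)
```

Because sampling happens within each level of gear, the count of each level in the training set does not depend on the random draw, only on p and the size of the level.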

Value

A tibble with columns

  • train : an object of class modelr::resample. The training data.

  • val : an object of class modelr::resample. The validation data (i.e. the rows of data not selected for the training data).

  • repet : integer, the repetition number.
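
A modelr::resample object stores a reference to the original data together with row indices, rather than a copy of the rows. The sketch below builds one by hand to show how such an object can be inspected; it assumes the modelr package is installed (it is skipped otherwise) and does not call resample_split() itself.

```r
# Assumes the modelr package is installed; the block is skipped otherwise.
if (requireNamespace("modelr", quietly = TRUE)) {
  # Build a resample object by hand: a pointer to rows 1 to 5 of mtcars
  res <- modelr::resample(mtcars, 1:5)
  # as.integer() returns the row indices,
  # as.data.frame() materialises the corresponding rows
  idx      <- as.integer(res)
  train_df <- as.data.frame(res)
  print(dim(train_df))
}
```

The same conversions should apply to the elements of the train and val columns, which is why the examples convert each element to a data.frame before counting rows.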

Examples

# 5 repetitions of a 70/30 train-val split
resample_split(mtcars, p=0.7, n=5)

# stratify train-val by gear
rs  <- resample_split(mtcars, p=0.5, n=10)
rss <- resample_split(mtcars, gear, p=0.5, n=10)
sapply(rs$train, function(x) {sum(as.data.frame(x)$gear==4)})
# = without stratification, the number of gear==4 rows in the training set varies between repetitions
sapply(rss$train, function(x) {sum(as.data.frame(x)$gear==4)})
# = with stratification, the number of gear==4 rows in the training set is stable across repetitions

jiho/joml documentation built on Dec. 6, 2023, 5:50 a.m.