data_partition: Data partitioning function adapted from the caret package.


View source: R/data.r

Description

data_partition creates a test/training partition.

Usage

data_partition(y, p = 0.5, groups = min(5, length(y)))

Arguments

y

a vector of outcomes.

p

the proportion of data (a value between 0 and 1) that goes to training

groups

for numeric y, the number of breaks in the quantiles (see Details)

Details

The random sampling is done within the levels of y when y is a factor in an attempt to balance the class distributions within the splits.

For numeric y, the sample is split into sections based on percentiles, and sampling is done within these subgroups. The number of percentile breaks is set via the groups argument.

Also, for very small class sizes (<= 3), the classes may not show up in both the training and test data.
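
The stratified sampling described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's actual code: the name sketch_partition is hypothetical, groups is read as the number of percentile break points (so groups - 1 bins), and each stratum contributes ceiling(p * n) rows.

```r
# Hypothetical helper, not the rchallenge implementation: returns row
# positions for the training set, sampled within strata of y.
sketch_partition <- function(y, p = 0.5, groups = min(5, length(y))) {
  if (is.numeric(y)) {
    # Assumption: 'groups' counts break points, so at least 2 are needed.
    groups <- max(2, groups)
    # unique() guards against duplicated breaks when y has many ties.
    breaks <- unique(quantile(y, probs = seq(0, 1, length.out = groups)))
    if (length(breaks) < 2) {
      # All values identical: treat the whole sample as one stratum.
      strata <- factor(rep(1, length(y)))
    } else {
      strata <- cut(y, breaks = breaks, include.lowest = TRUE)
    }
  } else {
    # Factor (or character) outcomes: one stratum per class.
    strata <- factor(y)
  }
  # Sample ceiling(p * n) positions inside each non-empty stratum.
  picks <- lapply(split(seq_along(y), strata, drop = TRUE), function(pos) {
    pos[sample.int(length(pos), ceiling(p * length(pos)))]
  })
  sort(unlist(picks, use.names = FALSE))
}

set.seed(1)
idx_class <- sketch_partition(iris$Species, p = 0.5)      # factor outcome
idx_num   <- sketch_partition(iris$Sepal.Length, p = 0.5) # numeric outcome
table(iris$Species[idx_class])
```

Because the sample size is rounded up within each stratum, the numeric case may return slightly more than p * length(y) positions.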

Value

A vector of row position integers corresponding to the training data
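
A short sketch of how such an index vector is typically used to split a data frame. To keep it self-contained, the indices are simulated with base R rather than obtained from data_partition; train_idx is a stand-in name, and the iris data set is chosen only for illustration.

```r
# Stand-in for the value of data_partition(iris$Species, p = 0.5);
# simulated here so the example runs without the rchallenge package.
set.seed(42)
train_idx <- sort(sample(nrow(iris), 75))

train <- iris[train_idx, ]   # rows selected for training
test  <- iris[-train_idx, ]  # all remaining rows form the test set

nrow(train)
nrow(test)
```

Negative indexing keeps every row not listed in the training positions, so the two sets are disjoint and together cover the full data.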

Author(s)

Adapted from the createDataPartition function by Max Kuhn.

References

http://caret.r-forge.r-project.org/splitting.html


adrtod/rchallenge documentation built on March 23, 2021, 7:43 a.m.