rSplit | R Documentation |
Random split sampling, stratified based on the type of the response.
rSplit(y, nsplit, stratify = TRUE, s_ratio = 0.8, ...)
y |
the response: a double vector, a logical vector, a factor, or a Surv object |
nsplit |
positive integer scalar, number of replicates of random splits to be performed |
stratify |
logical scalar, whether to stratify the split based on the response |
s_ratio |
double scalar between 0 and 1, split ratio, i.e., proportion of training subjects |
... |
additional parameters, currently not in use |
Function rSplit performs random split sampling, with or without stratification. Below, p denotes the split ratio s_ratio. Specifically,
If stratify = FALSE, or if the response y is a double vector, then split the sample into a training and a test set by odds p/(1-p), without stratification.
Otherwise, if y is a Surv object, split it stratified by its censoring status. Specifically, split the subjects with an observed event into a training and a test set by odds p/(1-p), and split the censored subjects into a training and a test set by odds p/(1-p). Then combine the two training sets, and combine the two test sets.
Otherwise, if y is a logical vector, split it stratified by its own values. Specifically, split the subjects with TRUE response into a training and a test set by odds p/(1-p), and split the subjects with FALSE response into a training and a test set by odds p/(1-p). Then combine the two training sets, and combine the two test sets.
Otherwise, if y is a factor, split it stratified by its levels. Specifically, split the subjects in each level of y into a training and a test set by odds p/(1-p). Then combine the training sets, and the test sets, from all levels of y.
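The cases above can be sketched in base R as follows. This is a minimal illustration and not the package's implementation: the helper names split_once and split_stratified are hypothetical, and rounding the training size to round(n * p) is an assumption, since the documentation does not state how fractional sizes are handled.

```r
# Hypothetical helpers for illustration; not the package's implementation.

# Split n subjects without stratification: TRUE = training, FALSE = test.
split_once <- function(n, p = 0.8) {
  id <- logical(n)                                 # start with all FALSE (test)
  id[sample.int(n, size = round(n * p))] <- TRUE   # rounding rule is an assumption
  id
}

# Split within each stratum, then combine into one logical vector.
split_stratified <- function(strata, p = 0.8) {
  id <- logical(length(strata))
  for (idx in split(seq_along(strata), strata)) {
    id[idx] <- split_once(length(idx), p)
  }
  id
}

set.seed(1)
y <- rep(c(TRUE, FALSE), times = c(20, 30))
id <- split_stratified(y)
table(response = y, training = id)
```

With p = 0.8 the odds p/(1-p) are 4 to 1, so each stratum contributes about four training subjects per test subject. For a factor response the strata would be y itself; for a right-censored Surv response one might pass the censoring status, e.g. unclass(y)[, "status"], which is likewise an assumption about how the stratum is extracted.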
Function rSplit returns a length-nsplit list of logical vectors. In each logical vector, the TRUE elements indicate training subjects and the FALSE elements indicate test subjects.
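A list of this shape can be used to subset a data set once per split. The sketch below builds such a list by hand rather than calling rSplit, so the objects dat and splits are hypothetical and serve only to show the indexing pattern.

```r
# 'splits' mimics the documented return shape (a list of logical vectors);
# it is constructed by hand here, not by calling rSplit.
set.seed(1)
dat <- data.frame(x = rnorm(10), y = rep(c(TRUE, FALSE), each = 5))
splits <- replicate(
  3,
  sample(rep(c(TRUE, FALSE), times = c(8, 2))),
  simplify = FALSE
)

# One training set and one test set per split
train <- lapply(splits, function(id) dat[id, , drop = FALSE])
test  <- lapply(splits, function(id) dat[!id, , drop = FALSE])

vapply(train, nrow, integer(1))
vapply(test, nrow, integer(1))
```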
caTools::sample.split is not what we need.
split, caret::createDataPartition
rSplit(y = rep(c(TRUE, FALSE), times = c(20, 30)), nsplit = 3L)