| rSplit | R Documentation |
Random split sampling, stratified based on the type of the response.
rSplit(y, nsplit, stratify = TRUE, s_ratio = 0.8, ...)
y | response: a double vector, a logical vector, a factor, or a Surv object |
nsplit | positive integer scalar, number of replicates of random splits to be performed |
stratify | logical scalar, whether to stratify based on the response |
s_ratio | double scalar between 0 and 1, split ratio, i.e., proportion of training subjects |
... | additional parameters, currently not in use |
Function rSplit performs random split sampling, with or without stratification. Specifically,
If stratify = FALSE,
or if we have a double response y,
then split the sample into a training and a test set by odds p/(1-p), where p is the split ratio s_ratio, without stratification.
Otherwise, split a Surv response y, stratified by its censoring status.
Specifically,
split subjects with observed event into a training and a test set by odds p/(1-p),
and split the censored subjects into a training and a test set by odds p/(1-p).
Then combine the training sets from subjects with observed events and censored subjects,
and combine the test sets from subjects with observed events and censored subjects.
Otherwise, split a logical response y, stratified by itself.
Specifically,
split the subjects with TRUE response into a training and a test set by odds p/(1-p),
and split the subjects with FALSE response into a training and a test set by odds p/(1-p).
Then combine the training sets, and the test sets, in a similar fashion as described above.
Otherwise, split a factor response y, stratified by its levels.
Specifically,
split the subjects in each level of y into a training and a test set by odds p/(1-p).
Then combine the training sets, and the test sets, from all levels of y.
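The stratified cases above share one pattern: split each stratum separately, then combine the pieces. The following is a minimal base-R sketch of that pattern, not the implementation used by rSplit. The helper name stratified_split_sketch is hypothetical, and rounding the per-stratum training size with round() is an assumption; the actual function may round differently. For a Surv response, the censoring status would play the role of strata.

```r
# Hypothetical helper, not part of the package: one stratified random split.
# `strata` is the grouping variable (a logical response, a factor response,
# or the censoring status of a Surv response); `p` is the training proportion.
stratified_split_sketch <- function(strata, p = 0.8) {
  strata <- as.factor(strata)
  is_train <- logical(length(strata))  # all FALSE, i.e., all test subjects
  # Loop over the subject indices of each stratum
  for (idx in split(seq_along(strata), strata)) {
    n_train <- round(p * length(idx))  # assumption: round to nearest integer
    # Index via sample.int() to avoid the sample(x) pitfall when length(idx) == 1
    train_idx <- idx[sample.int(length(idx), size = n_train)]
    is_train[train_idx] <- TRUE
  }
  is_train
}

set.seed(1)
y <- rep(c(TRUE, FALSE), times = c(20, 30))
is_train <- stratified_split_sketch(y, p = 0.8)
table(y, is_train)  # cross-tabulate response against training membership
```

With p = 0.8, each stratum contributes about 80% of its subjects to the training set, so the training and test sets keep roughly the same response distribution as the full sample.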
Function rSplit returns a length-nsplit list of
logical vectors.
In each logical vector,
the TRUE elements indicate training subjects and
the FALSE elements indicate test subjects.
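A returned list of this shape can be used to subset the response (or the rows of a data set) in each replicate. The sketch below builds a stand-in list by hand, without stratification, so that it runs without the package; in practice the list would come from a call to rSplit. The object names splits, y_train and y_test are illustrative only.

```r
# Stand-in for the value of rSplit(y, nsplit = 3L): a list of logical vectors.
# Built by hand here (unstratified) so the sketch runs without the package.
set.seed(42)
y <- rep(c(TRUE, FALSE), times = c(20, 30))
n <- length(y)
splits <- replicate(3L, {
  is_train <- logical(n)
  is_train[sample.int(n, size = round(0.8 * n))] <- TRUE
  is_train
}, simplify = FALSE)

# TRUE elements select training subjects, FALSE elements select test subjects
for (is_train in splits) {
  y_train <- y[is_train]
  y_test <- y[!is_train]
  cat("training:", length(y_train), " test:", length(y_test), "\n")
}
```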
caTools::sample.split is not what we need.
split, caret::createDataPartition
rSplit(y = rep(c(TRUE, FALSE), times = c(20, 30)), nsplit = 3L)