View source: R/data_splitting.R
create_data_partition | R Documentation |
create_data_partition creates one or more random partitions of the data into training and test sets. create_kfolds splits the data into k folds (or groups) with approximately the same number of observations. create_bootstrap draws bootstrap samples.
create_data_partition(y, trt = NULL, p = 0.5, times = 1, groups = 5, replace = FALSE)
create_kfolds(y, trt = NULL, k = 10, times = 1, groups = 5)
create_bootstrap(y, times = 10)
y |
An atomic vector. |
trt |
An optional treatment variable. |
p |
The proportion of training observations. |
times |
The number of partitions to create. |
groups |
For numeric y, the number of breaks in the quantiles. |
replace |
Should sampling be done with replacement? |
k |
The number of folds. |
If y is a factor, sampling is done within the levels of y in an attempt to balance the class distributions between the partitions. If y is numeric, groups are first created based on the quantiles of its distribution, and sampling is then done within these groups.
If trt is supplied, the data partitions are stratified by the treatment variable.
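The stratification described above can be sketched in base R. This is a rough illustration only, not the package's implementation: the helper name stratified_partition is hypothetical, and the way groups is mapped onto quantile break points (groups + 1 break points, giving groups bins) is an assumption.

```r
# Hypothetical helper illustrating stratified sampling; NOT the package's code.
stratified_partition <- function(y, trt = NULL, p = 0.5, groups = 5) {
  if (is.numeric(y)) {
    # Numeric y: bin the values into quantile-based groups (assumed mapping).
    breaks <- unique(quantile(y, probs = seq(0, 1, length.out = groups + 1)))
    strata <- cut(y, breaks = breaks, include.lowest = TRUE)
  } else {
    # Factor (or other categorical) y: the levels themselves are the strata.
    strata <- factor(y)
  }
  # When a treatment variable is supplied, cross it with the strata.
  if (!is.null(trt)) strata <- interaction(strata, trt, drop = TRUE)
  # Draw a proportion p of the row positions inside each stratum.
  idx_by_stratum <- split(seq_along(y), strata)
  picked <- lapply(idx_by_stratum, function(idx) {
    idx[sample.int(length(idx), size = ceiling(p * length(idx)))]
  })
  sort(unlist(picked, use.names = FALSE))
}

set.seed(545)
y <- factor(sample(c(0, 1), 1000, replace = TRUE))
trt <- factor(sample(c(0, 1), 1000, replace = TRUE))
train <- stratified_partition(y, trt, p = 0.5)
# Each response-by-treatment cell contributes about half of its rows.
table(y[train], trt[train])
```

Sampling within each response-by-treatment cell keeps the cell proportions in the training set close to those in the full data, which is the balance the text refers to.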
Note that, in addition to create_bootstrap, bootstrap samples can also be created using create_data_partition with p = 1 and replace = TRUE.
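To see why p = 1 with replace = TRUE amounts to a bootstrap, note that a bootstrap sample is simply n row positions drawn with replacement from the n available rows. The base R sketch below illustrates this without the package and ignores any stratification; the one-column-per-sample layout is an assumption made for the illustration, not a statement about the package's output.

```r
set.seed(1)
x <- rnorm(100)
n <- length(x)
times <- 10

# Draw n row positions with replacement, once per bootstrap sample.
# replicate() binds the draws into an n x times matrix (assumed layout).
boot_idx <- replicate(times, sample.int(n, size = n, replace = TRUE))
dim(boot_idx)

# Rows never drawn in the first sample form its out-of-bag set.
oob <- setdiff(seq_len(n), boot_idx[, 1])

# Bootstrap replicate of the mean for each sample.
boot_means <- apply(boot_idx, 2, function(idx) mean(x[idx]))
```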
create_data_partition and create_bootstrap return a matrix of row position integers corresponding to the training set and to the bootstrap sample, respectively. create_kfolds returns a matrix with the row integers corresponding to the folds.
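As a rough illustration of how k folds of approximately equal size can be formed and then used, the base R sketch below assigns each row a fold label. It is unstratified and does not reproduce the matrix layout returned by create_kfolds; the variable names are hypothetical.

```r
set.seed(545)
n <- 103
k <- 10

# Recycle the labels 1..k over n rows, then shuffle them, so that
# fold sizes differ by at most one observation.
fold_id <- sample(rep(seq_len(k), length.out = n))
fold_sizes <- table(fold_id)

# Use fold 1 as the held-out set and the remaining folds for training.
test_rows <- which(fold_id == 1)
train_rows <- which(fold_id != 1)
```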
Leo Guelman leo.guelman@gmail.com
set.seed(545)
r <- factor(sample(c(0,1), 1000, replace = TRUE))
t <- factor(sample(c(0,1), 1000, replace = TRUE))
df <- data.frame(r, t)
trainIndex <- create_data_partition(df$r, df$t)
dfTrain <- df[trainIndex, ]
dfTest <- df[-trainIndex, ]
table(df$r, df$t)
table(dfTrain$r, dfTrain$t)
table(dfTest$r, dfTest$t)

# Create k-folds
head(create_kfolds(r, t, times = 5))

# Create 10 bootstrap samples
set.seed(1)
x <- rnorm(100)
xb <- create_bootstrap(x, times = 10)