holdout_frac splits the data so that a proportion size of the elements is in the test set and 1 - size is in the training set. Likewise, holdout_n splits the data so that size elements are in the test set and the remainder are in the training set.
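The difference between the two sizing rules can be sketched in base R. This is only an illustration of the idea, not the package's implementation: the use of round() for the fractional case and all variable names are assumptions made for this sketch.

```r
# Base-R sketch of fraction-based vs. count-based holdout.
# Not the package's code; the rounding rule below is an assumption.
set.seed(1)
n <- nrow(mtcars)

# Fraction-based: roughly size * n rows go to the test set.
size_frac <- 0.3
n_test_frac <- round(size_frac * n)
test_idx_frac <- sample.int(n, n_test_frac)
train_idx_frac <- setdiff(seq_len(n), test_idx_frac)

# Count-based: exactly size rows go to the test set.
size_n <- 3L
test_idx_n <- sample.int(n, size_n)
train_idx_n <- setdiff(seq_len(n), test_idx_n)

# Sizes of the resulting test and training sets.
c(test = length(test_idx_frac), train = length(train_idx_frac))
c(test = length(test_idx_n), train = length(train_idx_n))
```

In both cases every row lands in exactly one of the two sets; only the way the test set size is specified differs.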
holdout_frac(data, ...)
## S3 method for class 'data.frame'
holdout_frac(data, size = 0.3, K = 1L,
shuffle = TRUE, prob = NULL, ...)
## S3 method for class 'grouped_df'
holdout_frac(data, size = 0.3, K = 1L,
shuffle = TRUE, stratify = FALSE, prob = NULL, ...)
holdout_n(data, ...)
## S3 method for class 'data.frame'
holdout_n(data, size = 1L, K = 1L, shuffle = TRUE,
prob = NULL, ...)
## S3 method for class 'grouped_df'
holdout_n(data, size = 1L, K = 1L, shuffle = TRUE,
stratify = FALSE, prob = NULL, ...)
data: A data frame
...: Arguments passed to methods.
size: For
K: Number of test/train splits to generate.
shuffle: If
prob: Probability weight that an element is in the
stratify: If
A data frame with K rows and the following columns:
A list of resample objects. Training sets.
An integer vector of identifiers.
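As a rough picture of that return shape, the sketch below builds a data frame with one row per split using base R. It stores plain row-index vectors in list-columns, whereas the package stores resample objects; the column name id is hypothetical, and train and test are taken from how the examples access the result.

```r
# Base-R sketch of a K-row result with list-columns (illustrative only).
set.seed(1)
K <- 5L
n <- nrow(mtcars)

# One row per split; `id` is a hypothetical column name for this sketch.
splits <- data.frame(id = seq_len(K))

# Each cell holds the row indices of one test or training set.
splits$test <- lapply(seq_len(K), function(i) sample.int(n, 3L))
splits$train <- lapply(splits$test, function(idx) setdiff(seq_len(n), idx))

nrow(splits)
lengths(splits$train)
```

Keeping each split in its own row is what lets the examples map over the train and test columns in parallel.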
data.frame: Splits the rows of a data frame into test and training sets.
grouped_df: If stratify = TRUE, splits within each group of a grouped data frame into test and training sets. This ensures that the test and training sets have approximately equal proportions of each group. If stratify = FALSE, the groups themselves are split into test and training sets.
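The two behaviours for grouped data can be illustrated in base R. This is a sketch of the concepts only, not the package's implementation; the grouping by cyl and the per-group count of 2 are arbitrary choices for the illustration.

```r
# Base-R sketch: holding out whole groups vs. holding out rows within groups.
set.seed(1)

# Row indices of mtcars, split by number of cylinders (4, 6, 8).
groups <- split(seq_len(nrow(mtcars)), mtcars$cyl)

# Group-level holdout: one whole group goes to the test set.
test_groups <- sample(names(groups), 1)
test_idx_groups <- unlist(groups[test_groups], use.names = FALSE)

# Stratified holdout: two rows from every group go to the test set.
# Indexing via sample.int() avoids sample()'s special case for length-1 input.
test_idx_strat <- unlist(
  lapply(groups, function(idx) idx[sample.int(length(idx), 2L)]),
  use.names = FALSE
)

table(mtcars$cyl[test_idx_groups])
table(mtcars$cyl[test_idx_strat])
```

The group-level test set contains rows from a single cyl value, while the stratified test set draws the same number of rows from each one.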
This function is similar to the modelr function crossv_mc, but with more features.
# Example originally from modelr::crossv_mc
library("purrr")
library("dplyr")
# holdout three obs, repeat 10 times
cv1 <- holdout_n(mtcars, size = 3, K = 10)
models <- map(cv1$train, ~ lm(mpg ~ wt, data = .))
summary(map2_dbl(models, cv1$test, modelr::rmse))
# holdout two groups at a time in the test set
# repeat four times.
cv2 <- holdout_n(group_by(mtcars, cyl), size = 2, K = 4)
models <- map(cv2$train, ~ lm(mpg ~ wt, data = .))
summary(map2_dbl(models, cv2$test, modelr::rmse))
# stratified holdout
# holdout 1 obs each from each group. repeat 5 times.
cv3 <- holdout_n(group_by(mtcars, am), size = 1, K = 5, stratify = TRUE)
models <- map(cv3$train, ~ lm(mpg ~ wt, data = .))
summary(map2_dbl(models, cv3$test, modelr::rmse))
# Holdout fraction of the data
# holdout 30% of observations, repeat 10 times
cv4 <- holdout_frac(mtcars, size = 0.3, K = 10)
models <- map(cv4$train, ~ lm(mpg ~ wt, data = .))
summary(map2_dbl(models, cv4$test, modelr::rmse))
# holdout 30% of groups at a time in the test set
cv5 <- holdout_frac(group_by(mtcars, cyl), size = 0.3, K = 10)
models <- map(cv5$train, ~ lm(mpg ~ wt, data = .))
summary(map2_dbl(models, cv5$test, modelr::rmse))
# stratified holdout
# holdout 30% of obs within each group.
cv6 <- holdout_frac(group_by(mtcars, am), size = 0.3, K = 10, stratify = TRUE)
models <- map(cv6$train, ~ lm(mpg ~ wt, data = .))
summary(map2_dbl(models, cv6$test, modelr::rmse))