nested_cv | R Documentation |
nested_cv
can be used to take the results of one resampling procedure
and conduct further resamples within each split. Any type of resampling
used in rsample
can be used.
nested_cv(data, outside, inside)
data |
A data frame. |
outside |
The initial resampling specification. This can be an already
created object or an expression of a new object (see the examples below).
If the latter is used, the |
inside |
An expression for the type of resampling to be conducted within the initial procedure. |
It is a bad idea to use bootstrapping as the outer resampling procedure (see the example below)
An tibble with nested_cv
class and any other classes that
outer resampling process normally contains. The results include a
column for the outer data split objects, one or more id
columns,
and a column of nested tibbles called inner_resamples
with the
additional resamples.
## Using expressions for the resampling procedures:
nested_cv(mtcars, outside = vfold_cv(v = 3), inside = bootstraps(times = 5))
## Using an existing object:
folds <- vfold_cv(mtcars)
nested_cv(mtcars, folds, inside = bootstraps(times = 5))
## The dangers of outer bootstraps:
set.seed(2222)
bad_idea <- nested_cv(mtcars,
outside = bootstraps(times = 5),
inside = vfold_cv(v = 3)
)
first_outer_split <- bad_idea$splits[[1]]
outer_analysis <- as.data.frame(first_outer_split)
sum(grepl("Volvo 142E", rownames(outer_analysis)))
## For the 3-fold CV used inside of each bootstrap, how are the replicated
## `Volvo 142E` data partitioned?
first_inner_split <- bad_idea$inner_resamples[[1]]$splits[[1]]
inner_analysis <- as.data.frame(first_inner_split)
inner_assess <- as.data.frame(first_inner_split, data = "assessment")
sum(grepl("Volvo 142E", rownames(inner_analysis)))
sum(grepl("Volvo 142E", rownames(inner_assess)))
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.