tidy.rsplit: Tidy Resampling Object
In topepo/rsample: General Resampling Infrastructure

Description Usage Arguments Details Value Examples

The 'tidy' function from the broom package can be used on 'rset' and 'rsplit' objects to generate tibbles with which rows are in the analysis and assessment sets.

## S3 method for class 'rsplit'
tidy(x, unique_ind = TRUE, ...)

## S3 method for class 'rset'
tidy(x, ...)

## S3 method for class 'vfold_cv'
tidy(x, ...)

## S3 method for class 'nested_cv'
tidy(x, ...)

`x`	A 'rset' or 'rsplit' object
`unique_ind`	Should unique row identifiers be returned? For example, if 'FALSE' then bootstrapping results will include multiple rows in the sample for the same row in the original data.
`...`	Not currently used.

Note that for nested resampling, the rows of the inner resample, named 'inner_Row', are *relative* row indices and do not correspond to the rows in the original data set.

A tibble with columns 'Row' and 'Data'. The latter has possible values "Analysis" or "Assessment". For 'rset' inputs, identification columns are also returned but their names and values depend on the type of resampling. 'vfold_cv' contains a column "Fold" and, if repeats are used, another called "Repeats". 'bootstraps' and 'mc_cv' use the column "Resample".

library(ggplot2)
theme_set(theme_bw())

set.seed(4121)
cv <- tidy(vfold_cv(mtcars, v = 5))
ggplot(cv, aes(x = Fold, y = Row, fill = Data)) + 
  geom_tile() + scale_fill_brewer()
  
set.seed(4121)
rcv <- tidy(vfold_cv(mtcars, v = 5, repeats = 2))
ggplot(rcv, aes(x = Fold, y = Row, fill = Data)) + 
  geom_tile() + facet_wrap(~Repeat) + scale_fill_brewer()
  
set.seed(4121)
mccv <- tidy(mc_cv(mtcars, times = 5))
ggplot(mccv, aes(x = Resample, y = Row, fill = Data)) + 
  geom_tile() + scale_fill_brewer() 
  
set.seed(4121)
bt <- tidy(bootstraps(mtcars, time = 5))
ggplot(bt, aes(x = Resample, y = Row, fill = Data)) + 
  geom_tile() + scale_fill_brewer()
  
dat <- data.frame(day = 1:30)
# Resample by week instead of day
ts_cv <- rolling_origin(dat, initial = 7, assess = 7, 
                        skip = 6, cumulative = FALSE)
ts_cv <- tidy(ts_cv)
ggplot(ts_cv, aes(x = Resample, y = factor(Row), fill = Data)) +
  geom_tile() + scale_fill_brewer()