tidy.rsplit: Tidy Resampling Object

Description

The 'tidy' function from the broom package can be used on 'rset' and 'rsplit' objects to generate tibbles that indicate which rows are in the analysis and assessment sets.

Usage

## S3 method for class 'rsplit'
tidy(x, unique_ind = TRUE, ...)

## S3 method for class 'rset'
tidy(x, ...)

## S3 method for class 'vfold_cv'
tidy(x, ...)

## S3 method for class 'nested_cv'
tidy(x, ...)

Arguments

x

An 'rset' or 'rsplit' object.

unique_ind

Should unique row identifiers be returned? For example, if 'FALSE', bootstrapping results will list an original row once for every time it was drawn into the sample, so the same row can appear multiple times.

...

Not currently used.
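
As a minimal sketch of how 'unique_ind' changes the output, the code below tidies a single bootstrap split both ways. It assumes the rsample package is attached; the object names ('bt_split', 'uniq', 'dups') are illustrative only and not part of the package.

```r
library(rsample)

set.seed(4121)
# Take the first split from a small set of bootstrap resamples
bt_split <- bootstraps(mtcars, times = 3)$splits[[1]]

# Default (unique_ind = TRUE): each analysis row is listed once
uniq <- tidy(bt_split)

# unique_ind = FALSE: an analysis row is listed once per draw
dups <- tidy(bt_split, unique_ind = FALSE)

# Compare the analysis/assessment row counts of the two results
table(uniq$Data)
table(dups$Data)

# Original rows that were drawn more than once in this bootstrap sample
analysis_rows <- dups$Row[dups$Data == "Analysis"]
unique(analysis_rows[duplicated(analysis_rows)])
```

With the default, the result has one line per row of the original data; with 'unique_ind = FALSE', the analysis portion has as many lines as the bootstrap sample itself.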

Details

Note that for nested resampling, the rows of the inner resample, named 'inner_Row', are *relative* row indices and do not correspond to the rows in the original data set.
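
The following sketch illustrates why the inner row indices are relative. Rather than tidying the whole nested object, it tidies one inner split directly and maps its rows back to the original data through the outer split. The column name 'Original_Row' and the object names are hypothetical, introduced here for illustration; this is one possible way to recover the original rows, not a documented feature of 'tidy'.

```r
library(rsample)

set.seed(4121)
nested <- nested_cv(mtcars,
                    outside = vfold_cv(v = 3),
                    inside = bootstraps(times = 2))

outer_split <- nested$splits[[1]]
inner_split <- nested$inner_resamples[[1]]$splits[[1]]

# Row numbers (in mtcars) that make up the outer analysis set
outer_rows <- as.integer(outer_split, data = "analysis")

# 'Row' here indexes into the outer analysis set, not into mtcars
inner_tidy <- tidy(inner_split)

# Translate the relative indices back to rows of mtcars
inner_tidy$Original_Row <- outer_rows[inner_tidy$Row]
head(inner_tidy)
```

Because the inner resamples are built from the outer analysis set, the largest relative index cannot exceed the size of that set.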

Value

A tibble with columns 'Row' and 'Data'. The latter has possible values "Analysis" or "Assessment". For 'rset' inputs, identification columns are also returned, but their names and values depend on the type of resampling. 'vfold_cv' results contain a column "Fold" and, if repeats are used, another called "Repeat". 'bootstraps' and 'mc_cv' results use the column "Resample".
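
Since the identification columns differ by resampling type, a quick check of the column names can help before plotting. The sketch below assumes the rsample package is attached; 'id_cols' is a hypothetical helper defined only for this illustration, and the exact names printed may depend on the installed version.

```r
library(rsample)

set.seed(4121)
cv_names   <- names(tidy(vfold_cv(mtcars, v = 5)))
rcv_names  <- names(tidy(vfold_cv(mtcars, v = 5, repeats = 2)))
mccv_names <- names(tidy(mc_cv(mtcars, times = 5)))
bt_names   <- names(tidy(bootstraps(mtcars, times = 5)))

# Identification columns are whatever remains besides 'Row' and 'Data'
id_cols <- function(nms) setdiff(nms, c("Row", "Data"))

id_cols(cv_names)
id_cols(rcv_names)
id_cols(mccv_names)
id_cols(bt_names)
```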

Examples

library(rsample)
library(ggplot2)
theme_set(theme_bw())

set.seed(4121)
cv <- tidy(vfold_cv(mtcars, v = 5))
ggplot(cv, aes(x = Fold, y = Row, fill = Data)) + 
  geom_tile() + scale_fill_brewer()
  
set.seed(4121)
rcv <- tidy(vfold_cv(mtcars, v = 5, repeats = 2))
ggplot(rcv, aes(x = Fold, y = Row, fill = Data)) + 
  geom_tile() + facet_wrap(~Repeat) + scale_fill_brewer()
  
set.seed(4121)
mccv <- tidy(mc_cv(mtcars, times = 5))
ggplot(mccv, aes(x = Resample, y = Row, fill = Data)) + 
  geom_tile() + scale_fill_brewer() 
  
set.seed(4121)
bt <- tidy(bootstraps(mtcars, times = 5))
ggplot(bt, aes(x = Resample, y = Row, fill = Data)) + 
  geom_tile() + scale_fill_brewer()
  
dat <- data.frame(day = 1:30)
# Resample by week instead of day
ts_cv <- rolling_origin(dat, initial = 7, assess = 7, 
                        skip = 6, cumulative = FALSE)
ts_cv <- tidy(ts_cv)
ggplot(ts_cv, aes(x = Resample, y = factor(Row), fill = Data)) +
  geom_tile() + scale_fill_brewer()

topepo/rsample documentation built on May 4, 2019, 4:25 p.m.