samplify: Create a resampled tibble

View source: R/samplify.R

Description

samplify() creates a resampled tibble with virtual groups.

Usage

samplify(data, times, size, ..., replace = FALSE, key = ".sample")

Arguments

data

A tbl.

times

A single integer specifying the number of resamples. If the tibble is grouped, this is the number of resamples per group.

size

A single integer specifying the size of each resample. For a grouped data frame, this can also be an integer vector with length equal to the number of groups in data, giving one size per group. This is helpful when sampling without replacement from groups whose row counts differ widely, because no resample can be larger than its group.

...

Not used.

replace

Whether to sample with replacement. Defaults to FALSE.

key

A single character string specifying the name of the virtual group that is added. Defaults to ".sample".
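
To make the interaction of times, size, and replace concrete, the following base-R sketch draws row indices per group. It only illustrates the semantics described above and is not strapgod's implementation; the helper sample_rows_by_group() is a hypothetical name introduced for this example.

```r
# Hypothetical helper: draw `times` resamples of row indices per group.
# `size` is either one integer or one integer per group.
sample_rows_by_group <- function(groups, times, size, replace = FALSE) {
  rows_by_group <- split(seq_along(groups), groups, drop = TRUE)
  n_groups <- length(rows_by_group)
  if (length(size) == 1L) size <- rep(size, n_groups)
  stopifnot(length(size) == n_groups)

  Map(function(rows, n) {
    if (!replace && n > length(rows)) {
      stop("`size` exceeds the number of rows in a group.")
    }
    # Index into `rows` so a one-row group is not expanded to 1:rows
    lapply(seq_len(times), function(i) {
      rows[sample.int(length(rows), n, replace = replace)]
    })
  }, rows_by_group, size)
}

groups <- iris$Species[1:55]  # 50 setosa rows, 5 versicolor rows
idx <- sample_rows_by_group(groups, times = 2, size = c(10, 5))
lengths(idx$setosa)      # two resamples of 10 rows each
lengths(idx$versicolor)  # two resamples of 5 rows each
```

With a single size of 10 and replace = FALSE, the same call stops with an error, because the second group has only 5 rows.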

Details

The following functions have special / interesting behavior when used with a resampled_df:

Value

A resampled_df with an extra group specified by the key.
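
One way to picture a "virtual" group is to keep only the row indices of each resample and copy the rows when the result is materialized, for example by collect(). The base-R sketch below illustrates that idea under this assumption; virtual and materialize() are hypothetical names and are not part of strapgod.

```r
set.seed(123)

# Three resamples of 5 rows each, stored only as row indices
virtual <- replicate(3, sample.int(nrow(iris), 5), simplify = FALSE)

# Hypothetical helper: copy the rows and label each resample with `key`
materialize <- function(data, index_list, key = ".sample") {
  pieces <- Map(function(rows, id) {
    out <- data[rows, , drop = FALSE]
    out[[key]] <- id
    out
  }, index_list, seq_along(index_list))
  do.call(rbind, pieces)
}

samps <- materialize(iris, virtual, key = ".samps")
nrow(samps)          # 3 resamples * 5 rows = 15
table(samps$.samps)  # 5 rows in each resample
```

Until materialize() is called, only 15 integers are held in memory instead of 15 copied rows, which is the saving the index-based picture is meant to convey.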

See Also

collect.resampled_df()

Other virtual samplers: bootstrapify

Examples

library(strapgod)
library(dplyr)
library(broom)

samplify(iris, times = 3, size = 20)

iris %>%
  samplify(times = 3, size = 20) %>%
  summarise(per_strap_mean = mean(Petal.Width))

iris %>%
  group_by(Species) %>%
  samplify(times = 3, size = 20) %>%
  summarise(per_strap_species_mean = mean(Petal.Width))

# Alter the name of the group with `key`
# Materialize them with collect()
samps <- samplify(iris, times = 3, size = 5, key = ".samps")
collect(samps)

collect(samps, id = ".id", original_id = ".orig_id")

#----------------------------------------------------------------------------

# Be careful not to specify a `size` larger
# than one of your groups! This will throw an error.

iris_group_sizes_of_50_and_5 <- iris[1:55,] %>%
  group_by(Species) %>%
  group_trim()

count(iris_group_sizes_of_50_and_5, Species)

# size = 10 > min_group_size = 5
## Not run: 
iris_group_sizes_of_50_and_5 %>%
  samplify(times = 2, size = 10)

## End(Not run)

# Instead, pass a vector of sizes to `samplify()` if this
# structure is absolutely required for your use case.

# size of 10 for the first group
# size of 5 for the second group
# total number of rows is 10 * 2 + 5 * 2 = 30
iris_group_sizes_of_50_and_5 %>%
  samplify(times = 2, size = c(10, 5)) %>%
  collect()

strapgod documentation built on Sept. 20, 2019, 9:04 a.m.