perms: Permutation Resampling


View source: R/perm-test-funs.R

Description

Generates permuted datasets in which any number of columns can be shuffled. Stratified (i.e. group-based) shuffling is achieved by supplying a column name to the strata argument. See Details for a more complete description and guidance on usage.

Usage

perms(
  data = NULL,
  ...,
  strata = NULL,
  times = 25,
  apparent = FALSE,
  seed = NULL
)

Arguments

data

A data frame.

...

Column names in data to permute (shuffle), or one of the select_helpers.

strata

A discrete variable for stratified permutations.

times

Number of permutations.

apparent

A logical. Should a copy of the input data be returned?

seed

A numeric value used to set the RNG seed for reproducible permutations.
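
To make the strata argument concrete, the following is a minimal base R sketch of what stratified shuffling means: values are permuted only among rows that share the same stratum. The helper shuffle_within is hypothetical and written for illustration; it is not part of this package and does not reproduce the internals of perms.

```r
# Hypothetical helper (not part of the package): shuffle one column,
# optionally restricting the shuffle to rows within the same stratum.
shuffle_within <- function(data, col, strata = NULL) {
  # Index-based shuffle avoids the sample(x) pitfall when length(x) == 1
  shuffle <- function(x) x[sample.int(length(x))]
  if (is.null(strata)) {
    data[[col]] <- shuffle(data[[col]])
  } else {
    # ave() applies FUN separately within each level of the grouping
    # variable and returns the results in the original row order
    data[[col]] <- ave(data[[col]], data[[strata]], FUN = shuffle)
  }
  data
}

set.seed(1)
shuffled <- shuffle_within(iris, "Sepal.Length", strata = "Species")

# Within each species the set of values is unchanged; only their order differs
same_values <- sapply(levels(iris$Species), function(s) {
  identical(
    sort(shuffled$Sepal.Length[shuffled$Species == s]),
    sort(iris$Sepal.Length[iris$Species == s])
  )
})
same_values
```

Without strata, a value from one group can land on a row from another group; with strata, group-level structure is preserved under the permutation.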

Details

This function was motivated by the rsample package, which allows straightforward implementation of several common resampling methods (e.g. bootstrap, K-fold cross-validation). While the internal mechanisms of this function are quite different, the goal is to provide a function that works like rsample for permuted data. This function works well with the pipe (%>%); see magrittr for more details.

After using perms, one can concisely compute permutation-based P-values or other statistics using any function, including custom functions. The syntax and usage of this function are motivated by tidy eval principles. Thus, you specify both the columns to permute and the stratifying variable as bare (unquoted) column names. The default number of permutations matches the default number of bootstrap resamples in rsample::bootstraps.
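
As an illustration of the permutation-based P-value mentioned above, here is a minimal base R sketch that does not call perms. It shuffles one variable directly, recomputes the statistic for each shuffle, and compares the observed value against the resulting null distribution. The variable names and the add-one correction are choices made for this sketch, not something prescribed by the package.

```r
set.seed(42)

x <- iris$Sepal.Width
y <- iris$Sepal.Length

# Statistic on the original (unpermuted) data
observed <- cor(x, y)

# Null distribution: shuffle x to break its pairing with y, then recompute
n_perm <- 25  # same value as the default for `times`
perm_stats <- replicate(n_perm, cor(sample(x), y))

# Two-sided P-value; the +1 terms count the observed statistic itself,
# so the estimate can never be exactly zero
p_value <- (sum(abs(perm_stats) >= abs(observed)) + 1) / (n_perm + 1)
p_value
```

With only 25 permutations the smallest attainable P-value is 1/26, so a larger number of permutations is usually needed for anything beyond a rough check.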

This function allows for easy integration with map functions for functional programming. See the examples for a use case. Also, consider using the future_map equivalents for parallel computations.
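
The mapping pattern can be sketched in base R as follows. The code builds a stand-in for the assumed output shape (one row per permutation, with each permuted data frame held in a list column named data, as in the example on this page) and then maps a statistic over that column with vapply. The object names are hypothetical. With purrr, the vapply call would correspond to map_dbl; furrr's future_map_dbl is generally called the same way once a future plan has been set.

```r
set.seed(123)
n_perm <- 5

# Stand-in for the assumed output shape: a list of permuted copies of iris
permuted <- lapply(seq_len(n_perm), function(i) {
  d <- iris
  d$Sepal.Length <- sample(d$Sepal.Length)
  d
})

# One row per permutation, with the data frames stored in a list column
res <- data.frame(id = seq_len(n_perm))
res$data <- permuted

# Map a statistic over the list column (base R analogue of purrr::map_dbl)
res$cor <- vapply(
  res$data,
  function(d) cor(d$Sepal.Width, d$Sepal.Length),
  numeric(1)
)

res[, c("id", "cor")]
```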

Value

A data frame (tibble) where each row is a permuted version of the input data. The returned data frame has the added class perms, which the summary generic can use for S3 method dispatch.
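
For readers unfamiliar with how an added class drives S3 dispatch, here is a generic sketch. It uses a hypothetical class name, perms_demo, and a made-up summary method; it does not show the summary method that this package actually provides.

```r
# Hypothetical constructor: prepend an extra class to a data frame
new_perms_demo <- function(df) {
  class(df) <- c("perms_demo", class(df))
  df
}

# summary() dispatches here because "perms_demo" comes first in class(object)
summary.perms_demo <- function(object, ...) {
  cat("Number of rows:", nrow(object), "\n")
  invisible(nrow(object))
}

obj <- new_perms_demo(data.frame(id = 1:3))
n <- summary(obj)

# The object still inherits from data.frame, so data frame methods keep working
inherits(obj, "data.frame")
```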

Examples

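library(magrittr)  # provides the %>% pipe

# Permute a single column
iris %>%
  perms(Sepal.Length)

# Permute two columns, then compute their correlation in each permuted dataset
iris %>%
  perms(Sepal.Width, Sepal.Length) %>%
  dplyr::mutate(cor = purrr::map_dbl(data, ~ with(., cor(Sepal.Width, Sepal.Length))))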

mattwarkentin/sandbox documentation built on Jan. 29, 2020, 4:46 p.m.