basefunctions: Shuffling and resampling functions


Description

Functions to run (un)restricted sampling with or without replacement in a dataframe.

Usage

within_rows(dataframe, cols = 1:ncol(dataframe), replace = FALSE,
  FUN = base::sample)

within_columns(dataframe, cols = 1:ncol(dataframe), stratum = rep(1,
  nrow(dataframe)), replace = FALSE, FUN = base::sample)

normal_rand(dataframe, cols = 1:ncol(dataframe), stratum = rep(1,
  nrow(dataframe)), replace = FALSE, FUN = base::sample)

rows_as_units(dataframe, stratum = rep(1, nrow(dataframe)), replace = FALSE,
  length.out = NULL)

columns_as_units(dataframe, cols = 1:ncol(dataframe), replace = FALSE,
  length.out = NULL)

Arguments

dataframe

a dataframe with the data to be shuffled or resampled.

cols

columns of dataframe that should be selected to be resampled/shuffled. Defaults to all columns.

replace

(logical) should the data be permuted (FALSE) or resampled with replacement (TRUE)?

FUN

function used for the sampling procedure. The default is sample, and a new function zfsample is provided for sampling with fixed zeroes.

stratum

factor or integer vector that separates the data into groups or strata. Randomizations will be performed within each level of the stratum. Each level needs at least two observations. Default is a single-level stratum.

length.out

(integer) specifies the size of the resulting data set. For columns_as_units, a data.frame with length.out columns will be returned, and for rows_as_units, a data.frame with length.out rows will be returned. Note that if length.out is larger than the relevant dimension, replace = TRUE must also be specified.
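As a rough illustration of how length.out and replace interact when rows are treated as units, the base R sketch below draws more rows than the original table contains. It is not the package's implementation; the data frame df and the variable length_out are made up for the example.

```r
set.seed(3)  # arbitrary seed so the run is repeatable
df <- data.frame(id = 1:4, y = c(2.5, 3.1, 4.8, 5.0))
length_out <- 6  # larger than nrow(df), so replacement is required

# Draw row indices with replacement, then take those rows as whole units,
# so the values within each row stay together.
idx <- sample.int(nrow(df), size = length_out, replace = TRUE)
resampled <- df[idx, , drop = FALSE]
```

Asking base R for six of four rows with replace = FALSE stops with an error, which is why replacement is needed whenever the requested size exceeds the relevant dimension.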

Value

a dataframe with the same structure as the input dataframe, with values randomized accordingly.

Details

Each function reproduces as closely as possible the corresponding option in the Resampling Stats add-in for Excel (www.resample.com) for permuting (shuffling) and sampling with replacement (resampling) values in tabular data.
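The difference between shuffling and resampling can be seen with base::sample alone, which is the default FUN. The sketch below uses a made-up vector x and only illustrates the two modes; it does not call the package functions.

```r
set.seed(42)  # arbitrary seed so the run is repeatable
x <- c(3, 1, 4, 1, 5)

# replace = FALSE: a permutation, so the same values come back in a new order
shuffled <- sample(x, replace = FALSE)

# replace = TRUE: a resample, so values may repeat or be left out
resampled <- sample(x, replace = TRUE)
```

A shuffled vector always sorts back to the original values, while a resampled one only guarantees that every value was drawn from the original.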

All functions assemble the randomized values in a dataframe with the same configuration as the original. Columns that were not selected for randomization with the argument cols are then bound to the resulting dataframe. The order and names of the rows and columns are preserved unless length.out is specified. In that case, the randomized rows/columns may be shifted to the end of the table.
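A minimal base R sketch of this behaviour for shuffling within rows is given below. It assumes a made-up data frame df in which only columns 2 to 4 are randomized, and it is an illustration of the idea, not the code used by within_rows.

```r
set.seed(11)  # arbitrary seed so the run is repeatable
df <- data.frame(site = c("a", "b", "c"),
                 t1 = c(1, 4, 7), t2 = c(2, 5, 8), t3 = c(3, 6, 9))
cols <- 2:4  # columns to randomize; 'site' is left untouched

out <- df
for (i in seq_len(nrow(df))) {
  # Values of row i in the selected columns, as a plain vector
  row_vals <- unlist(df[i, cols], use.names = FALSE)
  # Write them back in random order; names and positions do not change
  out[i, cols] <- row_vals[sample.int(length(row_vals))]
}
```

Each row of out holds the same values as the matching row of df, possibly in a different column order, and the site column is carried over unchanged.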

When both stratum and length.out are used, the function will try to keep the proportion of each stratum close to the original.
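One plausible way to keep stratum proportions is to allocate the requested rows to each stratum in proportion to its original size and then resample inside each stratum. The sketch below follows that assumption; the source does not state how the package does the allocation, and df, length_out and sizes are names invented for the example.

```r
set.seed(5)  # arbitrary seed so the run is repeatable
stratum <- rep(c("a", "b"), times = c(6, 3))
df <- data.frame(id = seq_along(stratum), stratum = stratum)
length_out <- 6

# Target rows per stratum, proportional to the original stratum sizes.
# Rounding can make the total differ from length_out for other inputs.
counts <- table(df$stratum)
sizes <- setNames(round(length_out * as.vector(counts) / nrow(df)),
                  names(counts))

# Resample row indices with replacement inside each stratum
idx <- unlist(lapply(names(sizes), function(s) {
  rows <- which(df$stratum == s)
  rows[sample.int(length(rows), size = sizes[[s]], replace = TRUE)]
}))
resampled <- df[idx, , drop = FALSE]
```

With six rows in stratum "a" and three in "b", a request for six rows is split four to two, matching the original two-to-one ratio.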

References

Statistics.com LLC. 2009. Resampling Stats Add-in for Excel User's Guide. http://www.resample.com/content/software/excel/userguide/RSXLHelp.pdf


Rsampling documentation built on May 2, 2019, 2:09 a.m.