Functions to run (un)restricted sampling with or without replacement in a dataframe.
within_rows(dataframe, cols = 1:ncol(dataframe), replace = FALSE,
  FUN = base::sample)

within_columns(dataframe, cols = 1:ncol(dataframe), stratum = rep(1,
  nrow(dataframe)), replace = FALSE, FUN = base::sample)

normal_rand(dataframe, cols = 1:ncol(dataframe), stratum = rep(1,
  nrow(dataframe)), replace = FALSE, FUN = base::sample)

rows_as_units(dataframe, stratum = rep(1, nrow(dataframe)), replace = FALSE,
  length.out = NULL)

columns_as_units(dataframe, cols = 1:ncol(dataframe), replace = FALSE,
  length.out = NULL)
dataframe | a dataframe with the data to be shuffled or resampled.
cols | columns of dataframe to be resampled or shuffled. Defaults to all columns.
replace | (logical) should the data be permuted (FALSE) or resampled with replacement (TRUE)?
FUN | function used for the sampling procedure. The default is base::sample.
stratum | factor or integer vector that separates the data into groups or strata. Randomizations are performed within each level of the stratum. Each level needs at least two observations. The default is a single-level stratum.
length.out | (integer) specifies the size of the resulting data set. For columns_as_units, a data.frame with length.out columns is returned; for rows_as_units, a data.frame with length.out rows is returned. Note that if length.out is larger than the relevant dimension,
A dataframe with the same structure as the input dataframe, with values randomized accordingly.
Each function reproduces as closely as possible the corresponding option in the Resampling Stats add-in for Excel (www.resample.com) for permuting (shuffling) and sampling with replacement (resampling) values in tabular data:
normal_rand corresponds to the 'normal shuffle' and 'normal resample' options. For shuffling (replace=FALSE), the data are permuted over all cells of dataframe. For resampling (replace=TRUE), data from any cell can be sampled and assigned to any other cell.
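The 'normal' shuffle described above can be sketched in base R as follows. This is only an illustration under stated assumptions: shuffle_all_cells is a hypothetical helper, not the package's normal_rand, and it assumes all columns hold values of the same type.

```r
# Hypothetical helper: pool every cell, randomize, and refill the table.
shuffle_all_cells <- function(dataframe, replace = FALSE) {
  # Pool the values of all cells into a single vector (column by column)
  values <- unlist(dataframe, use.names = FALSE)
  # replace = FALSE permutes the pool; replace = TRUE resamples from it
  randomized <- sample(values, size = length(values), replace = replace)
  result <- dataframe
  # Refill the cells, keeping the original dimensions and column names
  result[] <- matrix(randomized, nrow = nrow(dataframe))
  result
}

df <- data.frame(a = 1:4, b = 5:8, c = 9:12)
shuffle_all_cells(df)                  # same 12 values, scattered over all cells
shuffle_all_cells(df, replace = TRUE)  # some values may repeat, others drop out
```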
within_rows and within_columns correspond to the options with the same names. The randomization is done within each row or column of dataframe. For shuffling, the values of each row/column are permuted independently; for resampling, the values are sampled independently from each row/column and assigned only to cells of the row/column they were sampled from.
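A minimal base R sketch of randomizing within columns (optionally within strata) and within rows is given below. The helper names are hypothetical and the code is not the package's implementation; it assumes numeric columns, at least two columns, and at least two observations per stratum level.

```r
# Hypothetical helper: randomize each column separately, within each stratum.
shuffle_within_columns <- function(dataframe,
                                   stratum = rep(1, nrow(dataframe)),
                                   replace = FALSE) {
  result <- dataframe
  for (j in seq_along(result)) {
    # ave() applies the function separately to each level of stratum
    result[[j]] <- ave(result[[j]], stratum,
                       FUN = function(x) sample(x, length(x), replace = replace))
  }
  result
}

# Hypothetical helper: randomize each row separately.
shuffle_within_rows <- function(dataframe, replace = FALSE) {
  result <- dataframe
  # apply() returns one randomized row per matrix column, so transpose back
  result[] <- t(apply(as.matrix(dataframe), 1, sample, replace = replace))
  result
}

df <- data.frame(a = 1:6, b = 11:16, c = 21:26)
shuffle_within_columns(df, stratum = rep(c("x", "y"), each = 3))
shuffle_within_rows(df)
```

Note that base::sample treats a single number x >= 1 as the range 1:x, so a stratum level with only one observation would not behave as intended in this sketch.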
rows_as_units and columns_as_units also correspond to the options with the same names. Each row or column of dataframe is shuffled or resampled as a whole. Only the placement of rows and columns in the dataframe changes. The values and their positions within each row/column remain the same.
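Treating whole rows or columns as units amounts to randomizing an index vector, as in the sketch below. The helper names are hypothetical; restoring the original row and column names after reordering is an assumption made here to mirror the statement that names are preserved.

```r
# Hypothetical helper: move whole rows around, leaving their contents intact.
shuffle_rows_as_units <- function(dataframe, replace = FALSE) {
  idx <- sample(nrow(dataframe), replace = replace)  # randomized row indices
  result <- dataframe[idx, , drop = FALSE]
  rownames(result) <- rownames(dataframe)  # keep the original row labels
  result
}

# Hypothetical helper: move whole columns around, leaving their contents intact.
shuffle_columns_as_units <- function(dataframe, replace = FALSE) {
  idx <- sample(ncol(dataframe), replace = replace)  # randomized column indices
  result <- dataframe[, idx, drop = FALSE]
  names(result) <- names(dataframe)  # keep the original column labels
  result
}

df <- data.frame(a = 1:5, b = 6:10, c = 11:15)
shuffle_rows_as_units(df)
shuffle_columns_as_units(df, replace = TRUE)
```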
All functions assemble the randomized values in a dataframe with the same configuration as the original. Columns that were not selected for randomization with the argument cols are then bound to the resulting dataframe. The order and names of the rows and columns are preserved unless length.out is specified, in which case the randomized rows/columns may be shifted to the end of the table.
When both stratum and length.out are used, the function will try to keep the proportion of each stratum close to the original.
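One way to keep stratum proportions close to the original when resampling rows to a new size is proportional allocation, sketched below. This is an assumption about how such behavior could be obtained, not a description of what rows_as_units does internally; resample_rows_stratified is a hypothetical helper that always samples with replacement.

```r
# Hypothetical helper: resample whole rows with replacement, stratum by stratum.
resample_rows_stratified <- function(dataframe, stratum, length.out) {
  # Row indices belonging to each stratum level
  rows_by_stratum <- split(seq_len(nrow(dataframe)), stratum)
  # Target number of rows per level, proportional to its original share;
  # because of rounding, the total may differ slightly from length.out
  share <- lengths(rows_by_stratum) / nrow(dataframe)
  n_per_stratum <- round(share * length.out)
  # sample.int() on positions avoids the single-number pitfall of sample()
  idx <- unlist(
    Map(function(rows, n) rows[sample.int(length(rows), n, replace = TRUE)],
        rows_by_stratum, n_per_stratum),
    use.names = FALSE
  )
  dataframe[idx, , drop = FALSE]
}

df <- data.frame(id = 1:10, g = rep(c("a", "b"), c(6, 4)))
out <- resample_rows_stratified(df, df$g, length.out = 20)
table(out$g)  # stratum counts stay at the original 60/40 split
```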
Statistics.com LCC. 2009. Resampling Stats Add-in for Excel User's Guide. http://www.resample.com/content/software/excel/userguide/RSXLHelp.pdf