rsample has functions to create variations of a data set that can be used to evaluate models or to estimate the sampling distribution of some statistic.
A **resample** is the result of a two-way split of a data set. For example, when bootstrapping, one part of the resample is a sample drawn with replacement from the original data. The other part of the split contains the instances that were not selected for the bootstrap sample. The data structure 'rsplit' is used to store a single resample.
When the data are split in two, the portion that is used to estimate the model or calculate the statistic is called the **analysis** set here. In machine learning this is sometimes called the "training set", but that name would be a poor choice here since it could be confused with the training set from any initial split of the original data.
Conversely, the other data in the split are called the **assessment** data. In bootstrapping, these data are often called the "out-of-bag" samples.
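As an illustration of the terms above, the following is a minimal sketch that pulls the two parts out of a single bootstrap resample. It assumes the rsample package is installed and uses the built-in `mtcars` data; the data set, the seed, and the object names are choices made for this example only.

```r
library(rsample)

set.seed(123)  # resampling is random; a fixed seed makes the sketch repeatable

# Five bootstrap resamples of the built-in mtcars data (illustrative choice)
boots <- bootstraps(mtcars, times = 5)

# Each element of the `splits` column is a single rsplit
first_split <- boots$splits[[1]]

# Analysis set: rows drawn with replacement, same size as the original data
analysis_set <- analysis(first_split)

# Assessment set: the out-of-bag rows that were never drawn
assessment_set <- assessment(first_split)

nrow(analysis_set)
nrow(assessment_set)
```

Because rows are drawn with replacement, the analysis set has as many rows as `mtcars`, while the size of the assessment set varies from one resample to the next.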
A collection of resamples is contained in an 'rset' object.
The main resampling functions are: [vfold_cv()], [bootstraps()], [mc_cv()], [rolling_origin()], and [nested_cv()].
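As a rough sketch of how one of these functions produces a collection of resamples, the example below runs 5-fold cross-validation with `vfold_cv()`. It assumes the rsample package is installed; the `mtcars` data, the value of `v`, and the object names are illustrative assumptions.

```r
library(rsample)

set.seed(456)  # fold assignment is random

# 5-fold cross-validation of mtcars; the result is an rset with one row per fold
folds <- vfold_cv(mtcars, v = 5)

inherits(folds, "rset")
nrow(folds)

# Number of rows in each assessment set
assess_sizes <- vapply(
  folds$splits,
  function(s) nrow(assessment(s)),
  integer(1)
)

# Every row of the original data is held out exactly once across the folds
sum(assess_sizes) == nrow(mtcars)
```

The other resampling functions return objects that can be handled the same way: a `splits` column of 'rsplit' objects, each of which can be passed to `analysis()` and `assessment()`.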