createFolds: Create testing and training folds for each response variable.

Description

Split a data set into testing and training samples by leaving selectors out. If more than one response (i.e. dependent) variable is supplied, a separate set of testing/training pairs is created for each of them.
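The idea of leaving selectors out can be sketched in base R. This is only an illustration of one plausible reading (each distinct selector value is held out once as the testing sample), not the package's implementation; the helper leave_selector_out, the data frame and its column names are hypothetical.

```r
# Hypothetical data: six observations from three plots; "plot" is the selector
df <- data.frame(
  plot = c("p1", "p1", "p2", "p2", "p3", "p3"),
  richness = c(12, 15, 9, 11, 20, 18)
)

# Hypothetical helper (not part of gpm): hold out each selector value once
leave_selector_out <- function(data, selector) {
  groups <- unique(data[[selector]])
  folds <- lapply(groups, function(g) {
    test_rows <- which(data[[selector]] == g)
    list(
      training = setdiff(seq_len(nrow(data)), test_rows),
      testing = test_rows
    )
  })
  names(folds) <- groups
  folds
}

folds <- leave_selector_out(df, "plot")

# Row numbers held out when plot "p2" is the testing sample
folds$p2$testing
```

With several response variables, a sketch like this would simply be repeated per response, giving one list of training/testing pairs for each.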

Usage

## S4 method for signature 'GPM'
createFolds(x, nbr = 10, nested_cv = FALSE)

## S4 method for signature 'data.frame'
createFolds(x, response, resamples, selector,
  nested_cv = FALSE, nbr = 1)

Arguments

x

An object of class gpm or data.frame

response

The column name(s) of the response variable(s)

resamples

The list of the resamples containing the individual row numbers (resulting from function resamplingsByVariable)

selector

The column name of the selector variable. Only relevant if use_selector is TRUE.

use_selector

Use the selector variable for splitting the samples into training or testing (default FALSE).

Details

The split into training and testing samples is performed with caret::createDataPartition, which preserves the frequency distribution of the individual response variable(s).
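As a rough illustration of how caret::createDataPartition behaves on its own, the sketch below draws three stratified resamplings from a made-up data frame. The data, the column names, the 80% training share and the training/testing list layout are assumptions for the example; they are not necessarily the settings or the structure used inside createFolds.

```r
library(caret)

set.seed(42)  # make the random partitions reproducible

# Hypothetical data: a factor response with unbalanced classes
df <- data.frame(
  species = factor(rep(c("a", "b", "c"), times = c(60, 30, 10))),
  elevation = runif(100, 500, 2000)
)

# Three stratified resamplings, each keeping about 80% of the rows for training
train_idx <- createDataPartition(df$species, p = 0.8, times = 3, list = TRUE)

# Pair every set of training rows with the remaining rows as the testing sample
folds <- lapply(train_idx, function(idx) {
  list(training = idx, testing = setdiff(seq_len(nrow(df)), idx))
})

# Compare class proportions in the full data and in the first training sample
prop.table(table(df$species))
prop.table(table(df$species[folds[[1]]$training]))
```

Because sampling is done within each class of the response, the class proportions in every training sample stay close to those of the full data set.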

Value

For a gpm object: a layer within the gpm object with the information on the n individual resamplings.

For a data.frame: a nested list with training and testing samples for each of the n resamplings.

References

The function uses functions from the caret package: Max Kuhn, with contributions from Jed Wing, Steve Weston, Andre Williams, Chris Keefer, Allan Engelhardt, Tony Cooper, Zachary Mayer, Brenton Kenkel, the R Core Team, Michael Benesty, Reynald Lescarbeau, Andrew Ziem, Luca Scrucca, Yuan Tang and Can Candan (2016). caret: Classification and Regression Training. https://CRAN.R-project.org/package=caret

See Also

resamplingsByVariable for creating n resamplings from the original dataset.

Examples

## Not run: 
#Not run

## End(Not run)

environmentalinformatics-marburg/gpm documentation built on July 11, 2020, 11:12 a.m.