compIndexFolds: Create testing and training folds for each response variable...


View source: R/compIndexFolds.R

Description

Split a data set into testing and training samples by leaving selectors out. If more than one response (i.e. dependent) variable is supplied, a separate set of testing/training pairs is created for each of them. Each training/testing pair leaves one or more selectors out completely. A selector can be an individual location, a certain time step or a combination of both. Because the selector variable is defined by the user, it can be compiled so that it leaves out whatever grouping the user wants.
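The leave-selector-out idea can be sketched in base R. This is only an illustration of the principle, not the code in R/compIndexFolds.R: the data frame, its column names and the helper leave_selector_out are hypothetical, and the shape of the returned list is an assumption that need not match what compIndexFolds returns.

```r
# Hypothetical data: three plots (the selector), four observations each.
df <- data.frame(
  plot = rep(c("A", "B", "C"), each = 4),
  temp = c(11.2, 12.1, 10.8, 11.9, 14.3, 13.7, 15.0, 14.1, 9.4, 9.9, 8.7, 10.2)
)

# Hypothetical helper (not part of gpm): build one fold per group of
# `nbr` selector values and leave that group out of the training rows.
leave_selector_out <- function(x, selector, nbr = 1) {
  values <- as.character(unique(x[[selector]]))
  # Consecutive groups of `nbr` selector values; the last group may be smaller.
  groups <- split(values, ceiling(seq_along(values) / nbr))
  lapply(groups, function(left_out) {
    test_rows <- which(x[[selector]] %in% left_out)
    list(
      left_out = left_out,
      training = setdiff(seq_len(nrow(x)), test_rows),
      testing  = test_rows
    )
  })
}

folds <- leave_selector_out(df, selector = "plot", nbr = 1)
str(folds[[1]])
```

In this sketch every row of a left-out selector value ends up in the testing sample, so no observation from that plot is seen during training.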

Usage

compIndexFolds(x, selector, nbr = 1)

Arguments

x

An object of class gpm or data.frame

selector

The column name of the selector variable. Only relevant if use_selector is TRUE.

response

The column name(s) of the response variable(s)

resamples

The list of the resamples containing the individual row numbers (resulting from function resamplingsByVariable)

p

The fraction of each sample to be used for model training (default 0.75)

use_selector

Use the selector variable for splitting the samples into training and testing sets (default FALSE).

Details

The split into training and testing samples is done with caret::createDataPartition, which preserves the frequency distribution of each response variable.
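A minimal sketch of a stratified split with caret::createDataPartition follows. The response vector y and the seed are made up for illustration, and the call shown here is not necessarily how compIndexFolds invokes the function internally. It assumes the caret package is installed.

```r
library(caret)

set.seed(42)  # the partition is random; a fixed seed makes it repeatable

# Hypothetical response: 100 draws from a skewed distribution.
y <- rexp(100, rate = 0.2)

# One partition with about 75 % of the rows assigned to training.
# For a numeric response, caret samples within quantile-based groups.
train_idx <- createDataPartition(y, p = 0.75, list = FALSE, times = 1)[, 1]
test_idx  <- setdiff(seq_along(y), train_idx)

# Compare the quantiles of the two subsets.
rbind(training = quantile(y[train_idx]), testing = quantile(y[test_idx]))
```

Because sampling happens within groups of the response, the quantiles of the training and testing subsets should be similar, although small test sets will still show some deviation.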

Value

A nested list with training and testing samples for each of the n resamplings.

References

The function uses functions from the caret package: Max Kuhn, with contributions from Jed Wing, Steve Weston, Andre Williams, Chris Keefer, Allan Engelhardt, Tony Cooper, Zachary Mayer, Brenton Kenkel, the R Core Team, Michael Benesty, Reynald Lescarbeau, Andrew Ziem, Luca Scrucca, Yuan Tang and Can Candan (2016). caret: Classification and Regression Training. https://CRAN.R-project.org/package=caret

See Also

resamplingsByVariable for creating n resamplings from the original dataset.

Examples

## Not run: 
#Not run

## End(Not run)

environmentalinformatics-marburg/gpm documentation built on July 11, 2020, 11:12 a.m.