partition: Split data into training and testing sets


View source: R/partition.R

Description

Returns the row indices of x that should go to training or validation.

Usage

partition(
  x,
  type = "group holdout",
  p = 0.75,
  kfold = 5,
  groups = min(5, length(x)),
  returnTrain = TRUE
)

Arguments

x

A vector used for splitting data

type

Character. Type of partition. Valid values are "random holdout", "group holdout" or "kfold".

p

Proportion of the data (between 0 and 1) that goes to the training set (holdout). Only relevant if type = "random holdout" or type = "group holdout".

kfold

Number of folds for cross-validation. Only relevant if type = "kfold".

groups

For "group holdout" and when x is numeric, this is the number of breaks in the quantiles used to group x before sampling.

returnTrain

Logical indicating whether training or validation indices should be returned. Default is TRUE.

Details

Three types of splits are currently implemented. "random holdout" randomly selects a proportion p of x for the training set. "group holdout" first bins x into quantile-based groups when x is numeric (the number of groups is set by groups) and then randomly samples within each group (see createDataPartition). "kfold" splits x into kfold folds; each fold is held out once for validation while the remaining folds are used for training, so p is not used (see createFolds). This function is a wrapper around two functions of the caret package: createDataPartition and createFolds.
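
The "group holdout" idea can be illustrated with a small base R sketch. This is not the code used by partition or by caret; it only mimics the general approach (bin a numeric vector by quantiles, then sample within each bin). The simulated vector x, the variable names and the use of ceiling() for rounding are assumptions made for illustration.

```r
# Illustrative sketch only: quantile-stratified holdout in base R.
set.seed(42)
x <- runif(100)   # hypothetical numeric vector to split on
p <- 0.75         # proportion of each group sent to training
groups <- 5       # number of quantile groups

# Quantile break points; unique() guards against duplicated breaks
breaks <- unique(quantile(x, probs = seq(0, 1, length.out = groups + 1)))
bins <- cut(x, breaks = breaks, include.lowest = TRUE)

# Sample a proportion p of the row indices inside every bin
train_idx <- sort(unlist(lapply(split(seq_along(x), bins), function(idx) {
  idx[sample.int(length(idx), size = ceiling(p * length(idx)))]
}), use.names = FALSE))

# Everything not selected for training is left for validation
valid_idx <- setdiff(seq_along(x), train_idx)
```

Sampling within quantile groups keeps the distribution of x similar in the training and validation sets, which a purely random holdout does not guarantee for small samples.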

Value

List containing training indices (if returnTrain = TRUE) or validation indices (if returnTrain = FALSE).
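
To show how training and validation indices relate in a k-fold split, here is a minimal base R sketch. It does not call partition or caret, and the exact structure and naming of the list returned by partition may differ; n, k, fold_id, valid and train are hypothetical names used only for this illustration.

```r
# Illustrative sketch only: k-fold index lists in base R.
set.seed(1)
n <- 20   # hypothetical number of observations
k <- 5    # number of folds

# Randomly assign each observation to one of k equally sized folds
fold_id <- sample(rep(seq_len(k), length.out = n))

# Validation indices: one integer vector per fold
valid <- split(seq_len(n), fold_id)

# Training indices: all observations outside the held-out fold
train <- lapply(valid, function(v) setdiff(seq_len(n), v))

# Each fold trains on (k - 1) / k of the data
lengths(train)
lengths(valid)
```

Either list can be used to subset the original data, for example data[train[[1]], ] for the training rows of the first fold, where data is any data frame with n rows.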

See Also

createDataPartition

Examples

# sample_points is a SpatialPointsDataFrame calculated and saved from getSample
# Load it into memory
load(system.file("extdata/examples/sample_points.RData",package="foster"))

partition(sample_points$cluster, type = "kfold", kfold = 5)

mqueinnec/foster documentation built on March 28, 2021, 4:27 p.m.