cv_partition: Partitions data for cross-validation.
In sortinghat: sortinghat


For a vector of training labels, we return a list of cross-validation folds, where each fold contains the indices of the observations to leave out in that fold. In terms of classification error rate estimation, one can think of a fold as the observations to hold out as a test sample set.

cv_partition(y, num_folds = 10, hold_out = NULL, seed = NULL)

`y`	a vector of class labels to partition
`num_folds`	the number of cross-validation folds. Ignored if `hold_out` is not `NULL`. See Details.
`hold_out`	the hold-out size for cross-validation. See Details.
`seed`	optional random number seed for splitting the data for cross-validation

Either the hold_out size or num_folds can be specified. The number of folds defaults to 10, but if the hold_out size is specified, then num_folds is ignored.

We partition the vector y based on its length, which we treat as the sample size, n. If an object other than a vector is used in y, its length can yield unexpected results. For example, the output of length(diag(3)) is 9.
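
The length-based partitioning described above can be sketched in base R. This is an illustration only, not the package's implementation: `sketch_partition` is a hypothetical helper, and both the rule that a hold-out size implies `ceiling(n / hold_out)` folds and the `training`/`test` element names are assumptions.

```r
# Hypothetical helper, not part of sortinghat: partitions the indices
# 1..n into folds, where n = length(y).
sketch_partition <- function(y, num_folds = 10, hold_out = NULL, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  n <- length(y)  # treated as the sample size
  # Assumption: a hold-out size overrides num_folds.
  if (!is.null(hold_out)) num_folds <- ceiling(n / hold_out)
  shuffled <- sample(seq_len(n))
  # Deal the shuffled indices into folds in round-robin order.
  fold_id <- rep(seq_len(num_folds), length.out = n)
  test_sets <- split(shuffled, fold_id)
  lapply(test_sets, function(test) {
    list(training = setdiff(seq_len(n), test), test = test)
  })
}

folds <- sketch_partition(iris$Species, num_folds = 10, seed = 42)
length(folds)                                # number of folds
sapply(folds, function(f) length(f$test))    # size of each test set

# The pitfall: a matrix has length nrow * ncol, so it would be
# treated as 9 observations rather than 3.
length(diag(3))
nrow(diag(3))
```

When the observations are rows of a matrix or data frame, passing the label vector (or `seq_len(nrow(x))`) rather than the object itself avoids the length mismatch.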

A list containing the indices of the training and test observations for each fold.
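
To show how such a list might be used for error rate estimation, here is a minimal base-R sketch. The folds are built by hand rather than by `cv_partition`, the `training` and `test` element names are assumed, and the majority-class predictor is only a placeholder for a real classifier.

```r
# Assumed shape of the result: a list of folds, each holding `training`
# and `test` index vectors. The element names are an assumption.
set.seed(42)
y <- iris$Species
n <- length(y)
test_sets <- split(sample(seq_len(n)), rep(1:10, length.out = n))
folds <- lapply(test_sets, function(test) {
  list(training = setdiff(seq_len(n), test), test = test)
})

# Placeholder classifier: always predict the most frequent training label.
fold_errors <- sapply(folds, function(fold) {
  train_labels <- y[fold$training]
  predicted <- names(which.max(table(train_labels)))
  mean(y[fold$test] != predicted)
})

# Cross-validation estimate: the average test error across folds.
cv_error <- mean(fold_errors)
```

Each fold's test indices are held out once, so every observation contributes to exactly one fold's error.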

library(sortinghat)  # provides cv_partition
library(MASS)
# The following three calls to cv_partition yield the same partitions.
set.seed(42)
cv_partition(iris$Species)
cv_partition(iris$Species, num_folds = 10, seed = 42)
cv_partition(iris$Species, hold_out = 15, seed = 42)