samplesSplitting: Split Sample Indexes into Training and Test Partitions for...

samplesSplits    R Documentation

Split Sample Indexes into Training and Test Partitions for Cross-validation Taking Into Account Classes.

Description

samplesSplits creates two lists of lists. The first contains the training samples and the second contains the test samples, for a range of different cross-validation schemes.

splitsTestInfo creates a table for tracking the permutation, fold number, or subset of each set of test samples. Useful for column-binding to the predictions, once they are unlisted into a vector.
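The column-binding idea can be sketched in base R without the package. Everything below is made up for illustration: the test indexes, the predicted labels, and the column names "sample" and "predicted" are assumptions, and the tracking table is built by hand rather than by splitsTestInfo.

```r
# Hypothetical test partitions for three folds (sample indexes).
testSets <- list(c(3, 7, 12), c(1, 9, 15), c(2, 8, 14))
# Hypothetical predicted classes, one vector per fold, in the same order.
predictions <- list(c("A", "A", "B"), c("A", "B", "A"), c("B", "A", "A"))

# One row per test sample, recording which fold it was tested in.
testInfo <- data.frame(fold = rep(seq_along(testSets), lengths(testSets)))

# Unlist the per-fold results into vectors and column-bind them to the table.
tracked <- cbind(testInfo,
                 sample = unlist(testSets),
                 predicted = unlist(predictions))
tracked
```

Each row of the result then identifies a test sample, its prediction, and the fold that produced it.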

Usage

samplesSplits(
  samplesSplits = c("k-Fold", "Permute k-Fold", "Permute Percentage Split",
    "Leave-k-Out"),
  permutations = 100,
  folds = 5,
  percentTest = 25,
  leave = 2,
  outcome
)

splitsTestInfo(
  samplesSplits = c("k-Fold", "Permute k-Fold", "Permute Percentage Split",
    "Leave-k-Out"),
  permutations = 100,
  folds = 5,
  percentTest = 25,
  leave = 2,
  splitsList
)

Arguments

samplesSplits

Default: "k-Fold". One of "k-Fold", "Permute k-Fold", "Permute Percentage Split", "Leave-k-Out".

permutations

Default: 100. An integer. The number of times the samples are permuted before splitting (repetitions).

folds

Default: 5. An integer. The number of folds into which the samples are partitioned. Only relevant if samplesSplits is "k-Fold" or "Permute k-Fold".
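As a rough illustration of class-aware k-fold partitioning, the base R sketch below deals the samples of each class to the folds in turn. It is not the package's implementation; the helper name stratifiedFolds and its return structure are assumptions made for this example.

```r
# Hypothetical helper: assign samples to folds one class at a time so that
# every fold receives a similar share of each class.
stratifiedFolds <- function(outcome, folds = 5) {
  foldOfSample <- integer(length(outcome))
  for (level in levels(outcome)) {
    classIndexes <- which(outcome == level)
    # Shuffle within the class (guarded, because sample() on a single
    # number would sample from 1:number instead).
    if (length(classIndexes) > 1) classIndexes <- sample(classIndexes)
    foldOfSample[classIndexes] <- rep_len(seq_len(folds), length(classIndexes))
  }
  test <- lapply(seq_len(folds), function(fold) which(foldOfSample == fold))
  train <- lapply(test, function(testIndexes)
    setdiff(seq_along(outcome), testIndexes))
  list(train = train, test = test)
}

set.seed(1)
classes <- factor(rep(c("A", "B"), c(15, 5)))
partitions <- stratifiedFolds(classes, folds = 5)
# Class counts within each test fold.
sapply(partitions$test, function(indexes) table(classes[indexes]))
```

With 15 samples of class A and 5 of class B, each of the five test folds in this sketch holds three A samples and one B sample.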

percentTest

Default: 25. A positive number between 0 and 100. The percentage of samples to keep for the test partition. Only relevant if samplesSplits is "Permute Percentage Split".
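A single percentage split could look roughly like the base R sketch below, which draws the requested percentage from each class separately. The helper name percentageSplit, the per-class rounding, and the return structure are assumptions for illustration; the package's own rounding and sampling rules may differ.

```r
# Hypothetical helper: hold out percentTest percent of each class for testing.
percentageSplit <- function(outcome, percentTest = 25) {
  test <- unlist(lapply(levels(outcome), function(level) {
    classIndexes <- which(outcome == level)
    testSize <- round(length(classIndexes) * percentTest / 100)
    # sample.int picks positions, which avoids the single-number pitfall
    # of sample().
    classIndexes[sample.int(length(classIndexes), testSize)]
  }))
  list(train = setdiff(seq_along(outcome), test), test = sort(test))
}

set.seed(2)
classes <- factor(rep(c("A", "B"), c(15, 5)))
partition <- percentageSplit(classes, percentTest = 25)
table(classes[partition$test])
```

Repeating such a split once per permutation would give one training and one test partition for each repetition.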

leave

Default: 2. An integer. The number of samples to keep for the test set in leave-k-out cross-validation. Only relevant if samplesSplits is "Leave-k-Out".
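To give a sense of scale for leave-k-out, the base R sketch below enumerates every possible test set of size leave, assuming each combination of samples is used as a test set exactly once (the text above does not spell this out). The variable names are illustrative only.

```r
nSamples <- 5
leave <- 2

# Every combination of 'leave' sample indexes forms one test set.
testSets <- combn(nSamples, leave, simplify = FALSE)
# The matching training set is everything not in the test set.
trainSets <- lapply(testSets, function(testIndexes)
  setdiff(seq_len(nSamples), testIndexes))

length(testSets)  # equals choose(nSamples, leave)
```

Under that assumption the number of partitions grows combinatorially with the number of samples, so small values of leave are the practical choice for larger data sets.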

outcome

A factor vector or Surv object containing the samples to be partitioned.

splitsList

The return value of the function samplesSplits.

Value

For samplesSplits, two lists of the same length. The first contains the training partitions and the second contains the test partitions.

For splitsTestInfo, a table with a subset of the columns "permutation", "fold" and "subset", depending on the cross-validation scheme specified.

Examples


classes <- factor(rep(c('A', 'B'), c(15, 5)))
splitsList <- samplesSplits(permutations = 1, outcome = classes)
splitsList
splitsTestInfo(permutations = 1, splitsList = splitsList)

DarioS/ClassifyR documentation built on Dec. 19, 2024, 8:22 p.m.