KFoldXVal: K-fold Cross-Validation Samples
In phively/wranglR: R Data Wrangler


View source: R/cross-validation.R

This function takes a data frame or vector and returns a list of index vectors, one per group, for use in cross-validation.

KFoldXVal(dat, k = 2, prop = NA, seed = NA)

`dat`	Data to split into cross-validation groups
`k`	Constant k for k-fold cross-validation; defaults to 2
`prop`	Proportion of the data to assign to the first group; if not provided, all groups are equally sized. The remaining groups must each contain at least one observation and are made as nearly equal in size as possible, with any remainder going to the final group.
`seed`	Optional random seed to use for sampling
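
The sizing rules described for `k` and `prop` can be sketched in base R as follows. This is a hypothetical re-implementation for illustration only, not the wranglR source: the name `kfold_sketch` is made up, and rounding the first group down with `floor()` is an assumption, since the documentation does not say how `prop` is rounded.

```r
# Hypothetical sketch of the documented behavior; not the wranglR implementation.
kfold_sketch <- function(dat, k = 2, prop = NA, seed = NA) {
  # Number of observations: rows for a data frame, length for a vector
  n <- if (is.data.frame(dat)) nrow(dat) else length(dat)
  if (!is.na(seed)) set.seed(seed)
  idx <- sample.int(n)  # random permutation of 1..n

  # Size of the first group: an equal share unless prop is given
  # (assumption: the proportion is rounded down)
  first <- if (is.na(prop)) n %/% k else floor(n * prop)
  # The remaining groups share what is left as evenly as possible
  rest <- if (k > 1) (n - first) %/% (k - 1) else 0
  if (first < 1 || (k > 1 && rest < 1)) {
    stop("every group needs at least one observation")
  }

  sizes <- c(first, rep(rest, k - 1))
  sizes[k] <- sizes[k] + (n - sum(sizes))  # remainder goes to the final group
  unname(split(idx, rep(seq_len(k), times = sizes)))
}
```

Under these assumptions, `lengths(kfold_sketch(LETTERS, k = 4))` gives groups of size 6, 6, 6 and 8, and adding `prop = 0.75` gives 19, 2, 2 and 3. The actual function may round differently.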

# Create some sample data
dat <- LETTERS

# Create a function to print the groups
print.kfcv <- function(q) {
  for (i in seq_along(q)) {print(dat[q[[i]]])}
}

# Default behavior is to create 2 equally-sized sample groups
q <- KFoldXVal(dat, seed=123)
print.kfcv(q)

# When the data cannot be split evenly, the remainder goes into the last group
q <- KFoldXVal(dat, k=4, seed=123)
print.kfcv(q)

# prop is used to fix the size of the first group
q <- KFoldXVal(dat, prop=.75, seed=123)
print.kfcv(q)

# This may be freely combined with k, provided that there are
# sufficient observations that all groups are at least size 1
q <- KFoldXVal(dat, k=4, prop=.75, seed=123)
print.kfcv(q)

q <- KFoldXVal(dat, k=4, prop=.9, seed=123)
print.kfcv(q)
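
A list of indices like the one returned above is typically consumed by holding out each group in turn. The sketch below shows one way this might look. The folds are built with base R as a stand-in so that the example runs without wranglR installed, and the `mtcars` data and the `mpg ~ wt` model are illustrative choices, not part of the package.

```r
# Stand-in for the list of indices returned by KFoldXVal(); built with base R
# so the sketch runs without wranglR installed.
set.seed(123)
k <- 4
folds <- unname(split(sample(nrow(mtcars)), rep_len(seq_len(k), nrow(mtcars))))

# Hold out each fold in turn, fit on the rest, and score on the held-out rows
rmse <- vapply(folds, function(test_idx) {
  fit <- lm(mpg ~ wt, data = mtcars[-test_idx, ])
  pred <- predict(fit, newdata = mtcars[test_idx, ])
  sqrt(mean((mtcars$mpg[test_idx] - pred)^2))
}, numeric(1))

mean(rmse)  # cross-validated RMSE averaged over the k folds
```

With wranglR loaded, `folds` could presumably be replaced by `KFoldXVal(mtcars, k = 4, seed = 123)`, assuming the returned indices refer to rows when `dat` is a data frame.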