KFoldXVal: K-fold Cross-Validation Samples


View source: R/cross-validation.R

Description

This function takes a data frame or vector and returns a list of index vectors, one per group, for use in cross-validation.
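As a rough illustration of how such a list of indices might be consumed, the sketch below loops over a hand-written list of index vectors and holds each group out in turn. The `folds` list here is a made-up stand-in, not output from KFoldXVal, and the variable names are assumptions chosen for the example.

```r
# Small sample vector to split
dat <- LETTERS[1:6]

# Hand-written stand-in for a list of index vectors, one per group
# (hypothetical values, not produced by KFoldXVal)
folds <- list(c(2, 5, 6), c(1, 3, 4))

for (i in seq_along(folds)) {
  held_out <- dat[folds[[i]]]   # observations in the current group
  training <- dat[-folds[[i]]]  # everything outside the current group
  cat("Fold", i, "held out:", held_out, "| training:", training, "\n")
}
```

For a data frame, the same index vectors would select rows, for example `dat[folds[[i]], ]`.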

Usage

KFoldXVal(dat, k = 2, prop = NA, seed = NA)

Arguments

dat

Data to split into cross-validation groups

k

Number of groups (folds) to create for k-fold cross-validation; defaults to 2

prop

Proportion of the data to include in the first group; if not provided, the groups are equally sized. When prop is given, each remaining group must contain at least one observation, and the remaining groups are as nearly equal in size as possible, with any remainder included in the final group.
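The sizing rule described for k and prop can be sketched in a few lines of arithmetic. This is only an illustration under stated assumptions: `sketch_group_sizes` is a hypothetical helper, not part of the package, and rounding the first group down with `floor()` is a guess about how fractional sizes are handled.

```r
# Hypothetical helper (not part of the package): group sizes for n observations.
# Assumption: a fractional first-group size is rounded down.
sketch_group_sizes <- function(n, k = 2, prop = NA) {
  if (is.na(prop)) {
    # Equal-sized groups; the last group absorbs any remainder
    base <- n %/% k
    sizes <- rep(base, k)
    sizes[k] <- n - base * (k - 1)
  } else {
    stopifnot(k >= 2)
    # The first group takes a fixed share; the rest split what is left
    first <- floor(n * prop)
    base <- (n - first) %/% (k - 1)
    if (base < 1) stop("not enough observations for every group to have size >= 1")
    sizes <- c(first, rep(base, k - 1))
    # The final group absorbs any remainder among the remaining groups
    sizes[k] <- n - first - base * (k - 2)
  }
  sizes
}

sketch_group_sizes(26, k = 4)               # sketch returns 6 6 6 8
sketch_group_sizes(26, k = 2, prop = 0.75)  # sketch returns 19 7
sketch_group_sizes(26, k = 4, prop = 0.75)  # sketch returns 19 2 2 3
```

The sizes shown are what this sketch computes; the function itself may round differently.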

seed

Optional random seed to use for sampling
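A fixed seed makes a random split reproducible. The sketch below shows the general base R mechanism with `set.seed()` and `sample()`; whether the function uses exactly these calls internally is an assumption.

```r
# Shuffle the positions of 26 observations twice with the same seed
set.seed(123)
first_draw <- sample(length(LETTERS))

set.seed(123)
second_draw <- sample(length(LETTERS))

# The two shuffles are identical because the seed was reset before each one
identical(first_draw, second_draw)  # TRUE
```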

Examples

# Create some sample data
dat <- LETTERS

# Create a helper that prints the elements of dat in each group
# (seq_along() is safe even if q is empty, unlike 1:length(q))
print.kfcv <- function(q) {
  for (i in seq_along(q)) {print(dat[q[[i]]])}
}

# Default behavior is to create 2 equally sized groups
q <- KFoldXVal(dat, seed=123)
print.kfcv(q)

# When the data cannot be divided evenly, the remainder goes into the last group
q <- KFoldXVal(dat, k=4, seed=123)
print.kfcv(q)

# prop is used to fix the size of the first group
q <- KFoldXVal(dat, prop=.75, seed=123)
print.kfcv(q)

# This may be freely combined with k, provided that there are
# sufficient observations that all groups are at least size 1
q <- KFoldXVal(dat, k=4, prop=.75, seed=123)
print.kfcv(q)

# A larger prop leaves fewer observations to share among the remaining groups
q <- KFoldXVal(dat, k=4, prop=.9, seed=123)
print.kfcv(q)

phively/wranglR documentation built on April 11, 2020, 5:12 a.m.