Generate cross-validated test-training pairs

Share:

Description

crossv_kfold splits the data into k exclusive partitions, and uses each partition for a test-training split. crossv_mc generates n random partitions, holding out p of the data for training.

Usage

1
2
3
crossv_mc(data, n, test = 0.2, id = ".id")

crossv_kfold(data, k = 5, id = ".id")

Arguments

data

A data frame

n

Number of test-training pairs to generate (an integer).

test

Proportion of observations that should be held out for testing (a double).

id

Name of variable that gives each model a unique integer id.

k

Number of folds (an integer).

Value

A data frame with n/k rows and columns test and train. test and train are list-columns containing resample objects.

Examples

1
2
3
4
5
6
7
8
cv1 <- crossv_kfold(mtcars, 5)
cv1

library(purrr)
cv2 <- crossv_mc(mtcars, 100)
models <- map(cv2$train, ~ lm(mpg ~ wt, data = .))
errs <- map2_dbl(models, cv2$test, rmse)
hist(errs)