createSCV: Create Stratified (Cross-)Validation Set


Description

Creates stratified (cross-)validation folds from data according to a formula, so that every level of every factor in the formula is spread as evenly as possible across the folds. This avoids a problem with simple unstratified cross-validation, where a factor level may appear only in the validation set and not in the training set.
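The idea of stratifying on factor levels can be sketched in base R as follows. This is only an illustration under stated assumptions, not the kmisc implementation: the helper name sketch_stratified_folds and its factor_names argument are hypothetical, and the sketch simply deals the rows of each factor-level combination out to the folds in turn.

```r
# Hypothetical helper, not part of kmisc: assign folds within each stratum.
sketch_stratified_folds <- function(data, factor_names, k = 5) {
  # One stratum per observed combination of factor levels.
  strata <- interaction(as.data.frame(data)[factor_names], drop = TRUE)
  folds <- integer(nrow(data))
  for (idx in split(seq_len(nrow(data)), strata)) {
    # Shuffle the rows of this stratum, then deal them out to folds 1..k in turn.
    shuffled <- idx[sample.int(length(idx))]
    folds[shuffled] <- rep_len(seq_len(k), length(shuffled))
  }
  folds
}

set.seed(1)
folds <- sketch_stratified_folds(CO2, c("Type", "Treatment"), k = 5)
# Cross-tabulate strata against folds to inspect the balance.
table(interaction(CO2$Type, CO2$Treatment), folds)
```

Within each stratum the fold sizes differ by at most one row, so every fold sees every factor-level combination whenever a stratum has at least k rows.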

Usage

createSCV(form, data, k = 5)

Arguments

form

A formula. The LHS of form (if present) and all non-factor terms on the RHS are dropped before the strata are formed.

data

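The data set containing the variables named in form (for example, the CO2 data frame used in the examples).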

k

Number of folds to form. Defaults to 5.

valid_prob

Percentage of the data to place in the validation set. Only used in createSV.

Value

A numeric vector whose entries indicate the fold each observation belongs to.
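A fold vector of this shape is typically used to index the data inside a cross-validation loop. The sketch below is an assumption about typical usage rather than something taken from the package: it uses a stand-in fold vector so that it runs without kmisc installed, and the lm model is chosen purely for illustration.

```r
# Stand-in fold labels so the sketch runs without kmisc installed;
# in practice this could be the result of createSCV(~ Type + Treatment, CO2).
set.seed(1)
k <- 5
folds <- sample(rep_len(seq_len(k), nrow(CO2)))

cv_rmse <- numeric(k)
for (i in seq_len(k)) {
  train <- CO2[folds != i, ]   # rows outside fold i
  valid <- CO2[folds == i, ]   # rows in fold i
  # Illustrative model only; any fitting function could be used here.
  fit <- lm(uptake ~ Type + Treatment + conc, data = train)
  pred <- predict(fit, newdata = valid)
  cv_rmse[i] <- sqrt(mean((valid$uptake - pred)^2))
}
mean(cv_rmse)  # average validation RMSE over the k folds
```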

Examples

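library(kmisc)

createSCV(~ Type + Treatment, CO2)
createSV(~ Type + Treatment, CO2)

# The LHS (conc) and the non-factor term uptake are dropped before the
# strata are formed, so both calls give the same folds under the same seed.
set.seed(123)
f1 <- createSCV(~ Type + Treatment, CO2)
set.seed(123)
f2 <- createSCV(conc ~ Type + Treatment + uptake, CO2)
identical(f1, f2)  ## TRUE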

kohleth/kmisc documentation built on May 20, 2019, 12:53 p.m.