kWayStratifiedY: k-fold cross validation stratified on y, a splitFunction in...

View source: R/outOfSample.R

kWayStratifiedY    R Documentation

k-fold cross validation stratified on y, a splitFunction in the sense of vtreat::buildEvalSets

Description

Builds a k-fold cross-validation split plan stratified on the outcome y. The function can be used as a splitFunction in the sense of vtreat::buildEvalSets.

Usage

kWayStratifiedY(nRows, nSplits, dframe, y)

Arguments

nRows

number of rows to split (>1).

nSplits

number of groups to split into (>1 and <nRows).

dframe

original data frame (ignored).

y

numeric outcome variable; the split tries to keep its distribution similar in each group.

Value

split plan
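
To make the idea of a y-stratified split plan concrete, here is a minimal base-R sketch. It is not vtreat's implementation: sketchStratifiedSplit is a hypothetical helper, and the list(train=, app=) shape of each plan entry is an assumption about how split plans are laid out. The sketch sorts rows by y, walks them in blocks of nSplits, and hands out group labels at random within each block, so every group receives rows from across the range of y.

```r
# Illustrative sketch only; sketchStratifiedSplit is a hypothetical name
# and is not part of vtreat.
sketchStratifiedSplit <- function(nRows, nSplits, y) {
  stopifnot(nRows > 1, nSplits > 1, nSplits < nRows, length(y) == nRows)
  ord <- order(y)            # row indices sorted by outcome
  group <- integer(nRows)
  # Walk the sorted rows in blocks of nSplits. Within a block, assign the
  # group labels in random order, so each group gets at most one row per block.
  for (start in seq(1, nRows, by = nSplits)) {
    idx <- ord[start:min(start + nSplits - 1, nRows)]
    group[idx] <- sample.int(nSplits)[seq_along(idx)]
  }
  # Assumed plan shape: one list(train=, app=) entry per group.
  lapply(seq_len(nSplits), function(g) {
    list(train = which(group != g), app = which(group == g))
  })
}

set.seed(23255)
y <- sin(1:100)
plan <- sketchStratifiedSplit(length(y), 5, y)

# Every row lands in exactly one application set.
stopifnot(identical(sort(unlist(lapply(plan, function(p) p$app))), 1:100))

# Per-group means of y; stratification should keep these close together.
print(sapply(plan, function(p) mean(y[p$app])))
```

Randomizing within blocks of sorted rows, rather than shuffling all rows at once, is what keeps each group's y values spread over the whole range; a plain shuffle gives no such guarantee on small data.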

Examples


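library(vtreat)

set.seed(23255)
d <- data.frame(y=sin(1:100))

# 5-way split plan stratified on y
pStrat <- kWayStratifiedY(nrow(d),5,d,d$y)
problemAppPlan(nrow(d),5,pStrat,TRUE)   # check the plan for problems
d$stratGroup <- vtreat::getSplitPlanAppLabels(nrow(d),pStrat)

# plain 5-way cross-validation plan, for comparison
pSimple <- kWayCrossValidation(nrow(d),5,d,d$y)
problemAppPlan(nrow(d),5,pSimple,TRUE)  # check the plan for problems
d$simpleGroup <- vtreat::getSplitPlanAppLabels(nrow(d),pSimple)

# compare the spread of per-group means of y under each plan
summary(tapply(d$y,d$simpleGroup,mean))
summary(tapply(d$y,d$stratGroup,mean))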




WinVector/vtreat documentation built on Aug. 29, 2023, 4:49 a.m.