R/xv00-utility.R

Defines functions balancedSplit

Documented in balancedSplit

# Copyright (C) Kevin R. Coombes, 2007-2012

###
### UTILITY.R
###


##-----------------------------------------------------------------------------
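## Split a dataset into training and testing sets, keeping a designated
## factor balanced between the two sets.
##
## fac  is the factor with respect to which the pieces should be balanced
## size is a number between 0 and 1, the fraction of each level of fac
##      to be used for training
##
## Returns a logical vector of the same length as fac; TRUE marks the
## training samples and FALSE marks the testing samples.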
balancedSplit <- function(fac, size) {
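    ## start with every sample assigned to the testing set
    trainer <- rep(FALSE, length(fac))
    for (lev in levels(fac)) {
        ## number of samples at this level
        N <- sum(fac==lev)
        ## requested fraction, rounded down, but never fewer than one
        wanted <- max(1, trunc(N*size))
        ## mark a random subset of this level as training
        trainer[fac==lev][sample(N, wanted)] <- TRUE
    }
    trainer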
}
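
A minimal usage sketch, not part of the package source. The function definition is repeated so the snippet runs on its own. The label vector `grp`, the seed, and the 0.5 training fraction are made up for illustration. Note that `fac` must be an actual factor: `levels()` returns NULL for a plain character vector, so the loop would never run and every sample would land in the testing set.

```r
## Copy of balancedSplit from the listing above, repeated so that this
## example is self-contained.
balancedSplit <- function(fac, size) {
    trainer <- rep(FALSE, length(fac))
    for (lev in levels(fac)) {
        N <- sum(fac==lev)
        wanted <- max(1, trunc(N*size))
        trainer[fac==lev][sample(N, wanted)] <- TRUE
    }
    trainer
}

set.seed(1)  # arbitrary seed, only to make the split reproducible

## Hypothetical class labels: 30 samples of "A" and 10 of "B"
grp <- factor(rep(c("A", "B"), times = c(30, 10)))

## Request half of each class for training
inTrain <- balancedSplit(grp, 0.5)

## Class counts in each half of the split
trainCounts <- table(grp[inTrain])
testCounts  <- table(grp[!inTrain])
print(trainCounts)
print(testCounts)
```

Which samples are chosen depends on the seed, but the per-class counts do not: each class contributes `max(1, trunc(N*size))` training samples, so here the training set holds 15 "A" and 5 "B" samples and the testing set holds the rest.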


CrossValidate documentation built on May 7, 2019, 3 a.m.