balancedSampsize: Balanced Sample Size

View source: R/balancedSampsize.R

Description

Create a vector of balanced (equal) sample sizes for use in the sampsize argument of rfPermute or randomForest when fitting a classification model. Every class receives the same sample size, which is derived from a proportion (pct) of the smallest class frequency.

Usage

balancedSampsize(y, pct = 0.5)

Arguments

y

character, numeric, or factor vector containing the classes of the response variable. Each unique value is treated as a separate class when computing class frequencies.

pct

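fraction of the smallest class frequency to use for each element of the sampsize vector, expressed as a proportion (for example, 0.5 for 50%). The default is 0.5.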

Value

A named vector of sample sizes with one element per class. All elements are equal.
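
The calculation described above can be sketched in base R. This is an illustration only, not the package source: the function name sketchBalancedSampsize is hypothetical, and rounding up with ceiling() is an assumption, since this page does not say how fractional sizes are handled.

```r
# Hypothetical re-implementation for illustration; not the rfPermute source.
sketchBalancedSampsize <- function(y, pct = 0.5) {
  # Class frequencies: one count per unique value of y
  freq <- table(y)
  # ASSUMPTION: fractional sizes are rounded up to a whole number
  n <- ceiling(min(freq) * pct)
  # Repeat the same size for every class and name each element by its class
  stats::setNames(rep(n, length(freq)), names(freq))
}

data(mtcars)
table(mtcars$am)                    # class frequencies of the response
sketchBalancedSampsize(mtcars$am)   # one equal sample size per class
```

Because the size is tied to the smallest class, every class can supply that many cases even when sampling without replacement, provided pct is no greater than 1.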

Author(s)

Eric Archer <eric.archer@noaa.gov>

Examples

library(rfPermute)
data(mtcars)

# A balanced model with default half of smallest class size
sampsize_0.5 <- balancedSampsize(mtcars$am)
sampsize_0.5

rfPermute(factor(am) ~ ., mtcars, replace = FALSE, sampsize = sampsize_0.5)

# A balanced model with one quarter of smallest class size
sampsize_0.25 <- balancedSampsize(mtcars$am, pct = 0.25)
sampsize_0.25

rfPermute(factor(am) ~ ., mtcars, replace = FALSE, sampsize = sampsize_0.25)
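
The description also mentions randomForest, which the examples above do not show. The following is a minimal sketch of that usage, assuming both the randomForest and rfPermute packages are installed. The variable names and the seed are illustrative choices.

```r
library(randomForest)
library(rfPermute)

data(mtcars)

# Response as a factor so that randomForest fits a classification model
y <- factor(mtcars$am)
# All remaining columns serve as predictors
x <- mtcars[, names(mtcars) != "am"]

# Equal per-class sample sizes based on half of the smallest class
sampsize <- balancedSampsize(y, pct = 0.5)

set.seed(1)  # arbitrary seed, only to make the sketch repeatable
rf <- randomForest(x = x, y = y, replace = FALSE, sampsize = sampsize)
rf
```

When sampsize has one element per class, randomForest draws that many cases from each class for every tree, so the two classes contribute equally despite their different frequencies.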



rfPermute documentation built on Aug. 24, 2023, 1:08 a.m.