RUS: Random Under-Sampling Algorithm


RUS R Documentation

Random Under-Sampling Algorithm

Description

Returns a balanced dataset by applying the random under-sampling (RUS) algorithm.

Usage

RUS(data, outcome, perc_min = 100)

Arguments

data

A dataset containing the predictors and the outcome. The predictors can be continuous (numeric or integer) or categorical (character or factor). The outcome must be binary.

outcome

The column number or the name of the outcome variable in the dataset.

perc_min

The desired size of the majority class in the new dataset, expressed as a percentage of the size of the minority class. For example, 200 keeps twice as many majority samples as there are minority samples. The default is 100.

Details

The random under-sampling algorithm draws samples from the majority class at random, without replacement, in a number determined by the size of the minority class, in order to obtain a more balanced dataset. All minority samples are kept.
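The procedure described above can be sketched in a few lines of base R. This is only an illustration under stated assumptions, not the RSBID implementation: the helper name rus_sketch and the toy data are hypothetical, and capping the target at the number of available majority rows is an assumed behaviour.

```r
# Hypothetical helper (not the RSBID source): base-R sketch of random under-sampling.
rus_sketch <- function(data, outcome, perc_min = 100) {
  y <- data[[outcome]]                      # works with a column name or number
  counts <- table(y)
  stopifnot(length(counts) == 2)            # the outcome must be binary
  min_label <- names(counts)[which.min(counts)]
  min_idx <- which(y == min_label)          # rows of the minority class
  maj_idx <- which(y != min_label)          # rows of the majority class
  # Target majority size as a percentage of the minority size,
  # capped at the majority rows available (assumption).
  n_keep <- min(length(maj_idx), round(length(min_idx) * perc_min / 100))
  # Draw majority rows without replacement; keep every minority row.
  keep_maj <- maj_idx[sample.int(length(maj_idx), n_keep)]
  data[sort(c(min_idx, keep_maj)), , drop = FALSE]
}

# Toy data: 10 minority ("pos") and 90 majority ("neg") rows.
set.seed(1)
toy <- data.frame(x = rnorm(100),
                  Class = rep(c("pos", "neg"), times = c(10, 90)))
table(rus_sketch(toy, "Class")$Class)
table(rus_sketch(toy, "Class", perc_min = 200)$Class)
```

With the toy data, the first call keeps 10 majority rows next to the 10 minority rows, and the second keeps 20, since perc_min = 200 asks for twice the minority size.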

Value

A new, balanced dataset.

Examples

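library(RSBID)

# Class distribution of the original data
data(abalone)
table(abalone$Class)

# Default perc_min = 100: majority class reduced to the size of the minority class
newdata1 <- RUS(abalone, 'Class')
table(newdata1$Class)

# perc_min = 200: majority class kept at twice the size of the minority class
newdata2 <- RUS(abalone, 'Class', perc_min = 200)
table(newdata2$Class)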

dongyuanwu/RSBID documentation built on May 20, 2024, 7:53 a.m.