RUS: Random Under-Sampling Algorithm
In dongyuanwu/RSBID: Resampling Strategies for Binary Imbalanced Datasets

RUS	R Documentation

Random Under-Sampling Algorithm

Description

Returns a balanced dataset produced by the random under-sampling (RUS) algorithm.

Usage

RUS(data, outcome, perc_min = 100)

Arguments

`data`	A dataset containing the predictors and the outcome. The predictors can be continuous (`numeric` or `integer`) or categorical (`character` or `factor`). The outcome must be binary.
`outcome`	The column number or the name of the outcome variable in the dataset.
`perc_min`	The desired size of the majority class in the new dataset, expressed as a percentage of the minority class size. The default is 100, which gives the two classes equal sizes.

Details

The random under-sampling algorithm randomly selects majority-class samples without replacement. The number selected is determined by the sample size of the minority class, so the result is a more balanced dataset.
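
The idea can be illustrated with a minimal base R sketch. This is not the RSBID implementation: `rus_sketch` and the `toy` data frame are hypothetical names introduced here for illustration. The sketch assumes that every minority-class row is kept and that the number of majority rows retained is capped at the number available.

```r
# Hypothetical helper for illustration only; not the RSBID source code.
rus_sketch <- function(data, outcome, perc_min = 100) {
  # Accept either a column name or a column number for the outcome
  y <- as.character(data[[outcome]])
  counts <- table(y)
  stopifnot(length(counts) == 2)  # the outcome must be binary

  # Identify the two classes (setdiff handles the case of tied counts)
  minority <- names(counts)[which.min(counts)]
  majority <- setdiff(names(counts), minority)

  # Target majority size: perc_min percent of the minority size,
  # capped at the number of majority rows actually available
  n_keep <- min(counts[[majority]],
                round(counts[[minority]] * perc_min / 100))

  min_idx <- which(y == minority)
  maj_idx <- which(y == majority)

  # Draw majority rows without replacement (the default for sample.int)
  keep_maj <- maj_idx[sample.int(length(maj_idx), n_keep)]

  # Keep all minority rows plus the sampled majority rows, in original order
  data[sort(c(min_idx, keep_maj)), , drop = FALSE]
}

# Toy data: 90 "negative" rows and 10 "positive" rows
set.seed(1)
toy <- data.frame(
  x = rnorm(100),
  Class = rep(c("negative", "positive"), times = c(90, 10))
)

balanced <- rus_sketch(toy, "Class")
table(balanced$Class)

doubled <- rus_sketch(toy, "Class", perc_min = 200)
table(doubled$Class)
```

With this toy data, the first call keeps 10 rows of each class, and the second keeps 20 majority rows alongside the 10 minority rows.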

Value

A new, balanced dataset.

Examples

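library(RSBID)

# Load the example data and inspect the original class distribution
data(abalone)
table(abalone$Class)

# Default perc_min = 100: majority class reduced to the minority class size
newdata1 <- RUS(abalone, 'Class')
table(newdata1$Class)

# perc_min = 200: majority class reduced to twice the minority class size
newdata2 <- RUS(abalone, 'Class', perc_min = 200)
table(newdata2$Class)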

dongyuanwu/RSBID documentation built on May 20, 2024, 7:53 a.m.