ROS: Random Over-Sampling Algorithm


ROS    R Documentation

Random Over-Sampling Algorithm

Description

Returns a balanced dataset produced by the random over-sampling (ROS) algorithm.

Usage

ROS(data, outcome, perc_maj = 100)

Arguments

data

A dataset containing the predictors and the outcome. The predictors can be continuous (numeric or integer) or categorical (character or factor). The outcome must be binary.

outcome

The column number or the name of the outcome variable in the dataset.

perc_maj

The desired size of the minority class in the new dataset, expressed as a percentage of the size of the majority class. The default is 100, which brings the minority class up to the same size as the majority class.

Details

The random over-sampling algorithm generates new samples by randomly sampling the minority class with replacement, up to a size determined by the sample size of the majority class, in order to obtain a more balanced dataset.
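
As a rough illustration of the idea, the following base-R sketch resamples minority rows with replacement until the minority class reaches perc_maj percent of the majority size. It is not the package's source: the helper name ros_sketch, the toy data, and the use of round() for the target size are assumptions made for this example only.

```r
# Hypothetical helper for illustration; NOT the RSBID implementation.
ros_sketch <- function(data, outcome, perc_maj = 100) {
  y <- data[[outcome]]                 # works with a column name or number
  counts <- table(y)
  stopifnot(length(counts) == 2)       # the outcome must be binary

  min_label <- names(counts)[which.min(counts)]
  n_min <- min(counts)
  n_maj <- max(counts)

  # Target minority size as a percentage of the majority size.
  # Rounding with round() is an assumption of this sketch.
  target <- round(n_maj * perc_maj / 100)
  n_extra <- max(target - n_min, 0)

  min_rows <- which(y == min_label)
  # Draw positions within min_rows so a one-row minority is handled correctly.
  extra <- min_rows[sample.int(length(min_rows), n_extra, replace = TRUE)]

  # Keep every original row and append the resampled minority rows.
  out <- data[c(seq_len(nrow(data)), extra), , drop = FALSE]
  rownames(out) <- NULL
  out
}

set.seed(1)
toy <- data.frame(x = rnorm(12), Class = rep(c("neg", "pos"), c(10, 2)))
table(ros_sketch(toy, "Class")$Class)                 # 10 neg, 10 pos
table(ros_sketch(toy, "Class", perc_maj = 50)$Class)  # 10 neg, 5 pos
```

In this sketch the original rows are left untouched and only duplicated minority rows are appended, so the majority class keeps its original size; whether ROS() orders or rounds its output the same way is not stated here.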

Value

A new dataset that has been balanced.

Examples

data(abalone)
table(abalone$Class)

newdata1 <- ROS(abalone, 'Class')
table(newdata1$Class)

newdata2 <- ROS(abalone, 'Class', perc_maj=50)
table(newdata2$Class)

dongyuanwu/RSBID documentation built on May 20, 2024, 7:53 a.m.