ROS: The Random Over-Sampling algorithm.


Description

ROS returns a more balanced version of a data set after application of the Random Over-Sampling algorithm.

Usage

ROS(data, perc_min = 50, perc_over = NULL, classes = NULL)

Arguments

data

A data frame containing the predictors and the outcome. The outcome must be a binary-valued factor and must be the last column of data.

perc_min

The desired size of the minority class, as a percentage of the whole returned data set. For instance, if perc_min = 50 the returned data set is balanced. perc_min is ignored if perc_over is specified.

perc_over

The number of examples to append to the input data set, as a percentage of the size of the minority class. For instance, if perc_over = 100 the minority class doubles in size. If specified, perc_min is ignored.

classes

A named vector identifying the majority and the minority classes. The names must be "Majority" and "Minority". This argument is only useful if the function is called inside another sampling function.
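
To make the relationship between perc_min and perc_over concrete, the sketch below computes how many minority examples would be appended in each case. The helper n_to_append and the class counts (20 minority, 100 majority) are hypothetical and not part of bimba. The rounding is an assumption, so the package may differ by an example or so, and perc_min is assumed to be below 100.

```r
# Hypothetical helper for illustration; not a bimba function.
# Returns the number of minority examples to append.
n_to_append <- function(n_min, n_maj, perc_min = 50, perc_over = NULL) {
  if (!is.null(perc_over)) {
    # perc_over takes precedence over perc_min, as documented above
    return(round(n_min * perc_over / 100))
  }
  # Solve (n_min + k) / (n_min + n_maj + k) = perc_min / 100 for k
  k <- (perc_min * n_maj - (100 - perc_min) * n_min) / (100 - perc_min)
  max(0, round(k))  # never remove examples (assumed behaviour)
}

n_to_append(n_min = 20, n_maj = 100, perc_min = 50)    # 80: classes end up 100 vs 100
n_to_append(n_min = 20, n_maj = 100, perc_over = 100)  # 20: minority grows from 20 to 40
```

With perc_min the target is a share of the final data set, so the number of appended examples depends on both class sizes. With perc_over it depends on the minority class size alone.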

Details

The Random Over-Sampling algorithm works by appending randomly selected examples from the minority class (with replacement) to the original data set.
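
The procedure described above can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not bimba's source code: the function ros_sketch, its arguments n_new and minority_label, and the toy data frame are all hypothetical.

```r
# Hypothetical re-implementation for illustration; not bimba's source code.
ros_sketch <- function(data, n_new, minority_label) {
  target <- data[[ncol(data)]]                # outcome is the last column
  min_idx <- which(target == minority_label)  # row indices of the minority class
  # Draw n_new minority rows uniformly at random, with replacement
  extra <- min_idx[sample.int(length(min_idx), n_new, replace = TRUE)]
  # Append the sampled rows to the untouched original data
  new_data <- rbind(data, data[extra, , drop = FALSE])
  rownames(new_data) <- NULL
  new_data
}

set.seed(1)
toy <- data.frame(x = rnorm(12),
                  target = factor(rep(c("maj", "min"), times = c(10, 2))))
table(ros_sketch(toy, n_new = 8, minority_label = "min")$target)  # maj: 10, min: 10
```

Because sampling is with replacement, the appended rows are exact duplicates of existing minority examples; no new feature values are created.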

Value

A data frame containing a more balanced version of the input data set after application of the Random Over-Sampling algorithm.

Examples

imb_data <- generate_imbalanced_data(num_examples = 200, 
                                     num_features = 2,
                                     imbalance_ratio = 5,
                                     noise_maj = 0,
                                     noise_min = 0,
                                     seed = 42)
 
table(imb_data$target)
table(ROS(imb_data, perc_min = 50)$target)    # Balance the classes
table(ROS(imb_data, perc_over = 100)$target)  # Double minority class size

RomeroBarata/bimba documentation built on May 17, 2019, 8:03 a.m.