smote_and_undersample: SMOTE oversampling and undersampling


Description

Oversamples the minority class with SMOTE and undersamples the majority class.

Usage

smote_and_undersample(data, y, fp = 1, ratio = 1, k = 5)

Arguments

data

a data frame or matrix. Rows: examples; columns: features

y

a factor with the class labels: 0 for the majority class, 1 for the minority class.

fp

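multiplicative factor for the SMOTE oversampling of the minority class: fp*n synthetic examples are generated, where n is the number of minority class examples. If fp < 1, no oversampling is performed (def. 1)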

ratio

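ratio #majority/#minority between the number of majority class examples to be sampled and the number of minority class examples after oversampling (def. 1)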

k

number of nearest neighbours used for SMOTE oversampling (def. 5)
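To illustrate the role of k, here is a minimal base-R sketch of how standard SMOTE builds one synthetic example: it interpolates between a minority example and one of its k nearest minority neighbours, chosen at random. The helper smote_one and the toy matrix minority are hypothetical names introduced for illustration only. They are not part of hyperSMURF, and the package's own implementation may differ in its details.

```r
# Hypothetical helper (not part of hyperSMURF): builds ONE synthetic example
# from minority example i by interpolating towards a random near neighbour.
smote_one <- function(X, i, k = 5) {
  X <- as.matrix(X)
  # Euclidean distances from example i to every minority example
  diffs <- X - matrix(X[i, ], nrow(X), ncol(X), byrow = TRUE)
  d <- sqrt(rowSums(diffs^2))
  d[i] <- Inf                                   # exclude the example itself
  nn <- order(d)[seq_len(min(k, nrow(X) - 1))]  # indices of the k nearest neighbours
  j <- nn[sample.int(length(nn), 1)]            # pick one neighbour at random
  u <- runif(1)                                 # interpolation weight in [0, 1]
  X[i, ] + u * (X[j, ] - X[i, ])                # point on the segment between i and j
}

set.seed(1)
minority <- matrix(rnorm(20 * 3), nrow = 20)    # 20 toy minority examples, 3 features
new_point <- smote_one(minority, i = 1, k = 5)
```

A larger k lets the synthetic example be drawn towards more distant neighbours, which spreads the new examples over a wider region of the minority class.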

Details

If n is the number of examples of the minority class, then fp*n new synthetic examples are generated according to the SMOTE algorithm, and ratio*(fp*n + n) examples are undersampled from the majority class.
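As a worked example of these formulas, the following base-R arithmetic uses the values from the Examples section (20 minority examples, fp = 2, ratio = 3). The variable names are illustrative only. It assumes the majority class holds at least the requested number of examples (1000 in that example).

```r
n     <- 20   # minority class examples (n.pos in the Examples section)
fp    <- 2    # oversampling factor
ratio <- 3    # majority/minority ratio

n_synthetic <- fp * n                    # 40 synthetic examples from SMOTE
n_minority  <- n + n_synthetic           # 60 minority examples after oversampling
n_majority  <- ratio * n_minority        # 180 examples sampled from the majority class
n_total     <- n_minority + n_majority   # 240 examples expected overall
```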

Value

A list with two entries:

X

a data frame containing the original minority class examples, the synthetic examples generated by SMOTE, and the examples undersampled from the majority class

Y

a factor with the labels of the data frame

See Also

smote

Examples

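# Generate an imbalanced data set: 20 positive and 1000 negative examples
d <- imbalanced.data.generator(n.pos = 20, n.neg = 1000, n.features = 12, n.inf.features = 2, sd = 1, seed = 1)
# Oversample the minority class (fp = 2) and undersample the majority class (ratio = 3)
res <- smote_and_undersample(d$data, d$label, fp = 2, ratio = 3)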

hyperSMURF documentation built on May 2, 2019, 9:20 a.m.