smote: Synthetic minority oversampling (SMOTE)

View source: R/F_smote_oversampling.R

Description

Performs oversampling of smaller classes by creating new synthetic instances.

Usage

smote(
  Variables,
  Classes,
  subset_use = NULL,
  k = 5,
  use_nearest = TRUE,
  proportions = 0.9,
  equalise_with_undersampling = FALSE,
  safe = FALSE
)

Arguments

Variables

the data.frame of independent variables that should be used to create new instances

Classes

the class labels in the prediction problem

subset_use

restricts the oversampling to a specific subset of the data. If NULL, all instances are used.

k

the number of neighbours used to generate new instances

use_nearest

should only the nearest neighbours be used? Note that this option is very slow.

proportions

the proportion of the size of the biggest class to which the other classes should be equalised

equalise_with_undersampling

should additional undersampling be performed?

safe

should a safe version of SMOTE be used?

Details

SMOTE generates synthetic data points for a smaller class, for example to overcome the problem of imbalanced classes in classification.
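
As an illustration of the general idea, the following is a minimal base-R sketch of SMOTE-style interpolation. It is not the implementation used by this package: the function name smote_sketch and its arguments X and n_new are hypothetical, and it assumes numeric variables and Euclidean distance. Each synthetic point is placed at a random position on the line segment between a randomly chosen minority instance and one of its k nearest neighbours.

```r
# Illustrative sketch only; smote_sketch() is a hypothetical helper,
# not part of SmartMeterAnalytics.
smote_sketch <- function(X, n_new, k = 5) {
  X <- as.matrix(X)                       # assumes all columns are numeric
  n <- nrow(X)
  stopifnot(n >= 2, k >= 1)
  k <- min(k, n - 1)                      # at most n - 1 other points exist
  D <- as.matrix(dist(X))                 # pairwise Euclidean distances
  diag(D) <- Inf                          # a point is not its own neighbour
  synthetic <- matrix(NA_real_, nrow = n_new, ncol = ncol(X),
                      dimnames = list(NULL, colnames(X)))
  for (i in seq_len(n_new)) {
    base <- sample.int(n, 1)              # random instance of the small class
    nn <- order(D[base, ])[seq_len(k)]    # indices of its k nearest neighbours
    nb <- nn[sample.int(length(nn), 1)]   # pick one of those neighbours
    gap <- runif(1)                       # random position along the segment
    synthetic[i, ] <- X[base, ] + gap * (X[nb, ] - X[base, ])
  }
  as.data.frame(synthetic)
}

# Usage: create 20 synthetic points from 10 instances of a small class.
set.seed(1)
minority <- data.frame(x = rnorm(10), y = rnorm(10))
new_points <- smote_sketch(minority, n_new = 20, k = 5)
```

Because every synthetic point is a convex combination of two existing instances, the new points always stay within the range spanned by the original instances of that class.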

Value

a list containing the new data.frame of independent variables and the new class labels

Author(s)

Ilya Kozlovskiy, Konstantin Hopf <konstantin.hopf@uni-bamberg.de>


SmartMeterAnalytics documentation built on Aug. 18, 2020, 5:07 p.m.