SMOTEWB: SMOTE with boosting (SMOTEWB)

View source: R/SMOTEWB.R

SMOTEWB    R Documentation

SMOTE with boosting (SMOTEWB)

Description

Resampling with SMOTE with boosting.

Usage

SMOTEWB(
  x,
  y,
  n_weak_classifier = 100,
  class_weights = NULL,
  k_max = NULL,
  n_needed = NULL,
  ...
)

Arguments

x

feature matrix.

y

a factor class variable with two classes.

n_weak_classifier

number of weak classifiers for boosting.

class_weights

numeric vector of length two. The first number is for the positive class and the second is for the negative class. The higher the relative weight of a class, the fewer of its samples are treated as noise. By default, 2 * n_neg / n for the positive class and 2 * n_pos / n for the negative class.

k_max

maximum number of neighbors; set it to raise the upper limit. Default is ceiling(n_neg/n_pos).

n_needed

integer vector giving the desired number of synthetic samples for each class. Default is NULL, meaning full balance.

...

additional inputs for ada::ada().
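The documented defaults can be worked out by hand for a given class distribution. The following base R sketch does so for the class sizes used in the Examples section. It assumes that n is the total number of samples and that "full balance" means generating enough positive samples to match the negative class size; the variable names are illustrative and not part of the package.

```r
# Toy labels with the same class sizes as the Examples section.
y <- factor(c(rep("negative", 1000), rep("positive", 50)))

n     <- length(y)             # assumption: n is the total sample size
n_pos <- sum(y == "positive")  # positive (minority) class size
n_neg <- sum(y == "negative")  # negative (majority) class size

# Documented default class weights: positive first, then negative.
class_weights_default <- c(2 * n_neg / n, 2 * n_pos / n)

# Documented default for k_max.
k_max_default <- ceiling(n_neg / n_pos)

# Assumption: full balance tops the positive class up to n_neg samples.
n_syn_full_balance <- n_neg - n_pos

print(class_weights_default)
print(k_max_default)
print(n_syn_full_balance)
```

With these defaults the rarer class receives the larger weight, and the two weights sum to 2.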

Details

SMOTEWB (Saglam & Cengiz, 2022) is a SMOTE-based oversampling method which can handle noisy data and adaptively decides the appropriate number of neighbors to link during resampling with SMOTE.

Models trained on data resampled with this method are reported to give significantly better Matthews correlation coefficient scores than models trained with other resampling methods.

Can work with more than two classes.
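For reference, the Matthews correlation coefficient mentioned above can be computed from a binary confusion matrix. The helper below is a minimal base R sketch; the function name mcc and its arguments are hypothetical and not part of the SMOTEWB package.

```r
# Matthews correlation coefficient for binary labels.
# `actual` and `predicted` are character vectors; `positive` names the
# positive class. Hypothetical helper, not part of SMOTEWB.
mcc <- function(actual, predicted, positive) {
  # Counts are converted to numeric to avoid integer overflow in products.
  tp <- as.numeric(sum(actual == positive & predicted == positive))
  tn <- as.numeric(sum(actual != positive & predicted != positive))
  fp <- as.numeric(sum(actual != positive & predicted == positive))
  fn <- as.numeric(sum(actual == positive & predicted != positive))

  denom <- sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
  # Common convention: return 0 when any marginal count is zero.
  if (denom == 0) return(0)
  (tp * tn - fp * fn) / denom
}

actual    <- c("pos", "pos", "pos", "neg", "neg", "neg", "neg", "neg")
predicted <- c("pos", "pos", "neg", "pos", "neg", "neg", "neg", "neg")

print(mcc(actual, predicted, positive = "pos"))
```

The score ranges from -1 to 1 and uses all four cells of the confusion matrix, which is why it is often preferred over accuracy on imbalanced data.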

Value

a list with the resampled dataset and related outputs.

x_new

Resampled feature matrix.

y_new

Resampled target variable.

x_syn

Generated synthetic data.

y_syn

Generated synthetic data labels.

w

Boosting weights for original dataset.

k

Number of nearest neighbors for positive class samples.

C

Number of synthetic samples generated for each positive class sample.

fl

Sample labels: "good", "bad" or "lonely".
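The components listed above can be inspected directly on the returned list. The sketch below reuses the data from the Examples section and only runs when the SMOTEWB package is installed. It assumes the list elements carry exactly the names documented here, and the exact shapes of w and fl are not asserted.

```r
# Runs only if the package is available; element names follow the
# Value list above.
if (requireNamespace("SMOTEWB", quietly = TRUE)) {
  set.seed(1)
  x <- rbind(matrix(rnorm(2000, 3, 1), ncol = 2),
             matrix(rnorm(100, 5, 1), ncol = 2))
  y <- as.factor(c(rep("negative", 1000), rep("positive", 50)))

  m <- SMOTEWB::SMOTEWB(x = x, y = y, n_weak_classifier = 150)

  print(names(m))        # components of the returned list
  print(table(m$y_new))  # class counts after resampling
  print(dim(m$x_syn))    # size of the synthetic feature matrix
  print(table(m$fl))     # "good", "bad" and "lonely" label counts
  print(summary(m$w))    # distribution of boosting weights
}
```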

Author(s)

Fatih Saglam, saglamf89@gmail.com

References

Sağlam, F., & Cengiz, M. A. (2022). A novel SMOTE-based resampling technique trough noise detection and the boosting procedure. Expert Systems with Applications, 200, 117023.

Examples


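library(SMOTEWB)

set.seed(1)
# imbalanced two-class data: 1000 negative and 50 positive samples
x <- rbind(matrix(rnorm(2000, 3, 1), ncol = 2, nrow = 1000),
           matrix(rnorm(100, 5, 1), ncol = 2, nrow = 50))
y <- as.factor(c(rep("negative", 1000), rep("positive", 50)))

# original data, colored by class
plot(x, col = y)

# resampling
m <- SMOTEWB(x = x, y = y, n_weak_classifier = 150)

# resampled data, colored by class
plot(m$x_new, col = m$y_new)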



SMOTEWB documentation built on June 8, 2025, 11:57 a.m.
