SMOTEWB | R Documentation |
Resampling with SMOTE with boosting.
SMOTEWB(
x,
y,
n_weak_classifier = 100,
class_weights = NULL,
k_max = NULL,
n_needed = NULL,
...
)
x |
feature matrix. |
y |
a factor class variable with two classes. |
n_weak_classifier |
number of weak classifiers for boosting. |
class_weights |
numeric vector of length two. The first number is for the
positive class and the second is for the negative class. The higher the relative weight of a class,
the fewer of its samples are treated as noise. By default, |
k_max |
to increase the maximum number of neighbors. Default is
|
n_needed |
integer vector giving the desired number of synthetic samples for each class. Default is NULL, meaning full balance. |
... |
additional inputs for ada::ada(). |
SMOTEWB (Saglam & Cengiz, 2022) is a SMOTE-based oversampling method which can handle noisy data and adaptively decides the appropriate number of neighbors to link during resampling with SMOTE.
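Sağlam & Cengiz (2022) report that models trained on data resampled with this method achieve significantly better Matthews correlation coefficient (MCC) scores than models trained with the other resampling methods they compared.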
Can work with more than two classes.
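The Matthews correlation coefficient mentioned above is computed from the four cells of a binary confusion matrix. The helper below is a minimal base R sketch for illustration only. The name `mcc()` and its arguments are hypothetical and are not part of this package.

```r
# Hypothetical helper (not part of SMOTEWB): Matthews correlation coefficient
# for a binary problem, computed from true and predicted labels.
mcc <- function(actual, predicted, positive = "positive") {
  actual <- as.character(actual)
  predicted <- as.character(predicted)
  # Confusion matrix cells, converted to numeric to avoid integer overflow
  tp <- as.numeric(sum(actual == positive & predicted == positive))
  tn <- as.numeric(sum(actual != positive & predicted != positive))
  fp <- as.numeric(sum(actual != positive & predicted == positive))
  fn <- as.numeric(sum(actual == positive & predicted != positive))
  denom <- sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
  # Common convention: return 0 when any marginal total is zero
  if (denom == 0) return(0)
  (tp * tn - fp * fn) / denom
}

actual    <- c("positive", "positive", "negative", "negative", "negative", "negative")
predicted <- c("positive", "negative", "negative", "negative", "negative", "positive")
mcc(actual, predicted)
```

MCC ranges from -1 to 1 and uses all four cells of the confusion matrix, which is why it is often preferred over accuracy on imbalanced data.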
A list containing the resampled dataset and related outputs.
x_new |
Resampled feature matrix. |
y_new |
Resampled target variable. |
x_syn |
Generated synthetic data. |
y_syn |
Generated synthetic data labels. |
w |
Boosting weights for original dataset. |
k |
Number of nearest neighbors for positive class samples. |
C |
Number of synthetic samples generated for each positive class sample. |
fl |
"good", "bad" and "lonely" sample labels. |
Fatih Saglam, saglamf89@gmail.com
Sağlam, F., & Cengiz, M. A. (2022). A novel SMOTE-based resampling technique trough noise detection and the boosting procedure. Expert Systems with Applications, 200, 117023.
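set.seed(1)
# 1000 negative samples centered at 3 and 50 positive samples centered at 5
x <- rbind(matrix(rnorm(2000, 3, 1), ncol = 2, nrow = 1000),
           matrix(rnorm(100, 5, 1), ncol = 2, nrow = 50))
y <- as.factor(c(rep("negative", 1000), rep("positive", 50)))
# original, imbalanced data
plot(x, col = y)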
# resampling
m <- SMOTEWB(x = x, y = y, n_weak_classifier = 150)
plot(m$x_new, col = m$y_new)
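The returned list can also be inspected beyond the plot. The sketch below assumes the SMOTEWB package is installed and uses only the element names documented in the Value section. The exact shape of each element may differ between package versions, so `str()` is used rather than assuming a layout.

```r
library(SMOTEWB)

set.seed(1)
x <- rbind(matrix(rnorm(2000, 3, 1), ncol = 2, nrow = 1000),
           matrix(rnorm(100, 5, 1), ncol = 2, nrow = 50))
y <- as.factor(c(rep("negative", 1000), rep("positive", 50)))

m <- SMOTEWB(x = x, y = y, n_weak_classifier = 150)

# overview of the returned elements (x_new, y_new, x_syn, y_syn, w, k, C, fl)
str(m, max.level = 1)

# class counts before and after resampling
table(y)
table(m$y_new)

# overlay the synthetic points on the resampled data
plot(m$x_new, col = m$y_new)
points(m$x_syn, pch = 4)
```

Comparing `table(y)` with `table(m$y_new)` is a quick way to check whether the default `n_needed = NULL` produced the balance you expected.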