ADASYN over-samples the input data using the Adaptive Synthetic
Sampling algorithm.
data | A data frame containing the predictors and the outcome. The predictors must be numeric and the outcome must be both a binary valued factor and the last column of |
perc_min | The desired % size of the minority class relative to the whole data set. For instance, if |
perc_over | % of examples to append to the input data set relative to the size of the minority class. For instance, if |
k | Number of nearest neighbours to compute for each example in the minority class. |
classes | A named vector identifying the majority and the minority classes. The names must be "Majority" and "Minority". This argument is only useful if the function is called inside another sampling function. |
ADASYN is an adaptation of the SMOTE algorithm that focuses on
synthesising more examples for the minority examples that are considered
"hard" to learn. The learning hardness of a minority example is defined as
being proportional to the number of majority examples among its k nearest
neighbours. There are two cases in which no examples are synthesised for a
minority example. The first is when all k nearest neighbours belong to the
majority class, so the minority example is considered to be noise. The
second is when all k nearest neighbours belong to the minority class, so
the minority example is considered too easy to learn (learning hardness = 0).
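The hardness rule described above can be illustrated with a small standalone sketch. It is written in Python for illustration only and is not the package's implementation: the function names `learning_hardness` and `is_used_for_synthesis` are hypothetical, Euclidean distance is assumed, neighbours are assumed to be searched across both classes, and the proportionality constant is taken to be 1/k (as in the original ADASYN paper).

```python
from math import dist  # Euclidean distance, Python 3.8+


def learning_hardness(minority, majority, k):
    """Share of majority points among the k nearest neighbours of each minority point."""
    # Pool both classes; minority points come first, so index i identifies x itself.
    pool = [(p, False) for p in minority] + [(p, True) for p in majority]
    hardness = []
    for i, x in enumerate(minority):
        # Distances from x to every other point, nearest first.
        others = sorted(
            ((dist(x, p), is_major) for j, (p, is_major) in enumerate(pool) if j != i),
            key=lambda t: t[0],
        )
        n_major = sum(is_major for _, is_major in others[:k])
        hardness.append(n_major / k)  # assumed constant of proportionality: 1/k
    return hardness


def is_used_for_synthesis(h):
    # h == 1: every neighbour is majority (treated as noise).
    # h == 0: every neighbour is minority (too easy to learn).
    return 0 < h < 1


# Toy data, chosen so that each case occurs at least once with k = 2.
minority = [(0, 0), (0, 1), (1, 0), (10, 10)]
majority = [(0, 2), (10, 11), (11, 10), (20, 20), (21, 20), (20, 21)]

hardness = learning_hardness(minority, majority, k=2)
print(hardness)                                      # [0.0, 0.5, 0.0, 1.0]
print([is_used_for_synthesis(h) for h in hardness])  # [False, True, False, False]
```

In this toy data only the point (0, 1) has a mixed neighbourhood, so it is the only minority example from which new examples would be synthesised.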
Compared to ADASYN's original description, the current implementation
differs in two ways. Firstly, the d_{th} parameter was dropped. Secondly,
the β parameter was replaced by the perc_min and perc_over parameters.
This modification allows the user to synthesise as many examples as
wanted; β = 1 is equivalent to perc_min = 50 (a balanced distribution of
examples).
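The arithmetic implied by the two parameters can be sketched as follows. This is an illustration in Python, not the package's code: the helper names are hypothetical, perc_min is assumed to refer to the data set after over-sampling (which is what makes perc_min = 50 a balanced result), and rounding to the nearest integer is an assumption.

```python
def n_synthetic_from_perc_min(n_min, n_maj, perc_min):
    """Synthetic examples needed so the minority class makes up perc_min % of the result."""
    if not 0 < perc_min < 100:
        raise ValueError("perc_min must lie strictly between 0 and 100")
    p = perc_min / 100
    # Solve (n_min + n_new) / (n_min + n_maj + n_new) = p for n_new.
    n_new = (p * (n_min + n_maj) - n_min) / (1 - p)
    return max(0, round(n_new))  # nothing to add if the target is already met


def n_synthetic_from_perc_over(n_min, perc_over):
    """Synthetic examples to append, as a percentage of the minority class size."""
    return round(n_min * perc_over / 100)


# 20 minority and 80 majority examples.
print(n_synthetic_from_perc_min(20, 80, 50))  # 60, i.e. n_maj - n_min (the β = 1 case)
print(n_synthetic_from_perc_min(20, 80, 40))  # 33
print(n_synthetic_from_perc_over(20, 100))    # 20
```

With perc_min = 50 the count reduces to the difference between the class sizes, which matches the stated equivalence with β = 1.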
A data frame containing a more balanced version of the input data set after over-sampling it with ADASYN.
He, H., Bai, Y., Garcia, E. A., & Li, S. (2008). ADASYN: Adaptive synthetic sampling approach for imbalanced learning. In 2008 IEEE International Joint Conference on Neural Networks (IEEE World Congress on Computational Intelligence) (pp. 1322-1328). IEEE.