Borderline-SMOTE | R Documentation |
Generates synthetic positive (minority-class) instances using the Borderline-SMOTE algorithm. The number of majority-class neighbors of each minority instance is used to divide the minority instances into three groups: SAFE, DANGER, and NOISE. Only the DANGER instances are used to generate synthetic instances.
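The grouping step described above can be sketched in base R. This is an illustration only, not the package's code: classify_borderline() is a hypothetical helper, and the thresholds (all C neighbors in the majority class means NOISE, at least half means DANGER, otherwise SAFE) follow the rule in the cited paper and are assumed rather than taken from the BLSMOTE source.

```r
# Hypothetical helper (not part of smotefamily): label each minority instance
# by the number of majority-class points among its C nearest neighbors.
classify_borderline <- function(X, target, minority, C = 5) {
  X <- as.matrix(X)
  d <- as.matrix(dist(X))   # pairwise Euclidean distances
  diag(d) <- Inf            # an instance is not its own neighbor
  min_idx <- which(target == minority)
  labels <- vapply(min_idx, function(i) {
    nn <- order(d[i, ])[seq_len(C)]        # indices of the C nearest neighbors
    n_maj <- sum(target[nn] != minority)   # majority-class neighbors among them
    # Assumed thresholds, following the rule in Han et al. (2005)
    if (n_maj == C) {
      "NOISE"
    } else if (n_maj >= C / 2) {
      "DANGER"
    } else {
      "SAFE"
    }
  }, character(1))
  data.frame(index = min_idx, label = labels)
}

# Toy data: "n" is the majority class, "p" the minority class
X <- rbind(
  c(0, 0), c(0, 1), c(1, 0), c(1, 1),         # majority cluster
  c(-1, 0), c(0, -1), c(-1, -1),              # more majority points
  c(5, 4.5), c(4.5, 5),                       # majority points near the border
  c(0.5, 0.5),                                # minority inside the majority cluster
  c(5, 5), c(5.5, 5.5),                       # minority on the border
  c(10, 10), c(10, 11), c(11, 10), c(11, 11)  # isolated minority cluster
)
target <- c(rep("n", 9), rep("p", 7))
res <- classify_borderline(X, target, minority = "p", C = 3)
print(res)
```

On this toy data the point inside the majority cluster is labeled NOISE, the two border points DANGER, and the four points of the isolated cluster SAFE. Only the DANGER rows would go on to seed synthetic instances.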
BLSMOTE(X,target,K=5,C=5,dupSize=0,method =c("type1","type2"))
X |
A data frame or matrix holding a dataset with numeric attributes |
target |
A vector of target class labels corresponding to the dataset X |
K |
The number of nearest neighbors used during the sampling process |
C |
The number of nearest neighbors used when determining whether a minority instance is SAFE, DANGER, or NOISE |
dupSize |
A number or vector giving the desired multiple of synthetic minority instances over the original number of majority instances; use 0 to keep generating until the classes are balanced |
method |
Which type of Borderline-SMOTE presented in the paper to use: "type1" or "type2" |
data |
The resulting dataset, consisting of the original minority instances, the synthetic minority instances, and the original majority instances, with their respective target class appended as the last column |
syn_data |
The set of synthetic minority instances, with the minority target class appended as the last column |
orig_N |
The set of original instances whose class is not oversampled, with their target class appended as the last column |
orig_P |
The set of original instances whose class is oversampled, with their target class appended as the last column |
K |
The value of parameter K used in the nearest-neighbor search when generating data |
K_all |
The value of parameter C used in the nearest-neighbor search when determining SAFE/DANGER/NOISE |
dup_size |
The maximum multiple of synthetic minority instances over the original majority instances used in the oversampling |
outcast |
Unavailable for this method |
eps |
Unavailable for this method |
method |
The name of oversampling method and type used for this generated dataset (BLSMOTE type1/2) |
Wacharasak Siriseriwan <wacharasak.s@gmail.com>
Han, H., Wang, W.Y. and Mao, B.H. (2005). Borderline-SMOTE: a new over-sampling method in imbalanced data sets learning. In De-Shuang Huang, Xiao-Ping Zhang, and Guang-Bin Huang (Eds.), Proceedings of the 2005 International Conference on Advances in Intelligent Computing (ICIC'05), Part I, pp. 878-887. Springer-Verlag, Berlin, Heidelberg. DOI: http://dx.doi.org/10.1007/11538059_91
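library(smotefamily)
# Generate an example dataset of 5000 instances; column 3 is used as the class label
data_example = sample_generator(5000,ratio = 0.80)
# Oversample with the default settings (K = 5, C = 5)
genData = BLSMOTE(data_example[,-3],data_example[,3])
genData_2 = BLSMOTE(data_example[,-3],data_example[,3],K=7, C=5, method = "type2")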
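As a follow-up, the returned list can be inspected using the component names documented above (data, syn_data, orig_N, orig_P, K, K_all, method). The snippet below is a minimal sketch that assumes the smotefamily package is installed; the seed and the variable names example_df and gen are illustrative choices, not part of the original example.

```r
library(smotefamily)

set.seed(1)  # illustrative seed so the generated sample is reproducible
example_df <- sample_generator(5000, ratio = 0.80)
gen <- BLSMOTE(example_df[, -3], example_df[, 3])

# Class counts before and after oversampling; the class is the last column
print(table(example_df[, 3]))
print(table(gen$data[, ncol(gen$data)]))

# Sizes of the documented pieces that make up the resulting dataset
n_syn  <- nrow(gen$syn_data)  # synthetic minority instances
n_orig <- nrow(gen$orig_P) + nrow(gen$orig_N)
cat("synthetic:", n_syn, " original:", n_orig, "\n")

# Settings recorded alongside the generated data
print(gen$K)
print(gen$K_all)
print(gen$method)
```

Comparing the two tables shows how far the minority class was brought toward balance under the default dupSize = 0.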