ubBalance | R Documentation |
The function implements several techniques to re-balance or remove noisy instances in unbalanced datasets.
ubBalance(X, Y, type="ubSMOTE", positive=1, percOver=200, percUnder=200, k=5, perc=50, method="percPos", w=NULL, verbose=FALSE)
X |
the input variables of the unbalanced dataset. |
Y |
the response variable of the unbalanced dataset. |
type |
the balancing technique to use (ubOver, ubUnder, ubSMOTE, ubOSS, ubCNN, ubENN, ubNCL, ubTomek). |
positive |
the label of the positive (minority) class of the response variable. |
percOver |
parameter used in ubSMOTE |
percUnder |
parameter used in ubSMOTE |
k |
parameter used in ubOver, ubSMOTE, ubCNN, ubENN, ubNCL |
perc |
parameter used in ubUnder |
method |
parameter used in ubUnder |
w |
parameter used in ubUnder |
verbose |
print extra information (TRUE/FALSE) |
The argument type can take the following values: "ubOver" (over-sampling), "ubUnder" (under-sampling), "ubSMOTE" (SMOTE), "ubOSS" (One Side Selection), "ubCNN" (Condensed Nearest Neighbor), "ubENN" (Edited Nearest Neighbor), "ubNCL" (Neighborhood Cleaning Rule), "ubTomek" (Tomek Link).
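To make the simplest of these options concrete, the idea behind random over-sampling can be sketched in base R as below. This is only an illustration under stated assumptions, not the package's implementation of "ubOver": the helper name random_oversample and the toy data are hypothetical, and the sketch simply replicates minority instances at random until both classes have the same size.

```r
# Hypothetical helper, NOT part of the unbalanced package:
# replicate minority-class rows at random until the classes are balanced.
random_oversample <- function(X, Y, positive = 1) {
  pos_idx <- which(Y == positive)   # minority (positive) instances
  neg_idx <- which(Y != positive)   # majority instances
  n_extra <- length(neg_idx) - length(pos_idx)
  if (n_extra <= 0 || length(pos_idx) == 0) {
    return(list(X = X, Y = Y))      # nothing to do
  }
  # sample.int() avoids the sample(x) pitfall when pos_idx has length 1
  extra <- pos_idx[sample.int(length(pos_idx), n_extra, replace = TRUE)]
  keep <- c(seq_along(Y), extra)
  list(X = X[keep, , drop = FALSE], Y = Y[keep])
}

# Toy data (assumed for illustration): 8 majority vs 2 minority instances
set.seed(1)
X <- data.frame(a = rnorm(10), b = rnorm(10))
Y <- factor(c(rep(0, 8), rep(1, 2)), levels = c(0, 1))

bal <- random_oversample(X, Y)
table(bal$Y)
```

The other techniques (SMOTE, Tomek links, and so on) follow different rules and are not covered by this sketch.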
The function returns a list:
X |
input variables |
Y |
response variable |
id.rm |
indices of the removed instances, if available for the selected technique |
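As a hedged illustration of how the returned list might be inspected, the sketch below runs a cleaning technique on the ubIonosphere data shipped with the package and compares class counts before and after. The variable name res is arbitrary, and the choice of "ubTomek" is only an example of a technique that removes instances; the exact counts depend on the data and are not shown.

```r
library(unbalanced)
data(ubIonosphere)

n <- ncol(ubIonosphere)
output <- ubIonosphere$Class     # response variable (last column)
input <- ubIonosphere[, -n]      # input variables

# Apply Tomek-link cleaning through the common ubBalance interface
res <- ubBalance(X = input, Y = output, type = "ubTomek")

table(output)        # class distribution before
table(res$Y)         # class distribution after
length(res$id.rm)    # number of instances reported as removed
```

For techniques that do not remove instances, id.rm may not be present in the returned list, so it is safer to check for it before use.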
Dal Pozzolo, Andrea, et al. "Racing for unbalanced methods selection." Intelligent Data Engineering and Automated Learning - IDEAL 2013. Springer Berlin Heidelberg, 2013. 24-31.
ubRacing, ubOver, ubUnder, ubSMOTE, ubOSS, ubCNN, ubENN, ubNCL, ubTomek
library(unbalanced)
data(ubIonosphere)
n <- ncol(ubIonosphere)
output <- ubIonosphere$Class
input <- ubIonosphere[, -n]
# balance the dataset
data <- ubBalance(X = input, Y = output, type = "ubSMOTE", percOver = 300, percUnder = 150, verbose = TRUE)
balancedData <- cbind(data$X, data$Y)