The function randomly removes instances from the majority (negative) class and keeps all instances of the minority (positive) class in order to obtain a more balanced dataset. It supports two ways of undersampling: i) setting the percentage of positives wanted after undersampling (percPos method), ii) setting the sampling rate on the negatives (percUnder method). For percPos, "perc" must satisfy (N.1/N * 100) <= perc <= 50, where N.1 is the number of positive instances and N the total number of instances. For percUnder, "perc" must satisfy (N.1/N.0 * 100) <= perc <= 100, where N.1 is the number of positive and N.0 the number of negative instances.
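The bounds on "perc" can be checked by hand. The following base-R sketch works through the arithmetic implied by the description above; the class counts are made-up toy values, and the helper names (`neg_kept_percPos`, `neg_kept_percUnder`) are hypothetical and not part of the package. The rounding with `floor()` is an assumption.

```r
# Toy class counts (hypothetical, not taken from any real dataset)
n_pos <- 100            # N.1: minority (positive) instances
n_neg <- 900            # N.0: majority (negative) instances
n_tot <- n_pos + n_neg  # N: all instances

# Valid ranges for perc, as stated in the description
range_percPos   <- c(n_pos / n_tot * 100, 50)   # 10 to 50
range_percUnder <- c(n_pos / n_neg * 100, 100)  # about 11.1 to 100

# percPos: negatives to keep so that positives are `perc` percent of the result
# (assumed rounding: floor)
neg_kept_percPos <- function(perc) floor(n_pos * (100 - perc) / perc)

# percUnder: negatives to keep when `perc` percent of the negatives are sampled
neg_kept_percUnder <- function(perc) floor(perc / 100 * n_neg)

neg_kept_percPos(40)    # 150 negatives, so positives are 100 / 250 = 40%
neg_kept_percUnder(25)  # 225 negatives
```

At the lower bound of either range no negatives are removed (percPos) or the negatives are reduced to the number of positives (percUnder), which is why values outside the ranges are rejected.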
X | the input variables of the unbalanced dataset.
Y | the response variable of the unbalanced dataset. It must be a binary factor where the majority class is coded as 0 and the minority as 1.
perc | percentage of sampling.
method | method used to perform undersampling ("percPos", "percUnder").
w | weights used for sampling the majority class; if NULL, all majority instances are sampled with equal weights.
The function returns a list:
X | input variables
Y | response variable
id.rm | index of instances removed
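To make the shape of the returned list concrete, here is a minimal base-R sketch of percPos-style undersampling that returns a list with the same three elements. `under_sketch` is a hypothetical helper written for illustration, not the package's implementation; it assumes that `w` holds one weight per majority instance and that `id.rm` indexes rows of the original data.

```r
# Hypothetical helper, not the package's code: percPos-style undersampling
under_sketch <- function(X, Y, perc = 50, w = NULL) {
  i_pos <- which(Y == 1)  # minority (positive) rows, all kept
  i_neg <- which(Y == 0)  # majority (negative) rows, subsampled
  n_keep <- floor(length(i_pos) * (100 - perc) / perc)
  stopifnot(n_keep <= length(i_neg))
  # sample.int() draws without replacement; `prob` applies the optional weights
  keep_neg <- i_neg[sample.int(length(i_neg), n_keep, prob = w)]
  id_keep <- sort(c(i_pos, keep_neg))
  id_rm <- setdiff(seq_along(Y), id_keep)
  list(X = X[id_keep, , drop = FALSE], Y = Y[id_keep], id.rm = id_rm)
}

# Toy data: 4 positives and 16 negatives
set.seed(1)
X <- data.frame(a = rnorm(20), b = runif(20))
Y <- factor(c(rep(1, 4), rep(0, 16)), levels = c(0, 1))

res <- under_sketch(X, Y, perc = 40)
table(res$Y)       # class counts after undersampling
length(res$id.rm)  # number of removed instances
```

In this sketch every index in `id.rm` points at a majority-class row of the original data, so `X[res$id.rm, ]` recovers the discarded instances.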
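library(unbalanced)
data(ubIonosphere)
# the last column (Class) is the response; the remaining columns are the inputs
n <- ncol(ubIonosphere)
output <- ubIonosphere$Class
input <- ubIonosphere[, -n]
# undersample so that positives make up 40% of the resulting dataset
data <- ubUnder(X = input, Y = output, perc = 40, method = "percPos")
newData <- cbind(data$X, data$Y)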
Loading required package: mlr
Loading required package: ParamHelpers
Loading required package: foreach
Loading required package: doParallel
Loading required package: iterators
Loading required package: parallel