ubNCL: Neighborhood Cleaning Rule


Description

The Neighborhood Cleaning Rule (NCL) modifies the Edited Nearest Neighbor (ENN) method by increasing the role of data cleaning. First, NCL removes negative examples that are misclassified by their 3 nearest neighbors. Second, the nearest neighbors of each positive example are found, and those belonging to the majority class are removed.
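The two cleaning steps described above can be illustrated with a minimal base R sketch. It is not the code used inside ubNCL: the function name ncl_sketch and the toy vectors toy_x and toy_y are hypothetical, and the use of Euclidean distance and a simple majority vote among the k neighbors are assumptions made for illustration.

```r
# Illustrative sketch only; not the implementation used by ubNCL.
ncl_sketch <- function(X, Y, k = 3) {
  X <- as.matrix(X)
  y <- as.character(Y)
  n <- nrow(X)

  # Pairwise Euclidean distances (assumed metric); a point is not its own neighbor
  D <- as.matrix(dist(X))
  diag(D) <- Inf

  # Row i of nn holds the indices of the k nearest neighbors of observation i
  nn <- matrix(
    unlist(lapply(seq_len(n), function(i) order(D[i, ])[seq_len(k)])),
    ncol = k, byrow = TRUE
  )

  # Step 1: negatives (class 0) whose neighbors are mostly positive (class 1)
  pos_votes <- rowSums(matrix(y[as.vector(nn)] == "1", nrow = n))
  step1 <- which(y == "0" & pos_votes > k / 2)

  # Step 2: negatives that appear among the neighbors of any positive example
  pos_nn <- as.vector(nn[y == "1", , drop = FALSE])
  step2 <- unique(pos_nn[y[pos_nn] == "0"])

  removed <- sort(union(step1, step2))
  keep <- setdiff(seq_len(n), removed)
  list(
    X = X[keep, , drop = FALSE],
    Y = factor(y[keep], levels = c("0", "1")),
    removed = removed
  )
}

# Hypothetical one-dimensional toy data: majority coded 0, minority coded 1
toy_x <- data.frame(x = c(0, 1, 2, 3, 4, 5, 6.6, 10, 10.4, 10.9, 10.2))
toy_y <- factor(c(0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0), levels = c(0, 1))

res <- ncl_sketch(toy_x, toy_y, k = 3)
res$removed   # indices of the majority-class observations that were dropped
table(res$Y)  # class counts after cleaning
```

Only majority-class observations are ever removed in this sketch. The second step can remove many more instances than the first, because every majority-class neighbor of a positive example is dropped.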

Usage

ubNCL(X, Y, k = 3, verbose = TRUE)

Arguments

X

the input variables of the unbalanced dataset.

Y

the response variable of the unbalanced dataset. It must be a binary factor where the majority class is coded as 0 and the minority as 1.

k

the number of nearest neighbors to use.

verbose

print extra information (TRUE/FALSE)
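The coding required for Y (a binary factor with the majority class as 0 and the minority class as 1) may need to be built from raw labels first. The following is a minimal sketch in base R; the labels vector and its values are hypothetical.

```r
# Hypothetical raw labels: "ok" is the majority class, "fraud" the minority
labels <- c("ok", "ok", "fraud", "ok", "ok", "fraud", "ok")

# Identify the least frequent label and code it as 1, everything else as 0
minority <- names(which.min(table(labels)))
Y <- factor(as.integer(labels == minority), levels = c(0, 1))

table(Y)  # class counts under the 0/1 coding
```

Fixing the levels explicitly to 0 and 1 keeps both classes present in the factor even if a subset of the data happens to contain only one of them.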

Details

Nearest neighbors are computed from the input variables, so X must contain only numeric features.
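One way to satisfy the numeric-only requirement is to select the numeric columns before calling the function. This is a sketch under assumptions: the data frame df and its columns are hypothetical, and dropping non-numeric columns is only one option (encoding them numerically is another).

```r
# Hypothetical data frame mixing numeric and non-numeric columns
df <- data.frame(
  a = c(1.5, 2.0, 3.2),
  b = c(10L, 20L, 30L),
  g = c("u", "v", "u")
)

# Flag numeric columns and keep only those as input variables
is_num <- vapply(df, is.numeric, logical(1))
X <- df[, is_num, drop = FALSE]

names(X)  # columns retained as input variables
```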

Value

The function returns a list:

X

input variables

Y

response variable

References

J. Laurikkala. Improving identification of difficult small classes by balancing class distribution. Artificial Intelligence in Medicine, pages 63-66, 2001.

See Also

ubBalance

Examples

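library(unbalanced)
data(ubIonosphere)

# Separate the response (Class, the last column) from the input variables
n <- ncol(ubIonosphere)
output <- ubIonosphere$Class
input <- ubIonosphere[, -n]

# Apply the Neighborhood Cleaning Rule with the default k = 3
data <- ubNCL(X = input, Y = output)

# Recombine the cleaned inputs and response
newData <- cbind(data$X, data$Y)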

Example output

Loading required package: mlr
Loading required package: ParamHelpers
Loading required package: foreach
Loading required package: doParallel
Loading required package: iterators
Loading required package: parallel
Number of instances removed from majority class with ENN: 2 	 Time needed: 0.03 
Number of instances removed from majority class after ENN: 62 	 Time needed: 0.07 
Number of instances removed from majority class: 64 

unbalanced documentation built on May 2, 2019, 7:01 a.m.