ubNCL: Neighborhood Cleaning Rule



Description

Neighborhood Cleaning Rule (NCL) modifies the Edited Nearest Neighbor (ENN) method by increasing the role of data cleaning. First, NCL removes negative examples that are misclassified by their 3 nearest neighbors. Second, the neighbors of each positive example are found, and those belonging to the majority class are removed.
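The two steps described above can be sketched in base R. This is only an illustrative sketch and not the package's implementation: the function name ncl_sketch, the use of Euclidean distance via dist(), the majority-vote rule in step 1, and running step 2 on the data left after step 1 are all assumptions made for the example.

```r
# Illustrative sketch of the two NCL steps (hypothetical helper, not ubNCL itself).
ncl_sketch <- function(X, Y, k = 3) {
  X <- as.matrix(X)
  Y <- as.character(Y)  # expects labels "0" (majority) and "1" (minority)

  # Row i of the result holds the indices of the k nearest rows to row i
  # (Euclidean distance, the row itself excluded).
  knn_idx <- function(M) {
    D <- as.matrix(dist(M))
    diag(D) <- Inf
    t(matrix(apply(D, 1, function(d) order(d)[seq_len(k)]), nrow = k))
  }

  # Step 1: drop majority examples whose neighbors mostly belong to class 1.
  nn <- knn_idx(X)
  votes <- rowMeans(matrix(Y[nn] == "1", nrow = nrow(nn)))
  keep <- !(Y == "0" & votes > 0.5)
  X <- X[keep, , drop = FALSE]
  Y <- Y[keep]

  # Step 2: drop majority examples found among the neighbors of minority examples.
  nn <- knn_idx(X)
  near_pos <- unique(as.vector(nn[Y == "1", , drop = FALSE]))
  drop <- near_pos[Y[near_pos] == "0"]
  keep <- !(seq_along(Y) %in% drop)

  list(X = X[keep, , drop = FALSE],
       Y = factor(Y[keep], levels = c("0", "1")))
}

# Toy data (made up for illustration): one feature, 6 majority and 3 minority points.
toy_X <- matrix(c(0, 1, 2, 3, 4, 10, 11, 12, 11.4), ncol = 1)
toy_Y <- factor(c(0, 0, 0, 0, 0, 1, 1, 1, 0), levels = c(0, 1))
res <- ncl_sketch(toy_X, toy_Y, k = 3)
table(res$Y)
```

In this toy run, step 1 removes the majority point at 11.4, which sits inside the minority cluster, and step 2 removes the majority point at 4, which is the closest majority neighbor of every minority point.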

Usage

ubNCL(X, Y, k = 3, verbose = TRUE)

Arguments

X

the input variables of the unbalanced dataset.

Y

the response variable of the unbalanced dataset. It must be a binary factor where the majority class is coded as 0 and the minority as 1.

k

the number of nearest neighbors to use.

verbose

print extra information (TRUE/FALSE)
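Because Y must be a binary factor with the majority class coded as 0 and the minority as 1, labels stored in another form need recoding first. A minimal sketch, assuming a two-class label vector; the helper name recode_minority_as_1 and the example labels are hypothetical and not part of the package:

```r
# Hypothetical helper: relabel a two-class vector so the rarer class becomes 1.
recode_minority_as_1 <- function(y) {
  y <- as.factor(y)
  stopifnot(nlevels(y) == 2)
  counts <- table(y)
  # With equal counts, which.min() picks the first level.
  minority <- names(counts)[which.min(counts)]
  factor(as.integer(y == minority), levels = c(0, 1))
}

labels <- c("good", "bad", "good", "good", "bad", "good")
Y <- recode_minority_as_1(labels)
table(Y)
```

Here "bad" is the rarer label, so it is mapped to 1 and "good" to 0.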

Details

Nearest neighbors are computed from distances between observations, so X may contain only numeric features.
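One way to satisfy this requirement is to keep only the numeric columns of a data frame before calling the function. The sketch below uses a made-up data frame df; dropping non-numeric columns is just one option and may discard useful information.

```r
# Made-up data frame with two numeric columns and one factor column.
df <- data.frame(
  age = c(23, 45, 31),
  income = c(1200.5, 3400, 2100),
  city = c("Rome", "Oslo", "Lima"),
  stringsAsFactors = TRUE
)

# Flag the numeric columns and keep only those as input variables.
is_num <- vapply(df, is.numeric, logical(1))
X_num <- df[, is_num, drop = FALSE]
names(X_num)
```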

Value

The function returns a list:

X

input variables

Y

response variable

References

J. Laurikkala. Improving identification of difficult small classes by balancing class distribution. Artificial Intelligence in Medicine, pages 63-66, 2001.

See Also

ubBalance

Examples

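library(unbalanced)
data(ubIonosphere)

# Class is the last column; the remaining columns are the input variables
n <- ncol(ubIonosphere)
output <- ubIonosphere$Class
input <- ubIonosphere[, -n]

# Apply NCL, then combine the cleaned X and Y into one object
data <- ubNCL(X = input, Y = output)
newData <- cbind(data$X, data$Y)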

Example output

Loading required package: mlr
Loading required package: ParamHelpers
Loading required package: foreach
Loading required package: doParallel
Loading required package: iterators
Loading required package: parallel
Number of instances removed from majority class with ENN: 2 	 Time needed: 0.03 
Number of instances removed from majority class after ENN: 62 	 Time needed: 0.07 
Number of instances removed from majority class: 64 

unbalanced documentation built on May 29, 2017, 8:47 p.m.