ubNCL: Neighborhood Cleaning Rule

View source: R/ubNCL.R

ubNCLR Documentation

Neighborhood Cleaning Rule

Description

Neighborhood Cleaning Rule modifies the Edited Nearest Neighbor method by increasing the role of data cleaning. Firstly, NCL removes negatives examples which are misclassified by their 3-nearest neighbors. Secondly, the neighbors of each positive examples are found and the ones belonging to the majority class are removed.

Usage

ubNCL(X, Y, k = 3, verbose = TRUE)

Arguments

X

the input variables of the unbalanced dataset.

Y

the response variable of the unbalanced dataset. It must be a binary factor where the majority class is coded as 0 and the minority as 1.

k

the number of neighbours to use

verbose

print extra information (TRUE/FALSE)

Details

In order to compute nearest neighbors, only numeric features are allowed.

Value

The function returns a list:

X

input variables

Y

response variable

References

J. Laurikkala. Improving identification of difficult small classes by balancing class distribution. Artificial Intelligence in Medicine, pages 63-66, 2001.

See Also

ubBalance

Examples

library(unbalanced)
data(ubIonosphere)
n<-ncol(ubIonosphere)
output<-ubIonosphere$Class
input<-ubIonosphere[ ,-n]

data<-ubNCL(X=input, Y= output)
newData<-cbind(data$X, data$Y)

dalpozz/unbalanced documentation built on June 3, 2022, 2:42 a.m.