Tomek Link

Share:

Description

The function finds the points in the dataset that are tomek link using 1-NN and then removes only majority class instances that are tomek links.

Usage

1
ubTomek(X, Y, verbose = TRUE)

Arguments

X

the input variables of the unbalanced dataset.

Y

the response variable of the unbalanced dataset. It must be a binary factor where the majority class is coded as 0 and the minority as 1.

verbose

print extra information (TRUE/FALSE)

Details

In order to compute nearest neighbors, only numeric features are allowed.

Value

The function returns a list:

X

input variables

Y

response variable

id.rm

index of instances removed

References

I. Tomek. Two modifications of cnn. IEEE Trans. Syst. Man Cybern., 6:769-772, 1976.

See Also

ubBalance

Examples

1
2
3
4
5
6
7
8
library(unbalanced)
data(ubIonosphere)
n<-ncol(ubIonosphere)
output<-ubIonosphere$Class
input<-ubIonosphere[ ,-n]

data<-ubTomek(X=input, Y= output)
newData<-cbind(data$X, data$Y)