TL: Tomek Links.


Description

TL is a cleaning algorithm that removes examples that belong to Tomek Links. A pair of examples forms a Tomek Link if the two examples belong to different classes and each is the nearest neighbour of the other.
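The definition above can be sketched in a few lines of base R. This is only an illustration and not the code used by TL: the helper name find_tomek_links is hypothetical, Euclidean distance is an assumption, and ties are broken by taking the first nearest neighbour.

```r
# Hypothetical helper (not part of the package): returns the row indices
# of the examples that take part in a Tomek Link.
# Assumes Euclidean distance; ties go to the first nearest neighbour.
find_tomek_links <- function(x, y) {
  d <- as.matrix(dist(x))               # pairwise distances between examples
  diag(d) <- Inf                        # an example is not its own neighbour
  nn <- unname(apply(d, 1, which.min))  # nearest neighbour of each example
  y <- as.character(y)
  mutual <- nn[nn] == seq_along(nn)     # the two examples point at each other
  differ <- y != y[nn]                  # and carry different class labels
  which(mutual & differ)
}

# Toy one-dimensional data: rows 3 and 4 sit on the class boundary.
x <- matrix(c(0, 1, 4, 5, 9, 10), ncol = 1)
y <- factor(c("a", "a", "a", "b", "b", "b"))
links <- find_tomek_links(x, y)
```

Here rows 3 and 4 are mutual nearest neighbours with different labels, so they form the only Tomek Link in the toy data.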

Usage

TL(data, remove_class = "Majority", classes = NULL)

Arguments

data

A data frame containing the predictors and the outcome. The predictors must be numeric. The outcome must be a binary-valued factor and must be the last column of data.

remove_class

The class whose examples are removed when they belong to a Tomek Link. The options are: c("Majority", "Minority", "Both"). The default is "Majority".

classes

A named vector identifying the majority and the minority classes. The names must be "Majority" and "Minority". This argument is only useful if the function is called inside another sampling function.
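As a rough illustration, such a vector could be built from the outcome column as below. The assumption here, not confirmed by this page, is that the values of the vector are the class labels themselves; the labels "neg" and "pos" are made up for the example.

```r
# Hypothetical outcome column with an imbalanced class distribution.
y <- factor(c("neg", "neg", "neg", "pos"))

# Count the examples per class and name the larger and smaller class.
counts <- table(y)
classes <- c(Majority = names(counts)[which.max(counts)],
             Minority = names(counts)[which.min(counts)])
```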

Details

The user controls which examples are removed: only those from the majority class (under-sampling), only those from the minority class, or those from both classes (cleaning).
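The three removal modes can be illustrated with a self-contained base R sketch. It is not the implementation of TL: the function name drop_tomek_links is hypothetical, Euclidean distance is assumed, and the majority class is taken to be the more frequent label in the last column.

```r
# Hypothetical sketch of the three removal modes (not the package's code).
# Assumes numeric predictors and the outcome in the last column.
drop_tomek_links <- function(data,
                             remove_class = c("Majority", "Minority", "Both")) {
  remove_class <- match.arg(remove_class)
  y <- as.character(data[[ncol(data)]])
  d <- as.matrix(dist(data[-ncol(data)]))  # distances between predictor rows
  diag(d) <- Inf
  nn <- unname(apply(d, 1, which.min))
  # Rows that are mutual nearest neighbours with different labels.
  in_link <- nn[nn] == seq_along(nn) & y != y[nn]
  counts <- table(y)
  majority <- names(counts)[which.max(counts)]
  to_remove <- switch(remove_class,
                      Majority = in_link & y == majority,
                      Minority = in_link & y != majority,
                      Both     = in_link)
  data[!to_remove, , drop = FALSE]
}

# Toy data: rows 4 ("neg") and 5 ("pos") form the only Tomek Link.
toy <- data.frame(x = c(0, 1, 2.2, 4, 5, 9),
                  y = factor(c("neg", "neg", "neg", "neg", "pos", "pos")))
under   <- drop_tomek_links(toy, "Majority")  # drops row 4
cleaned <- drop_tomek_links(toy, "Both")      # drops rows 4 and 5
```

With "Majority" only the majority-class member of each link is dropped, so the minority class keeps all of its examples. With "Both" the boundary pair is removed entirely.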

Value

A data frame containing a clean version of the input data set after removing examples that belong to Tomek Links.

References

Tomek, I. (1976). An experiment with the edited nearest-neighbor rule. IEEE Transactions on Systems, Man, and Cybernetics, (6), 448-452.


RomeroBarata/bimba documentation built on May 17, 2019, 8:03 a.m.