AENN: All-k Edited Nearest Neighbors
In NoiseFiltersR: Label Noise Filters for Data Preprocessing in Classification

Similarity-based filter that removes label noise from a dataset as a preprocessing step for classification. For more information, see the 'Details' and 'References' sections.

## S3 method for class 'formula'
AENN(formula, data, ...)

## Default S3 method:
AENN(x, k = 5, classColumn = ncol(x), ...)

`formula`	A formula describing the classification variable and the attributes to be used.
`data, x`	Data frame containing the training dataset to be filtered.
`...`	Optional parameters to be passed to other methods.
`k`	Maximum number of nearest neighbors to be used; ENN is run for every neighborhood size from 1 to `k`. The default is 5.
`classColumn`	Positive integer indicating the column which contains the (factor of) classes. By default, the last column is considered.

AENN applies the Edited Nearest Neighbor (ENN) algorithm to the whole dataset once for each integer from 1 to k. At the end, any instance flagged as noisy by at least one of these ENN runs is removed.
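
The procedure described above can be sketched in base R as follows. This is only an illustration of the idea, not the package's implementation: the helper name `aenn_sketch`, the use of Euclidean distance, and the tie-breaking rule (ties go to the first factor level) are all assumptions made for this example.

```r
# Hypothetical helper, not part of NoiseFiltersR: illustrates the all-k ENN idea.
# Assumes numeric attributes and Euclidean distance.
aenn_sketch <- function(x, labels, k = 5) {
  x <- as.matrix(x)
  labels <- as.factor(labels)
  n <- nrow(x)
  stopifnot(k >= 1, k < n, length(labels) == n)

  # Pairwise distances; an instance is never its own neighbor.
  d <- as.matrix(dist(x))
  diag(d) <- Inf

  # Column i holds the indices of the k nearest neighbors of instance i,
  # closest first. matrix() keeps the shape when k == 1.
  nn <- matrix(apply(d, 1, function(row) order(row)[seq_len(k)]), nrow = k)

  noisy <- rep(FALSE, n)
  for (kk in seq_len(k)) {            # one ENN pass per neighborhood size
    for (i in seq_len(n)) {
      votes <- table(labels[nn[seq_len(kk), i]])
      # Majority vote; ties go to the first factor level (assumption).
      predicted <- names(votes)[which.max(votes)]
      if (predicted != as.character(labels[i])) noisy[i] <- TRUE
    }
  }
  which(noisy)                        # flagged by at least one pass
}

# Toy data: two well-separated groups on a line, with the fifth point
# lying near group "a" but labelled "b".
x <- matrix(c(0, 1, 2.5, 4.5, 10, 100, 101, 102.5, 104.5, 107), ncol = 1)
y <- factor(c("a", "a", "a", "a", "b", "b", "b", "b", "b", "b"))
aenn_sketch(x, y, k = 3)  # returns 5
```

In this toy dataset only the fifth instance disagrees with its neighbors, and it does so for every neighborhood size from 1 to 3, so it is the only index returned.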

An object of class `filter`, which is a list with seven components:

`cleanData` is a data frame containing the filtered dataset.
`remIdx` is a vector of integers indicating the indexes for removed instances (i.e. their row number with respect to the original data frame).
`repIdx` is a vector of integers indicating the indexes for repaired/relabelled instances (i.e. their row number with respect to the original data frame).
`repLab` contains the new labels for repaired instances.
`parameters` is a list containing the argument values.
`call` contains the original call to the filter.
`extraInf` is a character that includes additional interesting information not covered by previous items.

Tomek I. (1976, June): An Experiment with the Edited Nearest-Neighbor Rule, in IEEE Transactions on Systems, Man, and Cybernetics, vol. SMC-6, no. 6, pp. 448-452.

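# The next example is not run in order to save time
## Not run: 
data(iris)
# Filter iris using every attribute except Petal.Length (default k = 5)
out <- AENN(Species ~ . - Petal.Length, iris)
print(out)
# cleanData is the original data frame without the rows listed in remIdx
identical(out$cleanData, iris[setdiff(seq_len(nrow(iris)), out$remIdx), ])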

## End(Not run)

Call:
AENN(formula = Species ~ . - Petal.Length, data = iris)

Parameters:
k: 5

Results:
Number of removed instances: 8 (5.333333 %)
Number of repaired instances: 0 (0 %)
[1] TRUE