Ensemble-based filter for removing label noise from a dataset as a preprocessing step of classification. For more information, see 'Details' and 'References' sections.
formula | A formula describing the classification variable and the attributes to be used.
data, x | Data frame containing the training dataset to be filtered.
... | Optional parameters to be passed to other methods.
nfolds | Number of folds into which the dataset is split.
consensus | Logical. If TRUE, the consensus voting scheme is used; if FALSE, the majority voting scheme is applied.
classColumn | Positive integer indicating the column that contains the (factor of) classes. By default, the last column is used.
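As a rough illustration of how nfolds and classColumn could be interpreted, the following base-R sketch assigns each row of a data frame to one of nfolds folds and separates the class column from the attributes. The random assignment via sample() and the variable names fold_id, labels and attrs are assumptions made for illustration; the package may partition the data differently.

```r
data(iris)
x <- iris
nfolds <- 4
classColumn <- ncol(x)   # the last column, as in the documented default

# Assumed partition scheme: shuffle a balanced vector of fold labels.
set.seed(1)
fold_id <- sample(rep(seq_len(nfolds), length.out = nrow(x)))

# Separate the class labels from the attributes.
labels <- x[[classColumn]]
attrs  <- x[, -classColumn, drop = FALSE]

table(fold_id)   # fold sizes differ by at most one row
```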
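A full description of the method can be found in the references below.
The dataset is split into nfolds folds. For each fold, an ensemble of three different base classifiers
(C4.5, 1-KNN, LDA) is built on the remaining nfolds-1 folds and then tested on the held-out fold.
Finally, a consensus or majority voting scheme is applied to remove noisy instances: under consensus
voting an instance is removed only if all base classifiers misclassify it, whereas under majority
voting it is removed if more than half of them do.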
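The voting step can be sketched in a few lines of base R. This is a minimal illustration, not the package's implementation: the helper flag_noisy and the example votes matrix are hypothetical, and the matrix simply records, for each instance, whether each base classifier misclassified it when that instance was in the held-out fold.

```r
# votes: logical matrix with one row per instance and one column per base
# classifier; TRUE means that classifier misclassified the instance.
flag_noisy <- function(votes, consensus = TRUE) {
  n_wrong <- rowSums(votes)
  if (consensus) {
    n_wrong == ncol(votes)       # all classifiers must be wrong
  } else {
    n_wrong > ncol(votes) / 2    # more than half must be wrong
  }
}

# Made-up misclassification flags for four instances.
votes <- matrix(c(TRUE,  TRUE,  TRUE,
                  TRUE,  TRUE,  FALSE,
                  FALSE, TRUE,  FALSE,
                  FALSE, FALSE, FALSE),
                ncol = 3, byrow = TRUE,
                dimnames = list(NULL, c("C4.5", "1-KNN", "LDA")))

which(flag_noisy(votes, consensus = TRUE))    # 1
which(flag_noisy(votes, consensus = FALSE))   # 1 2
```

Consensus voting is the more conservative choice: it removes fewer instances and so risks keeping some noisy ones, while majority voting removes more and risks discarding correctly labelled ones.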
An object of class filter, which is a list with seven components:
cleanData is a data frame containing the filtered dataset.
remIdx is a vector of integers indicating the indexes for removed instances (i.e. their row number with respect to the original data frame).
repIdx is a vector of integers indicating the indexes for repaired/relabelled instances (i.e. their row number with respect to the original data frame).
repLab is a factor containing the new labels for repaired instances.
parameters is a list containing the argument values.
call contains the original call to the filter.
extraInf is a character that includes additional interesting information not covered by previous items.
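The relationship between cleanData and remIdx can be illustrated without running the filter. In the sketch below the removed indexes are made up for illustration; a real run of the filter would supply them.

```r
data(iris)

# Hypothetical removed indexes (row numbers in the original data frame).
remIdx <- c(71L, 84L, 107L)

# Keep every row whose number is not in remIdx. setdiff() also behaves
# correctly when remIdx is empty, unlike negative indexing with integer(0).
keep <- setdiff(seq_len(nrow(iris)), remIdx)
cleanData <- iris[keep, ]

nrow(cleanData)                        # 147
any(rownames(cleanData) %in% remIdx)   # FALSE
```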
Brodley C. E., Friedl M. A. (1996, May): Improving automated land cover mapping by identifying and eliminating mislabeled observations from training data. In Geoscience and Remote Sensing Symposium, 1996. IGARSS '96. 'Remote Sensing for a Sustainable Future', International (Vol. 2, pp. 1379-1381). IEEE.
Brodley C. E., Friedl M. A. (1996, August): Identifying and eliminating mislabeled training instances. In AAAI/IAAI, Vol. 1 (pp. 799-805).
Brodley C. E., Friedl M. A. (1999): Identifying mislabeled training data. Journal of Artificial Intelligence Research, 11, 131-167.