NRAS removes minority examples whose proportion of minority examples
among their k nearest neighbours falls below a threshold.
A data frame containing the predictors and the outcome. The
predictors must be numeric and the outcome must be both a binary-valued
factor and the last column of the data frame.
Number of nearest neighbours to compute for each example in the minority class.
A named vector identifying the majority and the minority classes. The names must be "Majority" and "Minority". This argument is only useful if the function is called inside another sampling function.
All minority examples whose proportion of minority
neighbours is below the threshold are removed.
NRAS fits a logistic regression model to the data and uses it to predict the probability that each example belongs to the minority class. These probabilities are added to the data as a new feature, and the minority examples that have few minority examples among their neighbours are then removed.
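The steps described above can be sketched in base R. This is a minimal illustration, not the package's implementation: the function name `nras_sketch`, its arguments, the column names `target` and `nras_prob`, the use of unscaled Euclidean distance, and the rule that the less frequent factor level is the minority class are all assumptions made for the example.

```r
# Hypothetical sketch of the cleaning steps; not the package's NRAS function.
nras_sketch <- function(data, k = 5, threshold = 0.5) {
  p <- ncol(data)
  y <- data[[p]]
  stopifnot(is.factor(y), nlevels(y) == 2, k < nrow(data))

  # Assumption: the less frequent level is the minority class.
  counts <- table(y)
  minority <- names(counts)[which.min(counts)]
  is_min <- y == minority

  # Step 1: logistic regression of the class on all predictors.
  x <- data[-p]
  fit_data <- cbind(x, target = as.integer(is_min))
  fit <- glm(target ~ ., data = fit_data, family = binomial())

  # Step 2: add the predicted minority probability as a new feature.
  x$nras_prob <- predict(fit, type = "response")

  # Step 3: for each minority example, find its k nearest neighbours in the
  # augmented feature space (Euclidean distance, no scaling: an assumption)
  # and drop it if too few of those neighbours are minority examples.
  d <- as.matrix(dist(x))
  diag(d) <- Inf  # an example is not its own neighbour
  keep <- rep(TRUE, nrow(data))
  for (i in which(is_min)) {
    nn <- order(d[i, ])[seq_len(k)]
    if (mean(is_min[nn]) < threshold) keep[i] <- FALSE
  }
  data[keep, , drop = FALSE]
}

# Toy data: a majority cluster, a minority cluster, and one minority
# example placed inside the majority cluster (row 51).
set.seed(1)
toy <- data.frame(
  x1 = c(rnorm(40, 0, 0.3), rnorm(10, 5, 0.3), 0),
  x2 = c(rnorm(40, 0, 0.3), rnorm(10, 5, 0.3), 0),
  class = factor(c(rep("majority", 40), rep("minority", 11)))
)
cleaned <- nras_sketch(toy, k = 5, threshold = 0.5)
table(cleaned$class)
```

In this toy data the isolated minority example has only majority neighbours, so it is the one removed, while the minority examples in their own cluster are kept.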
Note that the present implementation does not perform over-sampling with
SMOTE as in the original article. Here, the cleaning was decoupled from the
over-sampling to make NRAS usable with any other over-sampling algorithm.
The sampling_sequence function should be used with
NRAS to perform over-sampling using any other over-sampling algorithm.
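Because the cleaning step returns an ordinary data frame, any over-sampler can be applied to its output. The sketch below uses plain random over-sampling as a stand-in; it does not use `sampling_sequence`, whose interface is not shown here, and the helper `random_oversample` and the data frame `cleaned_toy` are hypothetical names introduced only for illustration.

```r
# Hypothetical helper: duplicate randomly chosen minority rows until the
# two classes have the same size. The outcome is assumed to be the last column.
random_oversample <- function(data, minority) {
  y <- data[[ncol(data)]]
  n_extra <- sum(y != minority) - sum(y == minority)
  if (n_extra <= 0) return(data)
  idx <- which(y == minority)
  extra <- idx[sample.int(length(idx), n_extra, replace = TRUE)]
  out <- rbind(data, data[extra, , drop = FALSE])
  rownames(out) <- NULL
  out
}

# Stand-in for a data frame that has already been cleaned.
set.seed(42)
cleaned_toy <- data.frame(
  x1 = rnorm(11),
  x2 = rnorm(11),
  class = factor(c(rep("majority", 8), rep("minority", 3)))
)
balanced <- random_oversample(cleaned_toy, minority = "minority")
table(balanced$class)
```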
A data frame containing a cleaned version of the input data after using the NRAS algorithm.
Rivera, W. A. (2017). Noise Reduction A Priori Synthetic Over-Sampling for class imbalanced data sets. Information Sciences, 408, 146-161.