NRAS
Removes minority examples whose proportion of minority examples among
their k nearest neighbours is below a threshold.
data | A data frame containing the predictors and the outcome. The predictors must be numeric, and the outcome must be a binary-valued factor stored in the last column of the data frame.
k | Number of nearest neighbours to compute for each example in the minority class.
classes | A named vector identifying the majority and the minority classes. The names must be "Majority" and "Minority". This argument is only useful if the function is called inside another sampling function.
theshold | Minority examples whose proportion of minority neighbours is below the threshold are removed.
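The layout required of the data argument can be checked before calling the function. The following is a minimal sketch in base R; check_nras_input is a hypothetical helper written for illustration and is not part of the package.

```r
# Hypothetical helper (NOT part of the package): verifies the layout that the
# `data` argument is documented to require.
check_nras_input <- function(data) {
  stopifnot(is.data.frame(data), ncol(data) >= 2)
  outcome <- data[[ncol(data)]]
  predictors <- data[-ncol(data)]
  # The outcome must be a factor with exactly two levels, in the last column.
  if (!is.factor(outcome) || nlevels(outcome) != 2) {
    stop("the last column must be a binary-valued factor")
  }
  # Every predictor must be numeric.
  if (!all(vapply(predictors, is.numeric, logical(1)))) {
    stop("all predictors must be numeric")
  }
  invisible(TRUE)
}

# Toy data frame (made up for illustration) that satisfies the requirements.
toy_input <- data.frame(x1 = c(0.1, 0.4, 0.9),
                        x2 = c(1, 2, 3),
                        y  = factor(c("a", "b", "a")))
check_nras_input(toy_input)
```

The helper stops with an error when a predictor is not numeric or when the last column is not a two-level factor.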
NRAS fits a logistic regression model to the data and uses it to predict the probability that each example belongs to the minority class. These probabilities are added to the data as a new feature, and the minority examples that have too few minority examples among their k nearest neighbours are then removed.
Note that the present implementation does not perform over-sampling with
SMOTE as in the original article. Here, the cleaning was decoupled from the
over-sampling to make NRAS usable with any other over-sampling algorithm.
Therefore, the sampling_sequence function should be used in conjunction with
NRAS to perform over-sampling using any over-sampling algorithm.
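The cleaning step described above can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's implementation: the function name nras_sketch, its minority and threshold parameters, the use of Euclidean distance on unscaled features, and the toy data are all hypothetical. No over-sampling is performed, in line with the decoupling noted above.

```r
# Hypothetical sketch of the cleaning step; NOT the package's implementation.
# Assumptions: the outcome is the last column, `minority` names the minority
# level, and neighbours are found with Euclidean distance on unscaled features.
nras_sketch <- function(data, minority, k = 5, threshold = 0.5) {
  p <- ncol(data)
  is_min <- data[[p]] == minority

  # 1. Fit a logistic regression of minority membership on the predictors.
  fit_data <- data.frame(data[-p], .is_min = as.integer(is_min))
  fit <- glm(.is_min ~ ., data = fit_data, family = binomial())

  # 2. Append the predicted minority probability as an extra feature.
  feats <- cbind(as.matrix(data[-p]), prob = fitted(fit))

  # 3. For each minority example, find its k nearest neighbours among all
  #    other examples and compute the share that belongs to the minority.
  d <- as.matrix(dist(feats))
  diag(d) <- Inf  # an example is not its own neighbour
  keep <- rep(TRUE, nrow(data))
  for (i in which(is_min)) {
    nn <- order(d[i, ])[seq_len(k)]
    if (mean(is_min[nn]) < threshold) keep[i] <- FALSE
  }

  # 4. Drop the flagged minority examples; majority examples are untouched.
  data[keep, , drop = FALSE]
}

# Toy data (made up): 9 majority points around the origin, a tight cluster of
# 6 minority points, and one isolated minority point among the majority.
grid    <- expand.grid(x1 = c(-1, 0, 1), x2 = c(-1, 0, 1))
cluster <- data.frame(x1 = c(5, 5, 6, 6, 5.5, 5.5),
                      x2 = c(5, 6, 5, 6, 5.5, 6))
noise   <- data.frame(x1 = 0.1, x2 = 0.1)
toy <- rbind(grid, cluster, noise)
toy$y <- factor(c(rep("maj", 9), rep("min", 7)))

cleaned <- nras_sketch(toy, minority = "min", k = 5, threshold = 0.5)
```

In this toy data the isolated minority point at (0.1, 0.1) has only majority examples among its five nearest neighbours, so it is removed, while the six clustered minority points and all majority points are kept.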
A data frame containing a cleaned version of the input data after using the NRAS algorithm.
Rivera, W. A. (2017). Noise Reduction A Priori Synthetic Over-Sampling for class imbalanced data sets. Information Sciences, 408, 146-161.