View source: R/nearmiss_impl.R

nearmiss | R Documentation |

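Undersamples the majority class(es) using the NearMiss algorithm. Rather than generating synthetic instances, it removes majority-class rows selected by their distances to nearest neighbors.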

```
nearmiss(df, var, k = 5, under_ratio = 1)
```

`df` |
data.frame or tibble. Must contain one factor variable; all remaining variables must be numeric. |

`var` |
Character. Name of the factor variable in `df`. |

`k` |
An integer. Number of nearest neighbors used when deciding which majority-class examples to remove. |

`under_ratio` |
A numeric value for the ratio of the majority-to-minority frequencies. The default value (1) means that all other levels are sampled down to have the same frequency as the least occurring level. A value of 2 means that the majority levels will have at most (approximately) twice as many rows as the minority level. |
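To make the effect of `under_ratio` concrete, the following base R sketch computes the per-level row caps it implies. The helper `under_ratio_caps()` is hypothetical and not part of the package, and it assumes the cap is the minority count times `under_ratio`, rounded down; the package's exact rounding may differ.

```r
# Hypothetical helper (not part of the package): per-level row caps implied
# by under_ratio, ASSUMING cap = floor(minority count * under_ratio).
under_ratio_caps <- function(classes, under_ratio = 1) {
  # Named integer vector of level frequencies
  counts <- c(table(classes))
  cap <- floor(min(counts) * under_ratio)
  # Levels already at or below the cap are left unchanged
  pmin(counts, cap)
}

y <- factor(c(rep("a", 100), rep("b", 40), rep("c", 10)))
caps <- under_ratio_caps(y, under_ratio = 2)
```

Here the least frequent level has 10 rows, so with `under_ratio = 2` levels `a` and `b` are capped at 20 rows each while `c` keeps its 10.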

All columns used in this function must be numeric with no missing data.
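The requirement exists because the method relies on distances between rows. As a rough illustration, here is a base R sketch of the NearMiss-1 rule from the reference paper: keep the majority rows whose mean distance to their `k` nearest minority rows is smallest. The function `nearmiss1_sketch()` and its `n_keep` argument are hypothetical, and the package's own implementation may differ in which rows it discards and how it handles ties.

```r
# Hypothetical helper, not the package's implementation.
nearmiss1_sketch <- function(majority, minority, k = 5, n_keep = nrow(minority)) {
  majority <- as.matrix(majority)
  minority <- as.matrix(minority)
  # Distances are only defined for complete numeric data
  stopifnot(is.numeric(majority), is.numeric(minority),
            !anyNA(majority), !anyNA(minority))
  k <- min(k, nrow(minority))
  # Mean Euclidean distance from each majority row to its k nearest minority rows
  mean_dist <- apply(majority, 1, function(row) {
    d <- sqrt(colSums((t(minority) - row)^2))
    mean(sort(d)[seq_len(k)])
  })
  # Keep the n_keep majority rows with the smallest mean distance
  keep <- order(mean_dist)[seq_len(min(n_keep, nrow(majority)))]
  majority[keep, , drop = FALSE]
}

minority <- matrix(c(0, 0, 0.1, 0.1), ncol = 2, byrow = TRUE)
majority <- matrix(c(0.05, 0.05, 5, 5, 10, 10, 0.2, 0.2), ncol = 2, byrow = TRUE)
kept <- nearmiss1_sketch(majority, minority, k = 2, n_keep = 2)
```

In this toy data the two majority rows near the origin are retained and the two distant rows are dropped. A factor or character column would make `as.matrix()` return a non-numeric matrix, which is why the check fails early in that case.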

A data.frame or tibble, depending on the type of `df`.

Inderjeet Mani and I Zhang. knn approach to unbalanced data distributions: a case study involving information extraction. In Proceedings of workshop on learning from imbalanced datasets, 2003.

`step_nearmiss()` for the step function of this method.

Other Direct Implementations: `adasyn()`, `bsmote()`, `smotenc()`, `smote()`, `tomek()`

```
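library(themis)
circle_numeric <- circle_example[, c("x", "y", "class")]
# Defaults: k = 5, under_ratio = 1
res <- nearmiss(circle_numeric, var = "class")
# Use 10 nearest neighbors instead of 5
res <- nearmiss(circle_numeric, var = "class", k = 10)
# Allow majority levels up to about 1.5 times the minority count
res <- nearmiss(circle_numeric, var = "class", under_ratio = 1.5)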
```
