undersample_mindist: Undersample a dataset by iteratively removing the observation...

View source: R/undersample.R

Description

Undersample a dataset by iteratively removing the observation with the lowest total distance to its neighbors of the same class.
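The iterative removal described above can be sketched in base R. This is a minimal illustration, not the scutr implementation: `mindist_sketch` is a hypothetical helper, and it assumes that "neighbors" means all remaining observations of the class and that ties are broken by taking the first minimum.

```r
# Hypothetical helper (not part of scutr): repeatedly drop the row whose
# summed distance to the other remaining rows is smallest.
mindist_sketch <- function(x, m, method = "euclidean") {
  # x: numeric data frame or matrix holding a single class
  # m: number of rows to keep
  d <- as.matrix(dist(x, method = method))  # pairwise distance matrix
  keep <- seq_len(nrow(x))                  # indices of rows still present
  while (length(keep) > m) {
    # total distance from each remaining row to the other remaining rows
    totals <- rowSums(d[keep, keep, drop = FALSE])
    # remove the row with the lowest total (first one on ties)
    keep <- keep[-which.min(totals)]
  }
  x[keep, , drop = FALSE]
}

# Numeric columns of the setosa class only
setosa_num <- iris[iris$Species == "setosa", 1:4]
reduced <- mindist_sketch(setosa_num, 25)
nrow(reduced)
```

Removing the observation with the lowest total distance tends to thin out the densest regions of the class first, so the retained points stay spread over the class's range.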

Usage

undersample_mindist(data, cls, cls_col, m, dist_calc = "euclidean")

Arguments

data

Dataset to undersample. All columns other than cls_col must be numeric.

cls

Class to be undersampled.

cls_col

Column containing class information.

m

Desired number of observations after undersampling.

dist_calc

Method for distance calculation. See dist() for the available methods.
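As an illustration of how the choice of method changes the distances, the sketch below calls stats::dist() directly on a few rows of iris. It assumes that dist_calc is forwarded as the method argument of dist(), which this page implies but does not state outright.

```r
# Three numeric rows from iris, used only for illustration
x <- iris[1:3, 1:4]

# dist() accepts "euclidean", "maximum", "manhattan", "canberra",
# "binary" and "minkowski" as its method argument
d_euc <- dist(x, method = "euclidean")
d_man <- dist(x, method = "manhattan")

# Compare the pairwise distances under each method
as.matrix(d_euc)
as.matrix(d_man)
```

Because the observation to remove is chosen by summed distance, a different method can change which observations are removed.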

Value

An undersampled data frame.

Examples

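library(scutr)

# Subset iris to the class being undersampled
setosa <- iris[iris$Species == "setosa", ]
nrow(setosa)  # 50 observations before undersampling

# Request fewer observations than the class contains, so rows are actually removed
undersamp <- undersample_mindist(setosa, "setosa", "Species", 25)
nrow(undersamp)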

scutr documentation built on June 24, 2021, 5:07 p.m.