undersample_mindist: Undersample a dataset by iteratively removing the observation...

View source: R/undersample.R

undersample_mindist    R Documentation

Undersample a dataset by iteratively removing the observation with the lowest total distance to its neighbors of the same class.

Description

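Undersample a dataset by iteratively removing the observation with the lowest total distance to its neighbors of the same class. Removal repeats until m observations remain, so the points that sit closest to the rest of their class are dropped first.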
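
The removal loop described above can be sketched in base R. This is an illustration only, not the scutr source: sketch_mindist is a hypothetical helper, and it assumes that "neighbors" means all other remaining observations of the same class and that totals are recomputed over the remaining rows after each removal.

```r
# Hypothetical helper for illustration; not the scutr implementation.
sketch_mindist <- function(data, cls, cls_col, m, ...) {
  # Keep only the rows of the class being undersampled.
  subset <- data[data[[cls_col]] == cls, , drop = FALSE]
  # Pairwise distances over the non-class columns; ... goes to dist().
  d <- as.matrix(dist(subset[, names(subset) != cls_col, drop = FALSE], ...))
  keep <- seq_len(nrow(subset))
  while (length(keep) > m) {
    # Total distance from each remaining row to the other remaining rows.
    totals <- rowSums(d[keep, keep, drop = FALSE])
    # Drop the row with the lowest total distance.
    keep <- keep[-which.min(totals)]
  }
  subset[keep, , drop = FALSE]
}

setosa <- iris[iris$Species == "setosa", ]
reduced <- sketch_mindist(setosa, "setosa", "Species", 25)
nrow(reduced)
```

Computing the distance matrix once and indexing into it avoids calling dist() on every iteration; the sketch makes no claim about how the package itself handles this.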

Usage

undersample_mindist(data, cls, cls_col, m, ...)

Arguments

data

Dataset to undersample. Aside from cls_col, must be numeric.

cls

Class to be undersampled.

cls_col

Column containing class information.

m

Desired number of observations after undersampling.

...

Additional arguments passed to dist().
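
Because ... is passed to dist(), the distance metric can presumably be chosen through dist()'s own arguments such as method and p. The snippet below only demonstrates those base R arguments on a few iris rows; that a call like undersample_mindist(data, cls, cls_col, m, method = "manhattan") forwards them unchanged is an assumption based on the argument description above.

```r
# First three iris observations, numeric columns only.
x <- iris[1:3, 1:4]

# Euclidean distance is the dist() default.
d_euc <- dist(x)

# Manhattan (city-block) distance via the method argument.
d_man <- dist(x, method = "manhattan")

# Minkowski distance takes an extra power parameter p.
d_mink <- dist(x, method = "minkowski", p = 3)

d_euc
d_man
d_mink
```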

Value

An undersampled data frame.

Examples

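# Subset iris to a single class.
setosa <- iris[iris$Species == "setosa", ]
nrow(setosa)
# Note: iris contains 50 setosa observations, so m = 50 equals the current
# count; pass a smaller m to actually remove observations.
undersamp <- undersample_mindist(setosa, "setosa", "Species", 50)
nrow(undersamp)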

scutr documentation built on Nov. 18, 2023, 1:08 a.m.