themis-package: themis: Extra Recipes Steps for Dealing with Unbalanced Data

themis-package    R Documentation

themis: Extra Recipes Steps for Dealing with Unbalanced Data

Description

A dataset with an uneven number of cases in each class is said to be unbalanced. Many models perform poorly on unbalanced datasets. A dataset can be balanced by increasing the number of minority cases using SMOTE (2011, arXiv:1106.1813), BorderlineSMOTE (2005, doi:10.1007/11538059_91), or ADASYN (2008, https://ieeexplore.ieee.org/document/4633969), or by decreasing the number of majority cases using NearMiss (2003, https://www.site.uottawa.ca/~nat/Workshop2003/jzhang.pdf) or Tomek link removal (1976, https://ieeexplore.ieee.org/document/4309452).
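As a rough illustration of the two approaches, the sketch below oversamples with SMOTE and undersamples with NearMiss on a made-up dataset. The data frame `toy`, its columns `x1`, `x2`, and `class`, and the 90/10 class split are assumptions chosen for this example and are not part of the package. Both steps are used with their default arguments, which should bring the two classes to equal size.

```r
library(recipes)
library(themis)

set.seed(1)

# Hypothetical toy data: 90 majority cases and 10 minority cases,
# with numeric predictors only (SMOTE and NearMiss need numeric predictors).
toy <- data.frame(
  x1 = c(rnorm(90, mean = 0), rnorm(10, mean = 2)),
  x2 = c(rnorm(90, mean = 0), rnorm(10, mean = 2)),
  class = factor(rep(c("majority", "minority"), times = c(90, 10)))
)

# Increase the number of minority cases with SMOTE.
over <- recipe(class ~ ., data = toy) %>%
  step_smote(class) %>%
  prep() %>%
  bake(new_data = NULL)

# Decrease the number of majority cases with NearMiss.
under <- recipe(class ~ ., data = toy) %>%
  step_nearmiss(class) %>%
  prep() %>%
  bake(new_data = NULL)

# Compare class counts before and after each step.
table(toy$class)
table(over$class)
table(under$class)
```

These steps are skipped by default when a recipe is applied to new data, so `bake(new_data = NULL)` is used here to inspect the resampled training set. `step_bsmote()`, `step_adasyn()`, and `step_tomek()` can be swapped in the same way for the other methods named above.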

Author(s)

Maintainer: Emil Hvitfeldt <emil.hvitfeldt@posit.co> (ORCID)

Other contributors:

  • Posit Software, PBC [copyright holder, funder]

See Also

Useful links:


themis documentation built on Aug. 15, 2023, 1:05 a.m.