external_validity: External validity indices

external_validity    R Documentation

External validity indices

Description

External validity indices compare a predicted clustering result with a reference class or gold standard.

Usage

ev_nmi(pred.lab, ref.lab, method = "emp")

ev_confmat(pred.lab, ref.lab)

Arguments

pred.lab

predicted labels generated by a classifier or clustering algorithm

ref.lab

reference labels for the observations

method

method of computing the entropy. Can be any one of "emp", "mm", "shrink", or "sg".

Details

ev_nmi calculates the normalized mutual information between the predicted and reference labels.
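As a rough illustration of the quantity involved, the following base R sketch computes a normalized mutual information from empirical (plug-in) entropies. It assumes the geometric-mean normalization described by Strehl and Ghosh (see References); whether ev_nmi uses exactly this normalization is an assumption here. The function nmi_sketch is a hypothetical helper and is not part of diceR or infotheo.

```r
# Hypothetical helper, not part of diceR: empirical NMI in base R.
# Assumes NMI = I(X; Y) / sqrt(H(X) * H(Y)) with plug-in entropies.
nmi_sketch <- function(pred.lab, ref.lab) {
  # Empirical Shannon entropy (in nats) of a table of counts
  entropy <- function(counts) {
    p <- counts[counts > 0] / sum(counts)
    -sum(p * log(p))
  }
  h_pred <- entropy(table(pred.lab))
  h_ref <- entropy(table(ref.lab))
  h_joint <- entropy(table(pred.lab, ref.lab))
  # Mutual information, clamped at zero to guard against rounding error
  mi <- max(h_pred + h_ref - h_joint, 0)
  mi / sqrt(h_pred * h_ref)
}

a <- c(1, 1, 2, 2, 3, 3)
nmi_sketch(a, a)                    # identical partitions
nmi_sketch(a, c(2, 2, 3, 3, 1, 1))  # same partition with permuted labels
```

Both calls should return 1 (up to floating-point error), since the measure depends only on how observations are grouped and not on the label values. If either labeling is constant, its entropy is zero and this sketch returns NaN.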

ev_confmat calculates a variety of statistics associated with confusion matrices. Accuracy, Cohen's kappa, and Matthews correlation coefficient have direct multiclass definitions, whereas all other metrics use macro-averaging.
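To make the distinction concrete, the sketch below contrasts accuracy, which is computed directly over all classes, with a macro-averaged sensitivity, where each class is scored one-vs-rest and the per-class values are averaged with equal weight. The helper macro_sens is hypothetical and written in base R for illustration only; it is not the code used by ev_confmat or yardstick.

```r
# Hypothetical helper, not part of diceR or yardstick:
# macro-averaged sensitivity computed one class at a time.
macro_sens <- function(pred.lab, ref.lab) {
  classes <- sort(unique(ref.lab))
  per_class <- vapply(classes, function(k) {
    tp <- sum(pred.lab == k & ref.lab == k)  # true positives for class k
    fn <- sum(pred.lab != k & ref.lab == k)  # false negatives for class k
    tp / (tp + fn)
  }, numeric(1))
  # Unweighted mean over classes
  mean(per_class)
}

ref  <- c(1, 1, 1, 1, 2, 2)
pred <- c(1, 1, 1, 2, 2, 2)
macro_sens(pred, ref)  # class 1: 3/4, class 2: 2/2, mean 0.875
mean(pred == ref)      # accuracy: 5/6, no averaging over classes needed
```

Because every class contributes equally to a macro average, small classes influence the result as much as large ones, which is why the two numbers above differ.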

Value

ev_nmi returns the normalized mutual information.

ev_confmat returns a tibble of the following summary statistics using yardstick::summary.conf_mat():

  • accuracy: Accuracy

  • kap: Cohen's kappa

  • sens: Sensitivity

  • spec: Specificity

  • ppv: Positive predictive value

  • npv: Negative predictive value

  • mcc: Matthews correlation coefficient

  • j_index: Youden's J statistic

  • bal_accuracy: Balanced accuracy

  • detection_prevalence: Detection prevalence

  • precision: alias for ppv

  • recall: alias for sens

  • f_meas: F Measure

Note

ev_nmi is adapted from infotheo::mutinformation().

Author(s)

Johnson Liu, Derek Chiu

References

Strehl A, Ghosh J. Cluster ensembles: a knowledge reuse framework for combining multiple partitions. J. Mach. Learn. Res. 2002;3:583-617.

Examples

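library(diceR)

# Fix the seed so the random labels are reproducible
set.seed(1)
# E is unused below; it is kept so the random draws for x and y are unchanged
E <- matrix(sample(1:4, 1000, replace = TRUE), nrow = 100, byrow = FALSE)
# Two random labelings of 100 observations into 4 classes
x <- sample(1:4, 100, replace = TRUE)
y <- sample(1:4, 100, replace = TRUE)
# Normalized mutual information between the two labelings
ev_nmi(x, y)
# Summary statistics derived from the confusion matrix
ev_confmat(x, y)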

diceR documentation built on Sept. 29, 2023, 1:06 a.m.