perfInd: Compute performance indicators.

perfInd R Documentation

Compute performance indicators.

Description

Compute several indicators describing the performance of a binary classifier, e.g. sensitivity, specificity, an estimate of the area under the receiver operating characteristic (ROC) curve, the Gini coefficient etc. (see the return value).

Usage

perfInd(x, y = NULL, 
    negativeFirst = TRUE, 
    debug = FALSE)

Arguments

x

a 2x2 classification table with predictions in rows and ground truth in columns (negative cases come first by default, see the negativeFirst argument), or a vector of predicted values for each observation (coded, by default, such that negative cases come first, e.g. as 0 and 1, as FALSE and TRUE, or as a factor). If a row (and a column) is missing, an attempt is made to complete x into a 2x2 table. This requires the rows/columns of x to be named 'TRUE' and 'FALSE' so that the missing row/column can be deduced. See the examples.

y

if x is a vector of predictions, y holds the vector of ground truth classifications for each observation (coded, by default, such that negative cases come first, e.g. as 0 and 1, as FALSE and TRUE, or as a factor).

negativeFirst

if TRUE, negative cases come first in both the rows and columns of x, or in the vectors x and y.

debug

if TRUE, debugging messages will be printed. If numeric and greater than 1, verbose debugging output will be produced.

Value

a list containing the following named entries (a short sketch after the list illustrates how the main quantities are defined):

table

classification table

tp

true positives

tn

true negatives

fn

false negatives

fp

false positives

sensitivity

sensitivity

specificity

specificity

npv

negative predictive value, i.e. true negatives over true negatives plus false negatives

ppv

positive predictive value, i.e. true positives over true positives plus false positives

wnpv

weighted negative predictive value (an obscure measure)

wppv

weighted positive predictive value

fpr

false positive rate

fnr

false negative rate

fdr

false discovery rate

accuracy

accuracy

f1

F1 score (2*TP/(2*TP+FN+FP))

f2

F2 score (5*TP/(5*TP+4*FN+FP))

f05

F0.5 score (1.25*TP/(1.25*TP+0.25*FN+FP))

correspondence

correspondence score (TP/(TP+FN+FP))

mcc

Matthews correlation coefficient

informedness

informedness

markedness

markedness

auc

AUC (area under the ROC curve estimated by interpolating the (0,0), (1-specificity, sensitivity) and (1, 1) points in the ROC space)

gini

the Gini index

n

number of observations classified
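
For illustration, the relations between the main quantities above can be sketched in a few lines of base R. This is only a sketch of the definitions listed here (assuming the default layout with predictions in rows, ground truth in columns, and negative cases first); it is not the function's implementation, and the matrix cm is a made-up example.

cm <- matrix(c(50, 10, 5, 35), 2)      # made-up 2x2 table: [1,1]=TN, [2,1]=FP, [1,2]=FN, [2,2]=TP
tn <- cm[1,1]; fp <- cm[2,1]; fn <- cm[1,2]; tp <- cm[2,2]
sens <- tp / (tp + fn)                 # sensitivity (true positive rate)
spec <- tn / (tn + fp)                 # specificity (true negative rate)
ppv  <- tp / (tp + fp)                 # positive predictive value
npv  <- tn / (tn + fn)                 # negative predictive value
f1   <- 2 * tp / (2 * tp + fn + fp)    # F1 score
mcc  <- (tp * tn - fp * fn) /
  sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))  # Matthews correlation coefficient
auc  <- (sens + spec) / 2              # trapezoidal area through (0,0), (1-spec, sens), (1,1)
gini <- 2 * auc - 1                    # Gini coefficient as commonly derived from this AUC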

Author(s)

Tomas Sieger

References

Fawcett, Tom (2006). _An Introduction to ROC Analysis_. Pattern Recognition Letters 27(8): 861-874. doi:10.1016/j.patrec.2005.10.010.

Powers, David M. W. (2011). _Evaluation: From Precision, Recall and F-Measure to ROC, Informedness, Markedness & Correlation_. Journal of Machine Learning Technologies 2(1): 37-63. Available online at http://www.bioinfo.in/contents.php?id=51

See Also

performance, which gives many more measures

Examples

# example from https://en.wikipedia.org/w/index.php?title=Sensitivity_and_specificity&oldid=680316530
x<-matrix(c(20,10,180,1820),2)
print(x)
perfInd(x)

# compute performance over vectors of predicted and true classifications:
print(perfInd(c(0,0,1,1,1), c(0,0,0,1,1)))

# compare several measures over several classification results:
tbls<-list(matrix(c(98,2,2,8),2),matrix(c(8,2,2,8),2),matrix(c(80,20,2,8),2))
for (i in 1:length(tbls)) {
  m<-tbls[[i]]
  .pn(m)
  pi<-perfInd(m)
  with(pi,catnl(
    'acc: ',accuracy,
    ', sp+se: ',sensitivity+specificity,
    ', correspondence: ',correspondence,
    ', F1: ',f1,
    ', F2: ',f2,sep=''))
}

# turn a degenerate table into a proper 2x2 table
i1<-c(1,1,1,1)
i2<-c(1,1,2,2)
m<-table(i1==1,i2==1)
print(m) # this is a 1x2 table
perfInd(m)$table # this is now a proper 2x2 table
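
# factor-coded input should also work (per the argument descriptions above);
# this illustrative sketch assumes levels are ordered negative-first, matching
# the default negativeFirst=TRUE:
pred  <- factor(c('neg','neg','pos','pos','pos'), levels = c('neg','pos'))
truth <- factor(c('neg','neg','neg','pos','pos'), levels = c('neg','pos'))
perfInd(pred, truth)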
