performance: Performance statistics for prediction

View source: R/performance.R

performance    R Documentation

Description

Functions for computing performance statistics used for model evaluation.

performance() computes all of the metrics listed below; each is also available through its own function.

The definitions use a 2 x 2 table of predicted by true labels, with the following notation:

                  Truth
Predicted    Positive  Negative
Positive        A         B
Negative        C         D

The metrics computed here are:

  • precision: A / (A + B)

  • recall: A / (A + C)

  • F1: 2 / (recall^{-1} + precision^{-1})

  • accuracy: (A + D) / (A + B + C + D), or correctly predicted / all

  • balanced_accuracy: mean of the per-category recall over all categories; for the 2 x 2 table, (A / (A + C) + D / (B + D)) / 2
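
As a rough illustration of the formulas above, the following base R sketch computes each metric directly from the four cell counts. It does not call the package; the variable names are illustrative only, and the counts are chosen to match the data used in the Examples section.

```r
# Cell counts in the notation above (rows = predicted, columns = truth)
A <- 30; B <- 12   # predicted Positive
C <- 30; D <- 28   # predicted Negative

precision_pos <- A / (A + B)                                # 30 / 42
recall_pos    <- A / (A + C)                                # 30 / 60
f1_pos        <- 2 / (1 / recall_pos + 1 / precision_pos)   # harmonic mean
acc           <- (A + D) / (A + B + C + D)                  # 58 / 100

# Balanced accuracy averages recall over both categories;
# recall for the Negative category is D / (B + D)
recall_neg   <- D / (B + D)                                 # 28 / 40
balanced_acc <- mean(c(recall_pos, recall_neg))

round(c(precision = precision_pos, recall = recall_pos, f1 = f1_pos,
        accuracy = acc, balanced_accuracy = balanced_acc), 3)
```

Here precision, recall, and F1 are shown for the Positive category only; the scores for the Negative category follow by swapping the roles of A and D, and of B and C.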

Usage

performance(data, truth, by_class = TRUE, ...)

precision(data, truth, by_class = TRUE, ...)

recall(data, truth, by_class = TRUE, ...)

f1_score(data, truth, by_class = TRUE, ...)

accuracy(data, truth, ...)

balanced_accuracy(data, ...)

Arguments

data

a table of predicted labels by true labels, or a vector of predicted labels

truth

a vector of "true" labels; or, if data is a table, 2 to indicate that the "true" values are in the columns, or 1 if they are in the rows.

by_class

logical; if TRUE, compute the performance score separately for each class; otherwise, average across classes

...

not used

Value

named list consisting of the selected measure(s), where each element is a scalar if by_class = FALSE, or a vector named by class if by_class = TRUE.
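
To make the two return shapes concrete, here is a base R sketch that mimics them without calling the package. It assumes that averaging across classes means an unweighted mean of the per-class scores; that is an assumption of this sketch, not something stated above, and the object names are hypothetical.

```r
lvs <- c("Relevant", "Irrelevant")

# Confusion matrix with rows = predicted and columns = truth
# (matrix() fills column by column)
tab <- as.table(matrix(c(30, 30, 12, 28), nrow = 2,
                       dimnames = list(predicted = lvs, truth = lvs)))

# One score per class, named by class (the by_class = TRUE shape)
precision_by_class <- diag(tab) / rowSums(tab)
recall_by_class    <- diag(tab) / colSums(tab)

# A single scalar (the by_class = FALSE shape),
# ASSUMING an unweighted mean over classes
precision_avg <- mean(precision_by_class)
recall_avg    <- mean(recall_by_class)

list(precision = precision_by_class, recall = recall_by_class)
list(precision = precision_avg, recall = recall_avg)
```

Under that assumption, the averaged recall coincides with the balanced accuracy defined in the Description.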

References

Powers, D. (2007). "Evaluation: From Precision, Recall and F Factor to ROC, Informedness, Markedness and Correlation." Technical Report SIE-07-001, Flinders University.

Examples

## Data in Table 2 of Powers (2007)

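lvs <- c("Relevant", "Irrelevant")

# predicted labels: 42 Relevant followed by 58 Irrelevant
tbl_2_1_pred <- factor(rep(lvs, times = c(42, 58)), levels = lvs)

# true labels, in the same order as the predictions:
# of the 42 predicted Relevant, 30 are Relevant and 12 Irrelevant;
# of the 58 predicted Irrelevant, 30 are Relevant and 28 Irrelevant
tbl_2_1_truth <- factor(c(rep(lvs, times = c(30, 12)),
                          rep(lvs, times = c(30, 28))),
                        levels = lvs)
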
performance(tbl_2_1_pred, tbl_2_1_truth)
performance(tbl_2_1_pred, tbl_2_1_truth, by_class = FALSE)
performance(table(tbl_2_1_pred, tbl_2_1_truth), by_class = TRUE)

precision(tbl_2_1_pred, tbl_2_1_truth)

recall(tbl_2_1_pred, tbl_2_1_truth)

f1_score(tbl_2_1_pred, tbl_2_1_truth)

accuracy(tbl_2_1_pred, tbl_2_1_truth)

balanced_accuracy(tbl_2_1_pred, tbl_2_1_truth)


quanteda/quanteda.classifiers documentation built on Oct. 20, 2023, 6:53 a.m.