View source: R/class_AUC_roc.R
AUC_roc | R Documentation |
The AUC estimates the area under the receiver operator curve (ROC) for a nominal/categorical predicted-observed dataset using the Mann-Whitney U-statistic.
AUC_roc(data = NULL, obs, pred, tidy = FALSE, na.rm = TRUE)
data |
(Optional) argument to call an existing data frame containing the data. |
obs |
Vector with observed values (character | factor). |
pred |
Vector with predicted values (character | factor). |
tidy |
Logical operator (TRUE/FALSE) to decide the type of return. TRUE returns a data.frame, FALSE returns a list (default). |
na.rm |
Logic argument to remove rows with missing values (NA). Default is na.rm = TRUE. |
The AUC tests whether positives are ranked higher than negatives. It is simply the area under the ROC curve.
The AUC estimation of this function follows the procedure described by
Hand & Till (2001). The AUC_roc estimated following the trapezoid approach is
equivalent to the average between recall and specificity (Powers, 2011), which is
equivalent to the balanced accuracy (balacc
):
AUC_roc = \frac{(recall - FPR + 1)}{2} = \frac{recall+specificity}{2} = 1-\frac{FPR+FNR}{2}
Interpretation: the AUC is equivalent to the probability that a randomly case from a given class (positive for binary) will have a smaller estimated probability of belonging to another class (negative for binary) compared to a randomly chosen member of the another class.
Values: the AUC is bounded between 0 and 1. The closer to 1 the better. Values close to 0 indicate inaccurate predictions. An AUC = 0.5 suggests no discrimination ability between classes; 0.7 < AUC < 0.8 is considered acceptable, 0.8 < AUC < 0.5 is considered excellent, and AUC > 0.9 is outstanding (Mandrekar, 2010).
For the multiclass cases, the AUC is equivalent to the average of AUC of each class (Hand & Till, 2001).
Finally, the AUC is directly related to the Gini-index (a.k.a. G1) since Gini + 1 = 2*AUC. (Hand & Till, 2001).
For the formula and more details, see online-documentation
an object of class numeric
within a list
(if tidy = FALSE) or within a
data frame
(if tidy = TRUE).
Hanley, J.A., McNeil, J.A. (2017). The meaning and use of the area under a receiver operating characteristic (ROC) curve. _ Radiology 143(1): 29-36_ \Sexpr[results=rd]{tools:::Rd_expr_doi("10.1148/radiology.143.1.7063747")}
Hand, D.J., Till, R.J. (2001). A simple generalisation of the area under the ROC curve for multiple class classification problems. _ Machine Learning 45: 171-186_ \Sexpr[results=rd]{tools:::Rd_expr_doi("10.1023/A:1010920819831")}
Mandrekar, J.N. (2010). Receiver operating characteristic curve in diagnostic test assessment. _ J. Thoracic Oncology 5(9): 1315-1316_ \Sexpr[results=rd]{tools:::Rd_expr_doi("10.1097/JTO.0b013e3181ec173d")}
Powers, D.M.W. (2011). Evaluation: From Precision, Recall and F-Measure to ROC, Informedness, Markedness & Correlation. Journal of Machine Learning Technologies 2(1): 37–63. \Sexpr[results=rd]{tools:::Rd_expr_doi("10.48550/arXiv.2010.16061")}
set.seed(123)
# Two-class
binomial_case <- data.frame(labels = sample(c("True","False"), 100,
replace = TRUE), predictions = sample(c("True","False"), 100, replace = TRUE))
# Multi-class
multinomial_case <- data.frame(labels = sample(c("Red","Blue", "Green"), 100,
replace = TRUE), predictions = sample(c("Red","Blue", "Green"), 100, replace = TRUE) )
# Get AUC estimate for two-class case
AUC_roc(data = binomial_case, obs = labels, pred = predictions, tidy = TRUE)
# Get AUC estimate for multi-class case
AUC_roc(data = multinomial_case, obs = labels, pred = predictions, tidy = TRUE)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.