Description
Function metric_auc computes the AUROC (Area Under the Receiver Operating Characteristic curve) and the AUPRC (Area Under the Precision-Recall Curve), two measures of the goodness of a ranking in a binary classification problem. Partial areas are also supported. Important: higher ranks are assumed to ideally target the positives (label = 1), whereas lower ranks correspond to the negatives (label = 0).

Function metric_fun is a wrapper around metric_auc that returns a function for performance evaluation. The returned function takes the actual and predicted values as input and outputs a performance metric. This is needed for functions such as perf and perf_eval, which iterate over a list of such metric functions and report the performance measured by each of them.
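As a rough sketch of this wrapper pattern (not the package's actual implementation), metric_fun can be thought of as a closure that fixes the extra arguments of metric_auc and exposes only actual and predicted; the name make_metric below is hypothetical and metric_auc is assumed to be available:

# Conceptual sketch only: fix the extra arguments of metric_auc and
# return a two-argument function usable by perf and perf_eval
make_metric <- function(...) {
    dots <- list(...)
    function(actual, predicted) {
        do.call(
            metric_auc,
            c(list(actual = actual, predicted = predicted), dots))
    }
}

# e.g. a partial, standardised AUROC metric
f_paroc <- make_metric(curve = "ROC", partial = c(0, 0.1), standardized = TRUE)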
Usage

metric_auc(
    actual,
    predicted,
    curve = "ROC",
    partial = c(0, 1),
    standardized = FALSE
)

metric_fun(...)
Arguments

actual
    numeric, binary labels of the entities: negatives (0) and positives (1)

predicted
    numeric, prediction used to rank the entities; this will typically be the diffusion scores

curve
    character, either "ROC" (default) or "PRC"

partial
    vector with two numeric values for computing partial areas. The values are the limits on the x axis of the curve (false positive rate for ROC, recall for PRC). Defaults to c(0, 1), i.e. the whole area.

standardized
    logical, should partial areas be standardised to range in [0, 1]? Defaults to FALSE.

...
    parameters to pass to metric_auc
Details

The AUROC is a scalar value: the probability that a randomly chosen positive is ranked higher than a randomly chosen negative. The AUROC is cutoff-free and informative of the performance of a ranker. Likewise, the AUPRC is the area under the Precision-Recall curve and is also a standard metric for binary classification. Both measures are discussed in [Saito, 2017].
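As a minimal sketch of this probabilistic interpretation (base R only, not part of the package), the AUROC can be computed as the fraction of positive-negative pairs in which the positive receives the higher score, counting ties as one half:

# AUROC as P(score of a random positive > score of a random negative)
auroc_pairwise <- function(actual, predicted) {
    pos <- predicted[actual == 1]
    neg <- predicted[actual == 0]
    cmp <- outer(pos, neg, FUN = ">") + 0.5 * outer(pos, neg, FUN = "==")
    mean(cmp)
}

# should closely match metric_auc(actual, predicted, curve = "ROC")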
AUROC and AUPRC have partial counterparts, in which only the area enclosed up to a given false positive rate (AUROC) or recall (AUPRC) is accounted for. This can be useful when the assessment should focus on the top-ranked entities.
The user can, however, define a custom performance metric. AUROC and AUPRC are common choices, but other problem-specific metrics might be of interest, for example the number of hits in the top k nodes (see the sketch below). Machine learning metrics can be found in packages such as Metrics and MLmetrics from the CRAN repository (http://cran.r-project.org/).
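As an illustration (not part of the package), a hypothetical top-k hits metric with the same actual/predicted interface as the functions returned by metric_fun could look like this:

# Hypothetical custom metric: number of positives among the k
# highest-scoring entities
top_k_hits <- function(actual, predicted, k = 10) {
    top <- order(predicted, decreasing = TRUE)[seq_len(k)]
    sum(actual[top] == 1)
}

# fix k so that the metric only takes actual and predicted,
# as expected by perf and perf_eval
f_top10 <- function(actual, predicted) top_k_hits(actual, predicted, k = 10)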
Value

metric_auc returns a numeric value, the area under the specified curve.

metric_fun returns a function (a performance metric).
References

Saito, T., & Rehmsmeier, M. (2017). Precrec: fast and accurate precision–recall and ROC curve calculations in R. Bioinformatics, 33(1), 145-147.
Examples

# generate class labels and a numeric ranking
set.seed(1)
n <- 50
actual <- rep(0:1, each = n/2)
predicted <- ifelse(
    actual == 1,
    runif(n, min = 0.2, max = 1),
    runif(n, min = 0, max = 0.8))

# AUROC
metric_auc(actual, predicted, curve = "ROC")

# partial AUROC (up to a false positive rate of 10%)
metric_auc(
    actual, predicted, curve = "ROC",
    partial = c(0, 0.1))

# the same area, but standardised to lie in (0, 1)
metric_auc(
    actual, predicted, curve = "ROC",
    partial = c(0, 0.1), standardized = TRUE)

# AUPRC
metric_auc(actual, predicted, curve = "PRC")

# generate performance functions for perf and perf_eval
f_roc <- metric_fun(
    curve = "ROC", partial = c(0, 0.5),
    standardized = TRUE)
f_roc
f_roc(actual = actual, predicted = predicted)