pr_auc_embed: Area Under the Precision-Recall Curve for an Embedding

View source: R/auc.R

pr_auc_embed    R Documentation

Area Under the Precision-Recall Curve for an Embedding

Description

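Embedding quality measure: the area under the precision-recall (PR) curve for retrieving observations with the same label, averaged over all observations.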

Usage

pr_auc_embed(dm, labels)

Arguments

dm

Distance matrix of an embedding.

labels

Vector of labels for each observation in the dataset in the same order as the observations in the distance matrix.

Details

The PR curve plots precision (also known as positive predictive value, PPV) against recall (also known as the true positive rate). The area under the PR curve conveys information similar to the area under the ROC curve, but may be more appropriate when classes are highly imbalanced.

This function calculates the PR curve N times, once for each of the N observations. For observation i, its label is set as the positive class, and the other observations are ranked according to their distance from observation i in the output coordinates (lower distances being better). Observations with the same label as observation i count as positives. The final reported result is the average of the N areas.
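The per-observation ranking and averaging described above can be sketched in base R. This is an illustrative re-implementation, not the package source: the helper names `average_precision` and `pr_auc_sketch` are hypothetical, and average precision is used as a simple stand-in for the PR AUC, so values may differ slightly from the interpolated integral that PRROC computes.

```r
# Hypothetical helper: average precision for a single query observation.
# Stand-in for the PR AUC; PRROC's interpolated integral may differ slightly.
average_precision <- function(dists, is_positive) {
  ord <- order(dists)                 # smaller distance = ranked earlier
  hits <- is_positive[ord]
  if (!any(hits)) return(NA_real_)    # no other observation shares the label
  precision_at_k <- cumsum(hits) / seq_along(hits)
  mean(precision_at_k[hits])          # mean precision at each positive's rank
}

# Hypothetical sketch of the averaging loop (not the sneer implementation).
pr_auc_sketch <- function(dm, labels) {
  dm <- as.matrix(dm)
  n <- nrow(dm)
  per_obs <- vapply(seq_len(n), function(i) {
    others <- setdiff(seq_len(n), i)  # rank only the other observations
    average_precision(dm[i, others], labels[others] == labels[i])
  }, numeric(1))
  mean(per_obs, na.rm = TRUE)
}

# Two well-separated clusters on a line: every same-label neighbor is
# ranked ahead of every different-label one.
coords <- c(0, 0.1, 0.2, 10, 10.1, 10.2)
labels <- c("a", "a", "a", "b", "b", "b")
result <- pr_auc_sketch(dist(coords), labels)
```

In this toy case retrieval is perfect for every observation, so the averaged score reaches its maximum of 1.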

Perfect retrieval results in an AUC of 1. For random retrieval, the expected value for a single curve is approximately the proportion of positive class labels for that curve.
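The random-retrieval baseline can be checked with a small simulation. The following is a rough sketch in base R that again uses average precision as a stand-in for the PR AUC; the class sizes and the number of replicates are arbitrary choices made for illustration.

```r
set.seed(42)

# Assumed toy setup: 2000 candidates, of which 10% are positives.
is_positive <- rep(c(TRUE, FALSE), times = c(200, 1800))

# Average precision of a randomly ordered ranking, repeated 50 times.
ap_random <- replicate(50, {
  hits <- sample(is_positive)                    # random ranking
  mean((cumsum(hits) / seq_along(hits))[hits])   # average precision
})

baseline <- mean(is_positive)   # proportion of positives
estimate <- mean(ap_random)     # should land close to the baseline
```

Comparing `estimate` with `baseline` shows why a score should be judged against the class proportions rather than against a fixed reference such as 0.5.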

Value

Area under the precision-recall curve, averaged over all observations.

Note

Use of this function requires that the PRROC package be installed.
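Given this dependency, a call can be guarded so that it only runs when both packages are available. The sketch below is a hedged usage example: it assumes the function is exported by the sneer package, and it uses distances between the raw `iris` measurements as a stand-in for the distance matrix of a real embedding. The conversion with `as.matrix` is a precaution, on the assumption that a full square matrix is expected.

```r
# Stand-in input: pairwise distances on the raw iris measurements.
# In practice dm would hold distances between embedded coordinates.
dm <- as.matrix(dist(iris[, 1:4]))
labels <- iris$Species

# Only call the function when sneer and PRROC are both installed.
if (requireNamespace("sneer", quietly = TRUE) &&
    requireNamespace("PRROC", quietly = TRUE)) {
  auc <- sneer::pr_auc_embed(dm, labels)
  print(auc)
}
```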

References

Keilwagen, J., Grosse, I., & Grau, J. (2014). Area under precision-recall curves for weighted and unweighted data. PLoS ONE, 9(3), e92209.

Davis, J., & Goadrich, M. (2006, June). The relationship between Precision-Recall and ROC curves. In Proceedings of the 23rd International Conference on Machine Learning (pp. 233-240). ACM.


jlmelville/sneer documentation built on Nov. 15, 2022, 8:13 a.m.