plot_pr: Precision-Recall Curve(s)

View source: R/plot_pr.R


Description

This function plots precision-recall (PR) curves for one or more classifiers.

Usage

plot_pr(
  obs,
  pred,
  pal_curves = "npg",
  title = ifelse(is.numeric(pred), "Precision-Recall Curve", "Precision-Recall Curves"),
  leg.txt = NULL,
  legend = "topright",
  hover = FALSE
)

Arguments

obs

Vector of observed outcomes. Must be dichotomous. Can be numeric, character, factor, or logical. If numeric, obs must be coded 1 or 0. If character or factor, a warning will be issued clarifying that the first level is assumed to be the reference.

pred

Vector of predicted values, or several such vectors organized into a data frame or list, optionally named. Must be numeric. Common examples include the probabilities output by a logistic model, or the expression levels of a particular biomarker.

pal_curves

String specifying the color palette to use when plotting multiple vectors. Options include "ggplot", all qualitative color schemes available in RColorBrewer, and the complete collection of ggsci palettes. Alternatively, a character vector of colors with length equal to the number of vectors in pred.

title

Optional plot title.

leg.txt

Optional legend title.

legend

Legend position. Must be one of "bottom", "left", "top", "right", "bottomright", "bottomleft", "topleft", or "topright".

hover

Show predictor name by hovering mouse over PR curve? If TRUE, the plot is rendered in HTML and will either open in your browser's graphic display or appear in the RStudio viewer.
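The obs coding convention described above can be illustrated with a minimal base-R sketch. This is not bioplotr code, and the variable names are purely illustrative; it simply shows how a two-level factor reduces to the 0/1 coding the function expects, with the first level treated as the reference (negative) class:

```r
# Hypothetical illustration of the obs coding convention (not bioplotr code):
# a two-level factor maps to 0/1, with the first factor level as reference.
obs <- factor(c("healthy", "disease", "disease", "healthy"),
              levels = c("healthy", "disease"))
obs_binary <- as.integer(obs) - 1L  # "healthy" (reference) -> 0, "disease" -> 1
obs_binary
```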

Details

PR curves plot precision (i.e., positive predictive value) against recall (i.e., true positive rate/sensitivity) for a given classifier and vector of observations. The area under the PR curve (AUC) is a useful performance metric for binary classifiers, especially under extreme class imbalance, which is typical in omic contexts (Saito & Rehmsmeier, 2015). The grey horizontal line represents the performance of a theoretical random classifier; its height equals the prevalence of the positive class. Interpolations for tied pred values are computed using the nonlinear method of Davis & Goadrich (2006).
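The baseline just described can be computed directly: a classifier that scores at random has expected precision equal to the proportion of positive observations at every recall level. A short base-R sketch (variable names are illustrative, not part of bioplotr):

```r
# Height of the grey baseline in a PR plot: the prevalence of positives.
# A random classifier has expected precision equal to this value at
# every recall level.
set.seed(1)
obs <- rbinom(1000, size = 1, prob = 0.1)
baseline <- mean(obs == 1)  # proportion of positive observations
baseline
```

With prob = 0.1, the baseline sits near 0.1, which is why AUC under the PR curve is far more discriminating than ROC AUC for rare outcomes.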

References

Davis, J. & Goadrich, M. (2006). The Relationship Between Precision-Recall and ROC Curves. In Proceedings of the 23rd International Conference on Machine Learning, pp. 233-240. New York: ACM.

Saito, T. & Rehmsmeier, M. (2015). The Precision-Recall Plot Is More Informative than the ROC Plot When Evaluating Binary Classifiers on Imbalanced Datasets. PLoS ONE, 10(3): e0118432.

Examples

# Simulate a rare binary outcome and a predictor that tracks it
set.seed(123)
y <- rbinom(1000, size = 1, prob = 0.1)
x1 <- rnorm(1000, mean = y)
plot_pr(obs = y, pred = x1)

# Compare two predictors of differing quality on the same plot
x2 <- rnorm(1000, mean = y, sd = 2)
plot_pr(obs = y, pred = list("Better" = x1, "Worse" = x2))


dswatson/bioplotr documentation built on March 3, 2023, 9:43 p.m.