pr.boot: Bootstrap Confidence Intervals for Precision-Recall Curves
In usefun: A Collection of Useful Functions by John

pr.boot

R Documentation

Description

This function calculates bootstrap percentile confidence intervals (CIs) for precision-recall (PR) curves using precrec. The resulting intervals can be passed to a plotting function to draw confidence bands (see Examples).
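To make the idea concrete, here is a minimal base-R sketch of a percentile bootstrap band for a PR curve. It is an illustration only, not the implementation of `pr.boot` or precrec: the helper `pr_on_grid`, the variables `grid`, `boot_prec` and `sketch`, the simple step interpolation, the equal-tailed quantiles, and the simulated data are all assumptions made for this example, and ties in the scores are ignored.

```r
set.seed(1)
labels = rbinom(200, 1, 0.3)      # simulated 0/1 labels (assumption)
preds  = rnorm(200) + labels      # scores, higher for the positive class

# Hypothetical helper: precision evaluated on a fixed recall grid.
# Thresholds sweep the scores from highest to lowest; a left-continuous
# step interpolation maps the curve onto `grid`.
pr_on_grid = function(labels, preds, grid) {
  ord = order(preds, decreasing = TRUE)
  tp = cumsum(labels[ord] == 1)           # true positives at each cutoff
  recall = tp / sum(labels == 1)
  precision = tp / seq_along(tp)
  approx(recall, precision, xout = grid, method = "constant",
         ties = max, rule = 2)$y
}

grid   = seq(0, 1, length.out = 21)
alpha  = 0.1
boot_n = 200
pos = which(labels == 1)
neg = which(labels == 0)

# One column per bootstrap replicate, one row per recall grid point.
# Each replicate resamples cases and controls separately (stratified).
boot_prec = replicate(boot_n, {
  idx = c(pos[sample.int(length(pos), replace = TRUE)],
          neg[sample.int(length(neg), replace = TRUE)])
  pr_on_grid(labels[idx], preds[idx], grid)
})

# Percentile interval: empirical quantiles across replicates, per grid point
ci = apply(boot_prec, 1, quantile, probs = c(alpha / 2, 1 - alpha / 2))

sketch = data.frame(
  recall    = grid,
  precision = pr_on_grid(labels, preds, grid),
  lower     = ci[1, ],
  upper     = ci[2, ]
)
head(sketch)
```

Evaluating every replicate on the same recall grid is what allows the quantiles to be taken pointwise; `pr.boot` delegates the curve computation to precrec, whose interpolation is more careful than the step function used here.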

Usage

pr.boot(
  labels,
  preds,
  boot.n = 10000,
  boot.stratified = TRUE,
  alpha = 0.1,
  ...
)

Arguments

`labels`	(`numeric()`) Vector of responses/labels (only two classes/values allowed: cases/positive class = 1 and controls/negative class = 0)
`preds`	(`numeric()`) Vector of prediction values. Higher values denote positive class.
`boot.n`	(`numeric(1)`) Number of bootstrap resamples. Default: 10000
`boot.stratified`	(`logical(1)`) Whether the bootstrap resampling is stratified (same number of cases/controls in each replicate as in the original sample) or not. It is advised to use stratified resampling when classes from `labels` are imbalanced. Default: TRUE.
`alpha`	(`numeric(1)`) Significance level for the bootstrap percentile interval (between 0 and 1); the resulting interval has confidence level 1 - `alpha`. Default: 0.1, corresponding to 90% confidence intervals.
`...`	Other parameters to pass on to precrec::evalmod, except `mode` (set to `rocpr`) and `raw_curves` (set to `TRUE`). For example, `x_bins` sets the minimum number of bins (recall points) on the x-axis.
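The effect of `boot.stratified` and the meaning of `alpha` can be illustrated with a few lines of base R. This is a hedged sketch on made-up labels, not code taken from `pr.boot`; the variable names (`plain_idx`, `strat_idx`, `probs`) are hypothetical, and the equal-tailed split of `alpha` is the usual convention for percentile intervals rather than something the documentation states explicitly.

```r
set.seed(7)
labels = c(rep(1, 10), rep(0, 90))   # imbalanced toy labels: 10 cases, 90 controls

# Ordinary bootstrap: resample all observations together, so the number
# of cases changes from replicate to replicate (and can become very small).
plain_idx = sample.int(length(labels), replace = TRUE)

# Stratified bootstrap: resample cases and controls separately, so every
# replicate keeps the original 10 / 90 split.
pos = which(labels == 1)
neg = which(labels == 0)
strat_idx = c(pos[sample.int(length(pos), replace = TRUE)],
              neg[sample.int(length(neg), replace = TRUE)])

table(labels[plain_idx])   # class counts vary with the seed
table(labels[strat_idx])   # class counts match the original sample

# Equal-tailed percentile interval: alpha is split between the two tails
alpha = 0.1
probs = c(alpha / 2, 1 - alpha / 2)   # 5th and 95th percentiles, i.e. a 90% interval
probs
```

With few cases, an unstratified replicate may contain almost no positives, which makes its PR curve unstable; this is the reason stratification is advised for imbalanced `labels`.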

Value

A tibble with columns:

recall: recall of original data
precision: precision of original data
low_precision: low value of the bootstrap confidence interval
high_precision: high value of the bootstrap confidence interval

References

Saito, Takaya, Rehmsmeier, Marc (2016). “Precrec: fast and accurate precision-recall and ROC curve calculations in R.” Bioinformatics, 33(1), 145–147. doi:10.1093/bioinformatics/btw570.

Examples

set.seed(42)
# imbalanced labels
labels = sample(c(0,1), 100, replace = TRUE, prob = c(0.8,0.2))
# predictions
preds = rnorm(100)

# get CIs for PR curve
pr_tbl = pr.boot(labels, preds, boot.n = 100, x_bins = 30) # default x_bins is 1000
pr_tbl

# draw PR curve + add the bootstrap percentile confidence bands
library(ggplot2)

pr_tbl |>
  ggplot(aes(x = recall, y = precision)) +
  geom_step() +
  ylim(c(0,1)) +
  geom_ribbon(aes(ymin = precision_low, ymax = precision_high), alpha = 0.2)

usefun documentation built on Sept. 15, 2024, 1:06 a.m.