fbroc: A package for fast bootstrap analysis and comparison of ROC curves

Description

fbroc enables fast bootstrap analysis and comparison of ROC curves for simulation studies and shiny applications. It uses an algorithm in which the cost of a single bootstrap replicate is O(n), where n is the number of observations, and implements it in C++ to further increase efficiency. On a typical desktop computer, calculating 100000 bootstrap replicates for 500 observations takes on the order of one second. The ROC curve as used here shows the True Positive Rate (TPR) as a function of the False Positive Rate (FPR). The package also supports the analysis of paired ROC curves, in which two predictions given for the same set of samples are compared.

Important fbroc functions

boot.roc

Use boot.roc to bootstrap a ROC curve.

boot.paired.roc

Use boot.paired.roc to bootstrap two paired ROC curves.

conf

Calculate confidence regions for the ROC curve.

perf

Estimate performance and calculate confidence intervals.

Example Data

fbroc also contains the example data set roc.examples, which you can use to test the functionality of the package. This data set contains simulated data, not data from a real application.

Details

The algorithm works by first determining the critical thresholds of the ROC curve, that is, the cutoffs at which the curve changes direction. Each observation is then linked to the specific threshold at which it first causes a change in the TPR or FPR. Calculating this link once and bootstrapping it directly allows the bootstrapped ROC curve to be constructed very quickly. Since multiple observations can be linked to the same threshold, it is difficult to implement the algorithm efficiently in R. This is why fbroc implements it in C++.
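
The idea of precomputing a link between observations and thresholds can be illustrated in base R. This is only a minimal sketch under simplifying assumptions, not fbroc's C++ implementation: it treats every unique prediction value as a threshold, assumes that higher predictions indicate the positive class, and resamples within each class so that no replicate loses a class. The helper roc.from.link and the toy data are hypothetical.

```r
# Illustrative sketch only; roc.from.link is a hypothetical helper,
# not a function exported by fbroc.
set.seed(1)
pred <- c(0.9, 0.8, 0.8, 0.6, 0.4, 0.4, 0.2)
true.class <- c(TRUE, TRUE, FALSE, TRUE, FALSE, TRUE, FALSE)

# Simplification: use every unique prediction value as a threshold
thresholds <- sort(unique(pred), decreasing = TRUE)
# Link each observation to the index of its threshold (computed once)
link <- match(pred, thresholds)

roc.from.link <- function(link, true.class, n.thresholds) {
  # Count positives and negatives attached to each threshold
  pos <- tabulate(link[true.class], nbins = n.thresholds)
  neg <- tabulate(link[!true.class], nbins = n.thresholds)
  # Cumulative counts give TPR and FPR when pred >= threshold is called positive
  list(tpr = cumsum(pos) / sum(pos), fpr = cumsum(neg) / sum(neg))
}

original <- roc.from.link(link, true.class, length(thresholds))

# One bootstrap replicate: resample indices within each class and reuse
# the precomputed link instead of sorting the predictions again
idx <- c(sample(which(true.class), replace = TRUE),
         sample(which(!true.class), replace = TRUE))
replicate.roc <- roc.from.link(link[idx], true.class[idx], length(thresholds))
```

Because the link is computed only once, each replicate needs just a counting pass and a cumulative sum over the thresholds, with no re-sorting of the predictions.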

When bootstrapping paired ROC curves, the package takes care to use the same set of samples for both predictors in each iteration of the bootstrap. This preserves the correlation structure between both predictors.
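
The following base R sketch shows what using the same set of samples for both predictors means in practice. It is an illustration under stated assumptions rather than the package's code: the simulated predictors, the rank-based helper auc.rank, and the choice to resample within each class are all hypothetical.

```r
# Illustrative sketch only; auc.rank and the simulated data are hypothetical.
auc.rank <- function(pred, true.class) {
  n.pos <- sum(true.class)
  n.neg <- sum(!true.class)
  # Mann-Whitney form of the AUC; tied predictions receive average ranks
  (sum(rank(pred)[true.class]) - n.pos * (n.pos + 1) / 2) / (n.pos * n.neg)
}

set.seed(42)
n <- 200
true.class <- rep(c(TRUE, FALSE), each = n / 2)
shared <- rnorm(n)  # common component that makes the two predictors correlated
pred1 <- shared + 1.0 * true.class + rnorm(n, sd = 0.5)
pred2 <- shared + 0.8 * true.class + rnorm(n, sd = 0.5)

n.boot <- 500
delta <- numeric(n.boot)
for (b in seq_len(n.boot)) {
  # Draw ONE set of indices per iteration (resampled within each class) ...
  idx <- c(sample(which(true.class), replace = TRUE),
           sample(which(!true.class), replace = TRUE))
  # ... and apply it to both predictors so that their pairing is preserved
  delta[b] <- auc.rank(pred1[idx], true.class[idx]) -
    auc.rank(pred2[idx], true.class[idx])
}

# Percentile interval for the difference in AUC
quantile(delta, c(0.025, 0.975))
```

Drawing separate index sets for pred1 and pred2 would treat the predictors as independent and would typically distort the spread of the bootstrapped difference.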

All bootstrap confidence intervals are based on the percentile method.
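
As a brief sketch of the percentile method: the interval limits are the empirical quantiles of the bootstrapped statistic. The helper percentile.ci below is hypothetical and is applied to bootstrapped sample means for brevity; the same step would apply to any vector of bootstrapped values, such as AUCs.

```r
# Illustrative sketch only; percentile.ci is a hypothetical helper.
percentile.ci <- function(boot.stats, conf.level = 0.95) {
  alpha <- 1 - conf.level
  # Lower and upper limits are the alpha/2 and 1 - alpha/2 quantiles
  unname(quantile(boot.stats, probs = c(alpha / 2, 1 - alpha / 2)))
}

set.seed(7)
x <- rnorm(100, mean = 5)
# Bootstrap the sample mean by resampling x with replacement
boot.means <- replicate(2000, mean(sample(x, replace = TRUE)))
ci <- percentile.ci(boot.means, conf.level = 0.95)
ci
```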

Notes

Package fbroc is still in an early development stage. Currently it supports bootstrapping the confidence region of single and paired ROC curves, as well as the AUC, the partial AUC, and the FPR at a fixed TPR and vice versa. More sophisticated methods for calculating bootstrap confidence intervals and improved documentation will be added at a later time.

References

Efron, B., & Tibshirani, R. (1998). An introduction to the bootstrap. Boca Raton, Fla: Chapman & Hall/CRC.

McClish, D. K. (1989). Analyzing a portion of the ROC curve. Medical Decision Making. http://mdm.sagepub.com/content/9/3/190.abstract

Examples

data(roc.examples)
# work with a single ROC curve
result.boot <- boot.roc(roc.examples$Cont.Pred, roc.examples$True.Class, n.boot = 100)
plot(result.boot)
perf(result.boot, "auc")
perf(result.boot, "auc", conf.level = 0.99)
perf(result.boot, "tpr", conf.level = 0.95, fpr = 0.1)
conf(result.boot, steps = 10)
# work with paired ROC curves
result.boot <- boot.paired.roc(roc.examples$Cont.Pred, roc.examples$Cont.Pred.Outlier, 
                               roc.examples$True.Class, n.boot = 100)
plot(result.boot)
perf(result.boot, "auc")
perf(result.boot, "auc", conf.level = 0.99)
perf(result.boot, "tpr", conf.level = 0.95, fpr = 0.1)
conf(result.boot, steps = 10)
