knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.height = 5, fig.width = 8 )
Funnel plots are designed for comparisons of institutional performance, for the purpose of outlier discovery. These may be over or under performers. They graph institution performance y (on a metric of interest, e.g. hospital mortality) against a measure of institution size n (e.g. number of patients treated). Control limits are then overlayed, designed to capture (1-$\alpha$)100% of institutions (typically 95% and 99.7% - 2 and 3 standard deviations), from a target value (e.g. the global average) for the metric. The "funnel" shape comes from the expectation that with increasing institution size there will be less variation around the target, leading to narrower control limits as n grows. For an excellent introduction see Spiegelhalter (2005).
library(funnelplot) library(ggplot2)
funnelplot supports:
To illustrate the funnelplot package, we will use a synthetic example dataset of 40 hospitals. The binary outcome measure is whether a particular procedure was successful on the ith patient. We have two variables gender and age that may be distributed unevenly across the hospitals and play a role in explaining differences in the observed hospital procedure rate.
data("example_data", package = "funnelplot") head(example_data)
We first create an unadjusted funnel plot. The resulting plot shows the raw event proportions against the number of observations per cluster. There appear to be some outliers.
# outcome ~ covariates | cluster ID f1 <- funnelModel(test ~ 1 | hosp_id, data = example_data) plot(f1) + theme_funnel()
The plot returned by funnelplot::plot is a standard ggplot2 object, allowing a large degree of customisation to fit the end use.
pp <- plot(f1) class(pp) pp + theme_bw()
funnelplot contains other useful functions such as a summary
method (i.e. summary.funnelRes
) which can be used to print information about the outlying clusters, and any risk-adjustment model (see below).
summary(f1)
and outliers
, which returns a vector of the resulting outliers for use in further analyses.
outliers(f1)
Generally raw comparison of event rates between institutions runs the risk of failing to account for differences in the subpopulations within each cluster (e.g. a tendency for sicker patients to attend a particular hospital). To account for this a statistical model can be used to estimate the expected outcome for each observation in the data, allowing comparison to the observed outcome to obtain standardised rates per institution. Currently funnelplot uses logistic regression stats::glm
for risk adjustment. A standard R formula of the form outcome ~ covariates | clusterID
creates a risk adjusted funnel plot object.
f2 <- funnelModel(test ~ gender | hosp_id, data = example_data) plot(f2)
Concern of inadequate risk adjustment can be evaluated using evalCasemixAdj
which uses cross-validation to test the predictive ability of the model. A combination of a poorly predictive model and a large degree of over-dispersion indicates a need to think carefully about any inferences drawn from the funnel plot.
evalCasemixAdj(f2,method="cv",folds = 5)
The funnelplot hyperparameters - multiplicity adjustment and overdispersion - can be controlled through the control argument of funnelModel
. This takes either the pointTarget
or distTarget
task functions, with the names indicating whether the institutions are compared to a target that is a single value (pointTarget
) or to a distribution (distTarget
). More details on the difference are found in vignette(funnelplot-control)
with pointTarget
usually a good first approach.
f3 <- funnelModel(test ~ gender | hosp_id, pointTarget(limits = c(0.05,0.01)), data = example_data) plot(f3)
By default funnelplot produces risk adjusted rates, but in many circumstances standardised rates (of the form observed/expected) will be of more interest.
f4 <- funnelModel(test ~ gender | hosp_id, pointTarget(limits = c(0.05,0.01), standardised = TRUE), data = example_data) plot(f4)
Out of fold
f5 <- funnelModel(test ~ gender | hosp_id, pointTarget(limits = c(0.05,0.01), standardised = TRUE), adj = adjParams(method = "out_of_fold",nfolds = 5), data = example_data) plot(f5)
Using the label and identify arguments we can label or plot only a selection of hospitals if desired. These arguments also accept "outliers" as a value.
plot(f4,label="outliers")
plot(f4,identify=13,label=13)
Benjamini, Y., & Hochberg, Y. (1995). Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the royal statistical society. Series B (Methodological), 289-300.
Jones, H. E., Ohlssen, D. I., & Spiegelhalter, D. J. (2008). Use of the false discovery rate when comparing multiple health care providers. Journal of clinical epidemiology, 61(3), 232-240.
Spiegelhalter, D. J. (2005). Funnel plots for comparing institutional performance. Statistics in medicine, 24(8), 1185-1202.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.