In oizin/funnelplot: Create Funnel Plots

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.height = 5,
  fig.width = 8
)

Funnel plots are designed for comparisons of institutional performance, for the purpose of outlier discovery. These may be over or under performers. They graph institution performance y (on a metric of interest, e.g. hospital mortality) against a measure of institution size n (e.g. number of patients treated). Control limits are then overlayed, designed to capture (1-$\alpha$)100% of institutions (typically 95% and 99.7% - 2 and 3 standard deviations), from a target value (e.g. the global average) for the metric. The "funnel" shape comes from the expectation that with increasing institution size there will be less variation around the target, leading to narrower control limits as n grows. For an excellent introduction see Spiegelhalter (2005).

library(funnelplot)
library(ggplot2)

funnelplot supports:

Binary outcomes: situations in which the outcome can be conceptualised as a Bernoulli trial with outcomes 0 "failure" and 1 "success".
Covariate risk adjustment: control for differences in the characteristics of observations within each cluster. This is done through using a statistical model to estimate the expected outcome for each observation. This is followed by cluster level calculation of a suitable comparison measure of the observed and expected outcomes.
Multiplicity adjustment: control for potentially high false positive rates in the comparison of large numbers of clusters using Bonferroni or false discovery rate (FDR) based corrections.
Overdispersion: In situations where there is large variation in performance between the clusters, and the outcome cannot be reliabilty predicted, one approach to avoiding labelling a large number of clusters as outliers is to adjust the control limits outward by an inflation factor (e.g. 20%). The inflation factor can be estimated using the data, with the ability to leave a certain percentage of the best and worst performing clusters left out of the calculation.

An example

To illustrate the funnelplot package, we will use a synthetic example dataset of 40 hospitals. The binary outcome measure is whether a particular procedure was successful on the ith patient. We have two variables gender and age that may be distributed unevenly across the hospitals and play a role in explaining differences in the observed hospital procedure rate.

data("example_data", package = "funnelplot")
head(example_data)

Unadjusted funnel plot

We first create an unadjusted funnel plot. The resulting plot shows the raw event proportions against the number of observations per cluster. There appear to be some outliers.

# outcome ~ covariates | cluster ID
f1 <- funnelModel(test ~ 1 | hosp_id, data = example_data)
plot(f1) +
  theme_funnel()

The plot returned by funnelplot::plot is a standard ggplot2 object, allowing a large degree of customisation to fit the end use.

pp <- plot(f1)
class(pp)
pp +
  theme_bw()

funnelplot contains other useful functions such as a summary method (i.e. summary.funnelRes) which can be used to print information about the outlying clusters, and any risk-adjustment model (see below).

summary(f1)

and outliers, which returns a vector of the resulting outliers for use in further analyses.

outliers(f1)

Risk adjustment

Generally raw comparison of event rates between institutions runs the risk of failing to account for differences in the subpopulations within each cluster (e.g. a tendency for sicker patients to attend a particular hospital). To account for this a statistical model can be used to estimate the expected outcome for each observation in the data, allowing comparison to the observed outcome to obtain standardised rates per institution. Currently funnelplot uses logistic regression stats::glm for risk adjustment. A standard R formula of the form outcome ~ covariates | clusterID creates a risk adjusted funnel plot object.

f2 <- funnelModel(test ~ gender | hosp_id, data = example_data)
plot(f2)

Concern of inadequate risk adjustment can be evaluated using evalCasemixAdj which uses cross-validation to test the predictive ability of the model. A combination of a poorly predictive model and a large degree of over-dispersion indicates a need to think carefully about any inferences drawn from the funnel plot.

evalCasemixAdj(f2,method="cv",folds = 5)

Adjusting for multiplicity and overdispersion

The funnelplot hyperparameters - multiplicity adjustment and overdispersion - can be controlled through the control argument of funnelModel. This takes either the pointTarget or distTarget task functions, with the names indicating whether the institutions are compared to a target that is a single value (pointTarget) or to a distribution (distTarget). More details on the difference are found in vignette(funnelplot-control) with pointTarget usually a good first approach.

f3 <- funnelModel(test ~ gender | hosp_id, pointTarget(limits = c(0.05,0.01)), data = example_data)
plot(f3)

Standardised rates

By default funnelplot produces risk adjusted rates, but in many circumstances standardised rates (of the form observed/expected) will be of more interest.

f4 <- funnelModel(test ~ gender | hosp_id, pointTarget(limits = c(0.05,0.01), standardised = TRUE), data = example_data)
plot(f4)

Altering the model

Out of fold

f5 <- funnelModel(test ~ gender | hosp_id, pointTarget(limits = c(0.05,0.01), standardised = TRUE), adj = adjParams(method = "out_of_fold",nfolds = 5), 
                  data = example_data)
plot(f5)

Plotting

Using the label and identify arguments we can label or plot only a selection of hospitals if desired. These arguments also accept "outliers" as a value.

plot(f4,label="outliers")

plot(f4,identify=13,label=13)

References

Benjamini, Y., & Hochberg, Y. (1995). Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the royal statistical society. Series B (Methodological), 289-300.

Jones, H. E., Ohlssen, D. I., & Spiegelhalter, D. J. (2008). Use of the false discovery rate when comparing multiple health care providers. Journal of clinical epidemiology, 61(3), 232-240.

Spiegelhalter, D. J. (2005). Funnel plots for comparing institutional performance. Statistics in medicine, 24(8), 1185-1202.

oizin/funnelplot documentation built on March 11, 2021, 2:58 p.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

Tweet to @rdrrHQ

GitHub issue tracker

ian@mutexlabs.com