External validation using `pmcalibration`
In pmcalibration: Calibration Curves for Clinical Prediction Models

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

Binary outcome

library(pmcalibration)

# simulate some data for vignette
set.seed(2345)
dat <- sim_dat(1000, a1 = -3, a3 = .3)

# show the first 3 columns (col 4 is the true linear predictor/LP)
head(dat[-4])

We have data with a binary outcome, y, and two 'predictor' variables, x1 and x2. Suppose we have an existing model for predicting y from x1 and x2 that is as follows

p(y = 1) = plogis( -3 + 1*x1 + 1*x2 )

To externally validate this model on this new data we need to calculate the predicted probabilities. We'll also extract the observed outcomes.

p <- plogis(with(dat, -3 + x1 + x2))
y <- dat$y

First we can check 'calibration-in-the-large' via the calibration intercept and slope.

logistic_cal(y = y, p = p)

The calibration-intercept suggests no particular bias with a point estimate not far off zero. The calibration slope suggests that predicted probabilities are too extreme. However, this logistic calibration enforces a linear relationship between logit transformed probabilities and the log odds of y = 1.

Below we use pmcalibration to fit a flexible calibration curve, allowing for a non-linear relationship between predicted and actual probabilities. This assesses 'moderate calibration' according to the hierarchy of Van Calster et al. (2016).

In the example below, we fit a calibration curve using mgcv::gam via a penalized thin plate regression spline (see ?mgcv::tprs). pmcalibration calculates various metrics from the absolute difference between the predicted probability and the actual probability (as estimated by the calibration curve). In this case 95% confidence intervals for these metrics are calculated via simulation based inference.

(cc <- pmcalibration(y = y, p = p, 
                     smooth = "gam", bs = "tp", 
                     k = 10, transf="logit",
                     ci = "sim", method="REML"))

The printed metrics can be interpreted as follows:

Eavg suggests that the average difference between prediction and actual probability of the outcome is 0.02 (or 2%) with a 95% CI of [0.009, 0.031].
E50 is the median difference between prediction and observed probability (inferred from calibration curve). 50% of differences are 0.013 or smaller.
E90 is the 90th percentile difference. 90% of differences are 0.024 or smaller.
Emax is the largest observed difference between predicted and observed probability. The model can be off by up to 0.21, with a broad confidence interval.
ECI is the average squared difference between predicted and observed probabilities (multiplied by 100). See Van Hoorde et al. (2015).

A quick and simple plot of the calibration curve, and 95% confidence interval, can be obtained via plot.

plot(cc)

Or one could use get_cc to extract data for plotting with method of your choice. The plot below also shows the distribution of predicted probabilities.

library(ggplot2)

pcc <- get_cc(cc)

ggplot(pcc, aes(x = p, y = p_c, ymin=lower, ymax=upper)) +
  geom_abline(intercept = 0, slope = 1, lty=2) +
  geom_line() +
  geom_ribbon(alpha = 1/2, fill="lightblue") +
  coord_cartesian(xlim=c(0,1), ylim=c(0,1)) +
  labs(x = "Predicted", y = "Estimated") +
  theme_bw(base_size = 14) +
  geom_histogram(data = data.frame(p = p), aes(x=p, y=after_stat(density)*.01),
                 binwidth = .001, inherit.aes = F, alpha=1/2)

The model in its current form very slightly underestimates risk at low levels of predicted risk and then overestimates risk at predicted probabilities of over 0.4.

The results above can be compared with rms::val.prob. Note that this uses lowess(p, y, iter=0) to fit a non-linear (nonparametric) calibration curve. This calibration curve suggests that the overestimation at high levels of predicted risk is even more extreme that that suggested by gam calibration curve above. This is particularly evident in the estimate of Emax (0.35 vs 0.21).

library(rms)
val.prob(p = p, y = y)

Note also that the calibration intercept reported by rms::val.prob comes from the same logistic regression as that used to estimate the calibration slope. In logistic_cal the calibration intercept is estimated via a glm with logit transformed predicted probabilities included as an offset term (i.e., with slope fixed to 1 - see, e.g., Van Calster et al., 2016). The calibration slope is estimated via a separate glm.

\ \

Time to event outcome

The code below produces a calibration curve, and associated metrics, for a time-to-event outcome.

library(simsurv)
library(survival)

# simulate some data
n <- 2000
X <- data.frame(id = seq(n), x1 = rnorm(n), x2 = rnorm(n))
X$x3 <- X$x1*X$x2 # interaction

b <- c("x1" = -.2, "x2" = -.2, "x3" = .1)

d <- simsurv(dist = "weibull", lambdas = .01, gammas = 1.5, x = X, betas = b, seed = 246)

mean(d$eventtime)
median(d$eventtime)
mean(d$status) # no censoring

d <- cbind(d, X[,-1])

head(d)

# split into development and validation
ddev <- d[1:1000, ]
dval <- d[1001:2000, ]

# fit a cox model
cph <- coxph(Surv(eventtime, status) ~ x1 + x2, data = ddev)

# predicted probability of event at time = 15
p = 1 - exp(-predict(cph, type="expected", newdata = data.frame(eventtime=15, status=1, x1=dval$x1, x2=dval$x2)))

y <- with(dval, Surv(eventtime, status))

# calibration curve at time = 15
(cc <- pmcalibration(y = y, p = p, smooth = "rcs", nk = 5, ci = "pw", time = 15))
# pointwise standard errors for plot but no CI for metrics
# 'boot' CIs are also available for time to event outcomes

plot(cc)
mtext("time = 15")

Compare to rms::val.surv, which with the arguments specified below uses polspline::hare to fit a calibration curve. Note val.surv uses probability of surviving until time = u not probability of event occurring by time = u.

plot(val.surv(S = y, est.surv = 1-p, u=15, fun = function(x) log(-log(x))))

\ \ \

Any scripts or data that you put into this service are public.

pmcalibration documentation built on Sept. 8, 2023, 5:10 p.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

pmcalibration
Calibration Curves for Clinical Prediction Models

External validation using `pmcalibration`
In pmcalibration: Calibration Curves for Clinical Prediction Models

Binary outcome

Time to event outcome

Try the pmcalibration package in your browser

R Package Documentation

Browse R Packages

We want your feedback!

pmcalibration Calibration Curves for Clinical Prediction Models

External validation using `pmcalibration` In pmcalibration: Calibration Curves for Clinical Prediction Models

Binary outcome

Time to event outcome

Try the pmcalibration package in your browser

R Package Documentation

Browse R Packages

We want your feedback!

pmcalibration
Calibration Curves for Clinical Prediction Models

External validation using `pmcalibration`
In pmcalibration: Calibration Curves for Clinical Prediction Models