In bcjaeger/survival.calib: Tests of Calibration in the Survival Setting

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)

survival.calib

The goal of survival.calib is to provide convenient access to tests for mis-calibration in the survival setting. It provides functions to help users apply these tests in a highly customized way (see gnd_test_manual) or following a protocol that the developers recommend (see gnd_test_auto).

Installation

You can install survival.calib from GitHub with:

remotes::install_github("bcjaeger/survival.calib")

Example

Let's apply the Namwood-Green-D'Agostino (GND) test to assess calibration of a proportional hazards model. First, we load packages and curate some data for modeling.

library(survival.calib)
library(survival)
library(riskRegression)

model_vars <- c('futime', 'death', 'age', 'sex', 'flc.grp', 'lambda')
model_data <- survival::flchain[, model_vars]

head(model_data)

Next, we can

designate a set of rows that will define the training set,
fit a model to the training set,
predict risk for mortality in a testing set.

# step 1
set.seed(730)
train_index <- sample(x = nrow(model_data), size = 2000)

# step 2
risk_mdl <- coxph(
 formula = Surv(futime, death) ~ age + sex + flc.grp + lambda,
 data = flchain[train_index,],
 x = TRUE
)

# step 3
pred_horizon <- 4000
pred_risk <- predictRisk(risk_mdl,
                         newdata = flchain[-train_index, ],
                         times = pred_horizon)

Once we have predicted risk values, we can create a scalib (survival calibration) object:

sc <- scalib(pred_risk = list(cph = pred_risk),
             pred_horizon = pred_horizon,
             event_time = flchain$futime[-train_index],
             event_status = flchain$death[-train_index])

print(sc)

Now we can go in two directions:

Create risk groups by hand and pass the risk group assignments into scalib_gnd_manual.
Use scalib_gnd, which creates risk groups for you.

We'll show both approaches.

GND test (manual approach)

The manual GND test requires a group input value that designates which group each observation in the testing data belongs to. Creation of the group input is up to the user.

# Create risk groups:
pctls <- quantile(pred_risk, probs = c(0:10) / 10)
risk_groups <- cut(pred_risk, breaks = pctls, include.lowest = TRUE)
risk_groups <- as.numeric(risk_groups)

# supply predicted risk, observed outcomes, 
# time of prediction, and group assignments
# to the manual GND test function.

gnd_result_manual <- scalib_gnd_manual(scalib_object = sc, 
                                       pred_risk_col = 'cph',
                                       group = risk_groups)

# print the updated scalib object
gnd_result_manual

GND test (automatic approach)

Demler et al. show that the GND test has high variability if there are less than 5 events in any of the risk groups. When this occurs, the authors recommend lumping groups with < 5 events into the nearest risk group. The gnd_test_auto automates this approach, allowing users to forego manual creation of risk groups. For example, below we apply gnd_test_auto with group_count_init = 45, which initiates the following algorithm:

Create 45 groups based on quantiles of predicted risk.
Check the event counts in each group
If any groups have < 5 events, collapse the group with fewest events into its neighbor, and repeat step 2
run gnd_test_manual with the (potentially modified) groups.

gnd_result_auto <- scalib_gnd(scalib_object = sc,
                              group_method = 'lump', 
                              group_count_init = 45,
                              verbose = 1)

# print the updated scalib object
gnd_result_auto

You can see from the printed output that the p-value with 10 groups is different from the p-value with 43 groups. Simulation studies have focused on using 10 groups and this approach has shown good statistical properties. So, use 10 groups (the default value for group_count_init).

References

Demler, O.V., Paynter, N.P. and Cook, N.R., 2015. Tests of calibration and goodness‐of‐fit in the survival setting. Statistics in medicine, 34(10), pp.1659-1680. DOI: 10.1002/sim.6428

bcjaeger/survival.calib documentation built on June 15, 2022, 7:47 a.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

Tweet to @rdrrHQ

GitHub issue tracker

ian@mutexlabs.com