knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
curesurv
is an R-package to fit a variaty of cure models using excess hazard modelling methodology. It can be a mixture cure model with the survival of uncured patients following a Weibull (Botta et al. 2023, BMC Medical Research Methodology). The package also implements non-mixture cure models such as the time-to-null excess hazard model proposed by Boussari et al (2021) [@boussari2021modeling].
If the modelling assumption of comparability between the expected hazard in the study cohort and the general population doesn't hold, an extra effect (due to life table mismatch) can be estimated for these two classes of cure models. In the following we will only be interested by mixture cure models implemented in the curesurv
R-package.
For more details regarding mixture cure models in net survival setting please read the methods section in the article untitled: "A new cure model that corrects for increased risk of non-cancer death. Analysis of reliability and robustness, and application to real-life data" in BMC medical research methodology.
The latest version of curesurv
can be installed using the tar.gz version of the R-package uploaded to the journal website using the following r script:
install.packages("curesurv_0.1.2.tar.gz", repos = NULL, type = "source")
curesurv
depends on the stringr
and survival
R packages, which can be installed directly from CRAN.
It also uses other R packages that can be imported, such as: numDeriv
, stats
, randtoolbox
, bbmle
, optimx
, Formula
, Deriv
, statmod
, and xhaz
.
Once these other packages are installed, load the curesurv
R package.
library(xhaz) library(survexp.fr) library(curesurv)
We illustrate the fitting of mixture cure models using a simulated dataset named testiscancer
. This dataset is available in the curesurv
package. It consists of 2,000 patients with information on their age, follow-up time, and vital status. The patients are assumed to be male and diagnosed in 2000-01-01 for a testis cancer. The French life table in the R package survexp.fr
can be used as the mortality table of the French general population for illustration purposes.
# We had these information in the dataset data("testiscancer", package = "curesurv") head(testiscancer) dim(testiscancer) testiscancer$sex <- "male" levels(testiscancer$sex) <- c("male", "female") testiscancer$year <- as.Date("2000-01-01")
Here we show how to extract background mortality information from survexp.fr
[-@jais2022package], the life table for France available in the eponymous R package.
attributes(survexp.fr) fit.haz <- exphaz( formula = Surv(time_obs, event) ~ 1, data = testiscancer, ratetable = survexp.fr, only_ehazard = FALSE, rmap = list(age = 'age', sex = 'sex', year = 'year'))
The testiscancer database already has instantaneous population hazard and cumulative population hazard. Otherwise, it would be possible to obtain them from fit.haz$ehazard
and fit.haz$ehazardInt
#instantaneous population hazard testiscancer$haz <- testiscancer$ehazard #cumulative population hazard testiscancer$cumhaz <- testiscancer$cumehazard
We fit a mixture cure model in the excess hazard modeling setting. We assume that reduced and centered age affect the cure rate through a logistic link function and uncured survival proportionally [@de1999mixture].
fit_mod0 <- curesurv(Surv(time_obs, event) ~ age_cr | age_cr, pophaz = "haz", cumpophaz = "cumhaz", model = "mixture", dist = "weib", data = testiscancer, method_opt = "L-BFGS-B") fit_mod0
We found that the effect of reduced and centered age on survival of uncured patients was 0.29 with a standard error of 0.08. The cure rates and their 95% confidence intervals for patients 51.05 and 70.01 years of age (representing individuals with the average age in the study cohort and that of individuals whose age corresponds to an increase of one unit of reduced centered age) were 77.67% [74.76; 80.60] and 51.15% [46.20; 56.10], respectively.
We fit a new mixture cure model in the excess hazard modelling setting proposed by Botta et al. 2023. It allows the background mortality to be corrected using the argument pophaz.alpha = TRUE
, in order to account for increased non-cancer mortality as in [@phillips2002estimating]. As previously, the new model assumes that reduced and centered age affects the cure rate through a logistic link function and the uncured survival proportionally. This model was presented at the 15th Francophone Conference on Clinical Epidemiology (EPICLIN) and at the 28th Journées des statisticiens des centres anticancéreux (CLCC), in Marseille, France [@BOTTA2021S24].
fit_mod1 <- curesurv(Surv(time_obs, event) ~ age_cr | age_cr, pophaz = "haz", cumpophaz = "cumhaz", model = "mixture", dist = "weib", data = testiscancer, pophaz.alpha = TRUE, method_opt = "L-BFGS-B") fit_mod1
We found that the effect of reduced and centered age on survival of uncured patients was higher in the new model and was 0.35 with a standard error of 0.13. The cure rates and their 95% confidence intervals for patients 51.05 and 70.01 years of age (representing individuals with the average age in the study cohort and that of individuals whose age corresponds to an increase of one unit of reduced centered age) were also higher with the new mixed cure model, with estimates of 86.73% [83.69; 89.80]% and 81.59% [77.19; 86.00]%, respectively. The new mixture cure model also estimated the increased risk of non-cancer death to be 1.96, with 95% confidence intervals [1.77; 2.14] not including 1. These results support the hypothesis of increased non-cancer mortality in the simulated testicular cancer patients.
We can compare the output of these two models using Akaike information criteria (AIC).
AIC(fit_mod0,fit_mod1)
The best model was the new mixture cure model accounting for increased risk of non-cancer death.
We can be interested in the prediction of excess hazard and net survival. We propose to calculate these predictions of excess hazard and net survival at equal time 2 years since diagnosis, as a function of centered and reduced age ranging from -1.39 to 1.5, which corresponds to age values ranging from 24.69 to 83.48. We also added the curves of P(t), the probability of being cured at a given time t after diagnosis knowing that he/she was alive until the time 2 years since diagnosis [@BOUSSARI201872],which can be obtained from the estimates of these models.
val_age <- seq(-1.39, 1.8, 0.1) newage <- round(val_age * sd(testiscancer$age) + mean(testiscancer$age), 2) newdata3 <- with(testiscancer, expand.grid( event = 0, age_cr = val_age, time_obs = 2)) pred_age_mod0 <- predict(object = fit_mod0, newdata = newdata3) pred_age_mod1 <- predict(object = fit_mod1, newdata = newdata3)
oldpar <- par(no.readonly = TRUE) par(mfrow=c(2,2)) plot(newage, pred_age_mod0$ex_haz, type = "l", lty=1, lwd=2, xlab = "age", ylab = "excess hazard") lines(newage, pred_age_mod1$ex_haz, type = "l", lty=1, lwd=2, col = "blue") legend("topleft", legend = c("M0", "M1"), lty = c(1,1), lwd = c(2,2), col = c("black", "blue")) grid() plot(newage, pred_age_mod0$netsurv , type = "l", lty=1, lwd=2, xlab = "age", ylab = "net survival") lines(newage, pred_age_mod1$netsurv , type = "l", lty=1, lwd=2, col = "blue") grid() legend("bottomleft", legend = c("M0", "M1"), lty = c(1,1), lwd = c(2,2), col = c("black", "blue")) plot(newage, pred_age_mod0$pt_cure, type = "l", lty=1, lwd=2, xlab = "age", ylab = "P(t)") grid() lines(newage, pred_age_mod1$pt_cure, type = "l", lty=1, lwd=2, xlab = "age", col = "blue") grid() legend("bottomleft", legend = c("M0", "M1"), lty = c(1,1), lwd = c(2,2), col = c("black", "blue")) par(oldpar)
We propose to calculate these predictions of excess hazard and net survival at varying time since diagnosis, and at fixed age 50 and 70, the median and 3rd quartile. We also added the curves of P(t), the probability of being cured at a given time t after diagnosis knowing that he/she was alive until the time t since diagnosis [@BOUSSARI201872],which can be obtained from the estimates of these models.
age50 <- c(50) agecr50 <- (age50 - mean(testiscancer$age))/sd(testiscancer$age) age70 <- c(70) agecr70 <- (age70 - mean(testiscancer$age))/sd(testiscancer$age) time <- seq(0.1, 15, 0.1) newdata_age50 <- with(testiscancer, expand.grid( event = 0, age_cr = agecr50, time_obs = time)) newdata_age70 <- with(testiscancer, expand.grid( event = 0, age_cr = agecr70, time_obs = time)) pred_age50_mod0 <- predict(object = fit_mod0, newdata = newdata_age50) pred_age70_mod0 <- predict(object = fit_mod0, newdata = newdata_age70) pred_age50_mod1 <- predict(object = fit_mod1, newdata = newdata_age50) pred_age70_mod1 <- predict(object = fit_mod1, newdata = newdata_age70)
It is possible to plot these predictions directly using plot. With fun
parameter you can choose between fun="all"
which plot excess hazard, net survival and probability of being cured knowing patient is alive, fun="haz"
which plot excess hazard, fun="surv"
which plots net survival, fun="pt_cure"
which plots probability of being cured at t knowing patient is alive at t, fun="cumhaz"
for cumulative excess hazard and fun="logcumhaz"
for logairthm of cumulativ excess hazard. For fun="surv"
, fun="haz"
, fun="pt_cure"
and fun="logcumhaz"
it is also possible to include confidence interval using conf.int=TRUE
. These options are shown below
plot(pred_age50_mod0, fun="all",conf.int=FALSE,lwd.main = 2,lwd.sub = 2)
plot(pred_age50_mod0, fun="haz",conf.int=TRUE,conf.type="plain") plot(pred_age50_mod0, fun="surv",conf.int=TRUE,conf.type="log-log") plot(pred_age50_mod0, fun="pt_cure",conf.int=TRUE,conf.type="log") plot(pred_age50_mod0, fun="cumhaz",conf.int=FALSE) plot(pred_age50_mod0, fun="logcumhaz",conf.int=TRUE,conf.type="plain")
This plot function specific for curesurv predictions can be very useful, however it only works for time varying (other variable fixed) predictions, and it doesn't allow comparison between models. For this reason we also show how to plot predictions from different models in one figure :
par(mfrow=c(2,2)) plot( time, pred_age50_mod0$ex_haz, type = "l", lty = 1, lwd = 2, xlab = "time since diagnosis (years)", ylab = "excess hazard", ylim = c(0,0.20)) lines( time, pred_age50_mod1$ex_haz, type = "l", col = "blue", lty = 1, lwd = 2) lines( time, pred_age70_mod0$ex_haz, type = "l", lty = 2, lwd = 2) lines( time, pred_age70_mod1$ex_haz, type = "l", col = "blue", lty = 2, lwd = 2) grid() legend("topright", legend = c("M0 - age = 50", "M1 - age = 50", "M0- age = 70", "M1- age = 70"), lty = c(1,1,2,2), lwd = c(2,2,2,2), col = c("black", "blue", "black", "blue")) plot(time, pred_age50_mod0$netsurv , type = "l", lty=1, lwd=2, xlab = "time since diagnosis (years)", ylab = "net survival", ylim = c(0,1)) lines(time, pred_age50_mod1$netsurv , type = "l", lty=1, lwd=2, col = "blue") lines(time, pred_age70_mod0$netsurv , type = "l", lty=2, lwd=2) lines(time, pred_age70_mod1$netsurv , type = "l", lty=2, lwd=2, col = "blue") grid() legend("bottomleft", legend = c("M0 - age = 50", "M1 - age = 50", "M0- age = 70", "M1- age = 70"), lty = c(1,1,2,2), lwd = c(2,2,2,2), col = c("black", "blue", "black", "blue")) plot(time, pred_age50_mod0$pt_cure, type = "l", lty=1, lwd=2, xlab = "time since diagnosis (years)", ylab = "P(t)", ylim = c(0,1)) grid() lines(time, pred_age50_mod1$pt_cure, type = "l", lty=1, lwd=2, xlab = "age", col = "blue") lines(time, pred_age70_mod0$pt_cure, type = "l", lty=2, lwd=2) lines(time, pred_age70_mod1$pt_cure, type = "l", lty=2, lwd=2, xlab = "age", col = "blue") grid() legend("bottomright", legend = c("M0 - age = 50", "M1 - age = 50", "M0- age = 70", "M1- age = 70"), lty = c(1,1,2,2), lwd = c(2,2,2,2), col = c("black", "blue", "black", "blue")) par(oldpar)
date() sessionInfo()
GPL 3.0, for academic use.
We are grateful to the "Fondation ARC pour la recherche sur le cancer" for its support of this work.
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.