knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
#| echo: true #| label: load-packages library(curesurv)
When the cause of death is unknown, the most common method to estimate the cancer-related survival is net survival. Its estimation assumes that the observed hazard $\lambda_{obs}$ is equal to the sum of the known background mortality hazard in the general population $\lambda_{pop}$ (obtained from national Statistic Institutes such as INSEE in France) and the excess hazard (due to cancer) $\lambda_{exc}$. For one individual $i$, this relation can be expressed as:
$$ \lambda_{obs}(t_i|z_i) = \lambda_{pop}(t_i+a_i|z_{pop,i}) + \lambda_{exc}(t_i|z_i) $$ where $Z_{pop}\subset Z$ are the variables of the mortality table and of the model respectively.
The cumulative observed hazard can be written as: $$ \Lambda_{obs}(t|z) = \Lambda_{pop}(t+a|z) + \Lambda_{exc}(t|z) $$ and the net survival is obtained as following: $$ S_{n}(t|z) = \exp(-\Lambda_{exc}(t|z)) $$
The TNEH model is a relatively recent excess hazard model developed by Boussari et al. $\$
The particularity of this model is that it enables the estimation, at the same time as the classical parameters of a model of the excess rate, of a quantity which is obtained by post-estimation by the classical models: it concerns the time after which the excess rate becomes null i.e. the cure point.
The excess hazard proposed can be expressed as following:
$$ \lambda_{exc}(t|z;\theta) = \left(\dfrac{t}{\tau(z;\tau)}\right)^{\alpha(z;\alpha)-1} \left(1 - \dfrac{t}{\tau(z;\tau)}\right)^{\beta-1} 1_{\left{0 \le t \le \tau(z;\tau)\right}} $$
where : $\$
$\tau(z;\tau) > 0$ is the time to cure, depends on covariates z and vector of parameters $\tau$. It corresponds to the vector of parameters fitting the time-to-null excess hazard. $\$
$\alpha(z;\alpha) > 0$ and $\beta > 1$ are shape parameters. With $\beta>1$, the excess hazard is forced to be null and continuous in $\tau(z;\tau)$. $\$
The vector of parameters to be estimated is $\theta = (\alpha, \beta, \tau)$ with $\alpha(z;\alpha*) > 0$ .
$$ \Lambda_{exc}(t|z;\theta) = \tau(z;\tau) B \left( \alpha(z;\alpha), \beta \right) F_{Be} \left( \dfrac{t}{\tau(z;\tau)} ; \alpha(z;\alpha) , \beta \right) $$
where
B is the beta function $\$ $F_{Be}$ is the cumulative distribution function (cdf) of the beta distribution
$$ S_n(t|z) = \exp(-\Lambda_{exc}(t|z)) = \exp\left(-\tau(z;\tau) B \left( \alpha(z;\alpha), \beta \right) F_{Be} \left( \dfrac{t}{\tau(z;\tau)} ; \alpha(z;\alpha),\beta \right)\right) $$
The cure fraction corresponds to the net survival at $t = \tau$ in TNEH model. It can be expressed as:
$$ \pi(z|\theta) = \exp\left(-\Lambda_{exc}(\tau(z;\tau)|z)\right) = \exp\left(-\tau(z;\tau) B \left( \alpha(z;\alpha*), \beta \right)\right) $$
This quantity corresponds to the probability P(t) of being cured at a given time t after diagnosis knowing that he/she was alive up to time t. It can be expressed as following:
$$ P(t|z) = \dfrac{\pi(z|\theta)}{S_n(t|z)} = \exp \left( \tau(z;\tau) \left( B \left( \dfrac{t}{\tau(z;\tau)} ; \alpha(z;\alpha*) , \beta \right) - B(\alpha, \beta) \right) \right) $$ To calculates the confidence intervals of $P(t|z)$, can be obtained using the delta method. The application of this method requires the partial derivatives of $P(t|z)$ with respect of the parameters of the model. This can be written as:
$$ \dfrac{\partial P(t|z)}{\partial \theta} = \dfrac{1}{S_n(t|z)^2} \left( \dfrac{\partial \pi(z|\theta)}{\partial \theta} S_n(t|z) - \dfrac{\partial S_n(t|z)}{\partial \theta} \pi(z|\theta) \right) $$
There are no covariables acting on parameters $\tau$ ($\tau=\tau_0$) and $\alpha$ ($\alpha=\alpha_0$)
#| echo: true #| warning: false #| message: false #| label: nocov fit_ad_tneh_nocov <- curesurv(Surv(time_obs, event) ~ 1, pophaz = "ehazard", cumpophaz = "cumehazard", model = "nmixture", dist = "tneh", link_tau = "linear", data = testiscancer, method_opt = "L-BFGS-B")
summary(fit_ad_tneh_nocov)
We can see that the time-to-cure $\tau=\tau_0$ is estimated at 5.102 years, and the cure proportion is estimated at 84.74%.
We predict for varying time after diagnosis
#| echo: true #| label: predictnocov newdata1 <- with(testiscancer, expand.grid(event = 0, time_obs = seq(0.1,10,0.1))) p_28 <- predict(object = fit_ad_tneh_nocov, newdata = newdata1)
#| echo: true #| label: plotpredictnocov #| fig.height: 8 #| fig.width: 10 plot(p_28)
The confidence intervals at $1-\alpha$ level for the cure fraction $\pi$ can be written as:
$$ \left[\hat{\pi} \pm z_{1 - \alpha / 2} \sqrt{Var(\hat{\hat{\pi}})}\right] $$ where $$ Var(\hat{\pi}) = \dfrac{\partial \hat{\pi}}{\partial \theta} Var(\theta) \left(\dfrac{\partial \hat{\pi}}{\partial \theta}\right)^T $$
p_28[1,c("lower_bound_pi_tneh","cured","upper_bound_pi_tneh")]
We search the time $\text{t}=\text{TTC}_i$ from which $\text{P}_i(t) = 1-\epsilon$. $\epsilon$ can be fixed to 0.95.
The variance formula can be expressed as:
$$ Var(TTC) = Var(g(\theta;z_i)) \simeq \left(\dfrac{\partial P(t|z_i;\theta)}{\partial t}{|t = TTC}\right)^{-2} Var(P(t|z_i;\theta)){|t=TTC} $$
p_28[1,c("lower_bound_TTC_tneh","time_to_cure_ttc","upper_bound_TTC_tneh")]
We create a new age variable : age_crmin the reduced age and "centered" around the age of the youngest patient
#| echo: true #| warning: false #| message: false testiscancer$age_crmin <- (testiscancer$age- min(testiscancer$age)) / sd(testiscancer$age)
#| echo: false #| warning: false #| message: false oldpar<-par(no.readonly = TRUE) par(mfrow=c(1,1)) plot(testiscancer$age,testiscancer$age_crmin,type="l",xlab="age",ylab="age_crmin") text(x=min(testiscancer$age),y=0,col="blue",round(min(testiscancer$age),2)) lines(x=rep(min(testiscancer$age),2),y=c(-1,0),col="blue",lty=2) lines(x=c(-1,min(testiscancer$age)),y=c(0,0),col="blue",lty=2) text(x=min(testiscancer$age)+sd(testiscancer$age),y=1.2,col="red",round(sd(testiscancer$age)+min(testiscancer$age),2)) lines(x=rep(min(testiscancer$age)+sd(testiscancer$age),2),y=c(-1,1),col="red",lty=2) lines(x=c(-1,min(testiscancer$age)+sd(testiscancer$age)),y=c(1,1),col="red",lty=2) par(oldpar)
This model has the variable age_crmin acting on both $\alpha$ and $\tau$ : ($\alpha=\alpha_0+Z_{age_crmin}\times\alpha_1$ et $\tau=\tau_0+Z_{age_crmin}\times\tau_1$)
#| echo: true #| warning: false #| message: false #| label: withcov fit_m1_ad_tneh <- curesurv(Surv(time_obs, event) ~ z_tau(age_crmin) + z_alpha(age_crmin), pophaz = "ehazard", cumpophaz = "cumehazard", model = "nmixture", dist = "tneh", link_tau = "linear", data = testiscancer, method_opt = "L-BFGS-B")
summary(fit_m1_ad_tneh)
For the reference individual (that is the patient with age_crmin=0, so the youngest patient, approximately 20y old at diagnosis), the cure proportion is estimated at 96,75% and the time to null excess hazard at 3.258 year. For an individual whose age_ is_crmin is 1 (that is they are aged the standard deviation more than the youngest, that is approximately 39y old at diagnosis), the time to null excess hazard is 6.721 year.
With varying time for patient of mean age
#| echo: true #time varying prediction for patient with age mean newdata1 <- with(testiscancer, expand.grid(event = 0, age_crmin = mean(age_crmin), time_obs = seq(0.001,10,0.1))) pred_agemean <- predict(object = fit_m1_ad_tneh, newdata = newdata1)
With varying time for oldest patient
#| echo: true #time varying prediction for patient with age max newdata2 <- with(testiscancer, expand.grid(event = 0, age_crmin = max(age_crmin), time_obs = seq(0.001,10,0.1))) pred_agemax <- predict(object = fit_m1_ad_tneh, newdata = newdata2)
At 2 years after diagnostic for patients of different ages
#| echo: true # predictions at time 2 years with varying age newdata3 <- with(testiscancer, expand.grid(event = 0, age_crmin = seq(min(testiscancer$age_crmin), max(testiscancer$age_crmin), 0.1), time_obs = 2)) pred_age_val <- predict(object = fit_m1_ad_tneh, newdata = newdata3) val_age <- seq(min(testiscancer$age_crmin), max(testiscancer$age_crmin), 0.1) * sd(testiscancer$age) + min(testiscancer$age)
#| echo: true #| label: predict_tnehcov #| fig.height: 10 #| fig.width: 10 par(mfrow = c(2, 2),cex = 1.0) plot(pred_agemax$time,pred_agemax$ex_haz,type = "l",lty = 1,lwd = 2, xlab = "Time since diagnosis", ylab = "excess hazard") lines(pred_agemean$time,pred_agemean$ex_haz,type = "l",lty = 2,lwd = 2) legend("topright",horiz = FALSE, legend = c("hE(t) age.max = 79.9", "hE(t) age.mean = 50.8"), col = c("black", "black"),lty = c(1, 2, 1, 1, 2, 2)) grid() plot(pred_agemax$time,pred_agemax$netsurv,type = "l",lty = 1,lwd = 2, ylim = c(0, 1),xlab = "Time since diagnosis",ylab = "net survival") lines(pred_agemean$time,pred_agemean$netsurv,type = "l",lty = 2,lwd = 2) legend("bottomleft",horiz = FALSE, legend = c("Sn(t) age.max = 79.9", "Sn(t) age.mean = 50.8"), col = c("black", "black"),lty = c(1, 2, 1, 1, 2, 2)) grid() plot(pred_agemax$time,pred_agemax$pt_cure,type = "l",lty = 1,lwd = 2,ylim = c(0, 1), xlab = "Time since diagnosis",ylab = "probability of being cured P(t)") lines(pred_agemean$time,pred_agemean$pt_cure,type = "l",lty = 2,lwd = 2) abline(v = pred_agemean$tau[1],lty = 2,lwd = 2,col = "blue") abline(v = pred_agemean$time_to_cure_ttc[1],lty = 2,lwd = 2,col = "red") abline(v = pred_agemax$tau[1],lty = 1,lwd = 2,col = "blue") abline(v = pred_agemax$time_to_cure_ttc[1],lty = 1,lwd = 2,col = "red") grid() legend("bottomright",horiz = FALSE, legend = c("P(t) age.max = 79.9","P(t) age.mean = 50.8", "TNEH age.max = 79.9","TTC age.max = 79.9", "TNEH age.mean = 50.8","TTC age.mean = 50.8"), col = c("black", "black", "blue", "red", "blue", "red"), lty = c(1, 2, 1, 1, 2, 2)) par(oldpar)
#| echo: true #| label: predicted_to_plot_by_age #| fig.height: 8 #| fig.width: 10 par(mfrow=c(2,2)) plot(val_age,pred_age_val$ex_haz, type = "l",lty=1, lwd=2, xlab = "age",ylab = "excess hazard 2y after diagnosis") grid() plot(val_age,pred_age_val$netsurv, type = "l", lty=1,lwd=2,ylim=c(0,1), xlab = "age", ylab = "net survival 2y after diagnosis") grid() plot(val_age,pred_age_val$pt_cure, type = "l", lty=1, lwd=2,ylim=c(0,1), xlab = "age",ylab = "P(t) 2y after diagnosis") grid() plot(val_age,pred_age_val$cured, type = "l", lty=1, lwd=2,ylim=c(0,1), xlab = "age", ylab = "cure proportion") grid() par(oldpar)
Effects of centered and reduced age on $\tau$ ($tau=\tau_0+Z_{age_cr}\times\tau_1$, $\alpha=\alpha_0$)
#| echo: true #| label: withtauonly #| warning: false #| message: false fit_ad_tneh_covtau <- curesurv( Surv(time_obs, event) ~ z_tau(age_cr), pophaz = "ehazard", cumpophaz = "cumehazard", model = "nmixture", dist = "tneh", link_tau = "linear", data = testiscancer, method_opt = "L-BFGS-B" )
#| echo: true #| label: withtauonly summary(fit_ad_tneh_covtau)
This model estimates a cure proportion of 79.4% for individuals of mean age (51.05 year old at diagnosis) with a time to null excess hazard 7.44 year. For an individual of age at diagnosis 70.01 (age_cr=1), the time to null excess hazard is estimated at 9.754. This model estimates that the time to null excess hazard increases by 2.3159 when age_cr increases by (that is, when age at diagnosis increases by 18.96)
#| echo: true #| label: predicted summary(testiscancer$age_cr) summary(testiscancer$age) newdata2 <- with(testiscancer, expand.grid(event = 0, time_obs = seq(0.1, 10, 0.1), age_cr = c(-0.9358, 0.0276, 0.9525) )) newdata2_1stqu <- newdata2[newdata2$age_cr==-0.9358,] newdata2_2rdqu <- newdata2[newdata2$age_cr==0.0276,] newdata2_3rdqu <- newdata2[newdata2$age_cr==0.9525,] p1stqu <- predict(object = fit_ad_tneh_covtau, newdata = newdata2_1stqu) p2rdqu <- predict(object = fit_ad_tneh_covtau, newdata = newdata2_2rdqu) p3rdqu <- predict(object = fit_ad_tneh_covtau, newdata = newdata2_3rdqu)
#| echo: true #| label: plot_for_specific_age #| fig.height: 8 #| fig.width: 10 par(mfrow=c(2,2)) plot(p1stqu, main = "Excess hazard for age 33.3", fun = "haz") plot(p2rdqu, fun = "haz", main = "Excess hazard for age 51.57") plot(p3rdqu, fun = "haz", main = "Excess hazard for age 69.11") par(oldpar)
Effect of age_cr on only $\alpha$ ($\alpha=\alpha_0+Z_{age_cr}*\alpha_1$, $\tau=\tau_0$)
#| echo: true #| label: only_covariate_on_alpha #| message: false #| warning: false fit_ad_tneh_covalpha <- curesurv( Surv(time_obs, event) ~ z_alpha(age_cr), pophaz = "ehazard", cumpophaz = "cumehazard", model = "nmixture", dist = "tneh", link_tau = "linear", data = testiscancer, method_opt = "L-BFGS-B" )
#| echo: true #| message: false #| warning: false summary(fit_ad_tneh_covalpha)
The time to null excess hazard is estimated at 6.099 year for all individuals, and the cure proportion is estimated at 81.81% for patient of mean age (51.05 yo at diagnosis)
#| echo: true #| message: false #| warning: false #| include: true p4_33.3 <- predict(object = fit_ad_tneh_covalpha, newdata = newdata2_1stqu) p4_51.6 <- predict(object = fit_ad_tneh_covalpha, newdata = newdata2_2rdqu) p4_69.1 <- predict(object = fit_ad_tneh_covalpha, newdata = newdata2_3rdqu)
#| echo: true #| label: plot_for_specific_age_2 #| fig.height: 8 #| fig.width: 10 par(mfrow=c(2,2)) plot(p4_33.3, main = "Pt_cure for age 33.3", fun = "pt_cure") plot(p4_51.6, fun = "pt_cure", main = "Pt_cure for age 51.6") plot(p4_69.1, fun = "pt_cure", main = "Pt_cure for age 69.1") par(oldpar)
We fitted 4 models and to compare them we can use the AIC criteria
AIC(fit_ad_tneh_nocov,fit_ad_tneh_covalpha,fit_ad_tneh_covtau,fit_m1_ad_tneh)
The best model is the one with the lowest AIC : it's the one with an effect of age on both \$ \tau\$ and $\alpha$.
date() sessionInfo()
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.