Transforms cumulative hazard estimates to estimate survival analysis parameters specified by differential equations. We demonstrate how the method works on some commonly studied parameters.
We are interested in assessing parameters that solve differential equations driven by cumulative hazards $A$, i.e. \begin{equation}X_t = X_0 + \int_0^t F(X_s) dA_s,\end{equation} where $F = (F_1,F_2,\cdots)$ is Lipschitz, two times continuously differentiable, and satisfies a linear growth bound. Several examples of such parameters can be found in [^fn1] [^fn2]. The main function pluginEstimate in this package estimate such parameters. Among other things, pluginEstimate has the following input
We illustrate how to provide the correct inputs by use of examples below.
knitr::opts_chunk$set(fig.width=7, fig.height=5, fig.path='Figs/', echo=FALSE, warning=FALSE, message=FALSE) library(timereg) library(transform.hazards)
The survival function $S$ solves a differential equation with initial value 1; \begin{equation} S_t = 1 - \int_0^t S_{s} dA_s.\end{equation}
We generate a sample of exponentially distributed survival times with independent censoring, and calculate cumulative hazard estimates
n1 <- 300 Samp1 <- rexp(n1,1) cTimes <- runif(n1)*4 Samp1 <- pmin(Samp1,cTimes) isNotCensored <- cTimes != Samp1 startTimes1 <- rep(0,n1) aaMod1 <- aalen(Surv(startTimes1,Samp1,isNotCensored)~1)
We want to estimate the survival function. The cumulative hazard estimates are inserted as a time-ordered matrix where each column is an increment of the vector of cumulatie hazard estimates;
dA_est1 <- diff(c(0,aaMod1$cum[,2])) hazMatrix <- matrix(dA_est1,nrow=1)
In this case, $F$ takes the simple form $F(x) = -x$, and its Jacobian is therefore $J_{F}(x) = -1$. These must be provided as matrix-valued functions;
F_survival <- function(X)matrix(-X,nrow=1,ncol=1) F_survival_JacobianList <- list(function(X)matrix(-1,nrow=1,ncol=1))
We obtain plugin estimates of $S$ and its covariance using the call
param <- pluginEstimate(n1, hazMatrix, F_survival, F_survival_JacobianList,matrix(1,nrow=1,ncol=1),matrix(0,nrow=1,ncol=1))
Finally, we may plot the results with approximate 95 % confidence intervals
times1 <- aaMod1$cum[,1] plot(times1,param$X,type="s",xlim=c(0,4),xlab="t",ylab="",main="Survival") lines(times1,param$X + 1.96 * sqrt(param$covariance[1,1,]),type="s",lty=2) lines(times1,param$X - 1.96 * sqrt(param$covariance[1,1,]),type="s",lty=2) lines(times1,exp(-times1),col=2) legend("topright",c("estimated","true"),lty=1,col=c(1,2),bty="n")
The restricted mean survival function $R$ solves the system \begin{equation} \begin{pmatrix} S_t \ R_t \end{pmatrix} = \begin{pmatrix}1 \ 0 \end{pmatrix} + \int_0^t \begin{pmatrix} -S_{s} & 0 \ 0 & S_s \end{pmatrix}d \begin{pmatrix}A_s \ s \end{pmatrix},\end{equation}
where $S$ is the survival function. Here, the colums of $F$, $F_1$ and $F_2$, are given by \begin{equation} F_1(x_1,x_2) = \begin{pmatrix} -x_1 \ 0 \end{pmatrix}, F_2(x_1,x_2) = \begin{pmatrix} 0 \ x_1 \end{pmatrix}.\end{equation} The Jacobian matrices are therefore \begin{equation} J_{F_1}(x_1,x_2) = \begin{pmatrix} -1 & 0 \ 0 & 0 \end{pmatrix}, J_{F_2}(x_1,x_2) = \begin{pmatrix} 0 & 0 \ 1 & 0 \end{pmatrix}. \end{equation} We define these along with initial values below
F_restrict <- function(X)matrix(c(-X[1],0,0,X[1]),nrow=2) F_restrict_JacobianList <- list(function(X)matrix(c(-1,0,0,0),nrow=2), function(X)matrix(c(0,0,1,0),nrow=2,byrow=T)) X0_restrict <- matrix(c(1,0),nrow=2) V0_restrict <- matrix(0,nrow=2,ncol=2)
The restricted mean survival is a 'regular' (i.e. Lebesgue) integral, and we must therefore provide the time increments. We choose the time interval $[0,4]$ in $10^4$ increments:
fineTimes <- seq(0,4,length.out = 1e4+1) tms <- sort(unique(c(fineTimes,times1))) hazMatrix <- matrix(0,nrow=2,ncol=length(tms)) hazMatrix[1,match(times1,tms)] <- dA_est1 hazMatrix[2,] <- diff(c(0,tms))
We obtain plugin estimates using the call (note the last argument that in this example can be used to improve efficiency);
param <- pluginEstimate(n1, hazMatrix, F_restrict, F_restrict_JacobianList,X0_restrict,V0_restrict,isLebesgue = 2)
We plot the results
plot(tms,param$X[2,],type="s",xlim=c(0,4),ylim=c(0,1.5),xlab="t",ylab="",main="Restricted mean survival") lines(tms,param$X[2,] + 1.96 * sqrt(param$covariance[2,2,]),type="s",lty=2) lines(tms,param$X[2,] - 1.96 * sqrt(param$covariance[2,2,]),type="s",lty=2) lines(tms,1 - exp(-tms),col=2) legend("topleft",c("estimated","true"),lty=1,col=c(1,2),bty="n")
We generate another set of exponentially distributed survival times and compare the two groups. A new matrix of cumulative hazard increments must be created;
n2 <- 200 Samp2 <- rexp(n2,1.3) censTimes2 <- runif(n2)*3 Samp2 <- pmin(Samp2,censTimes2) isNotCensored2 <- censTimes2 != Samp2 startTimes2 <- rep(0,n2) aaMod2 <- aalen(Surv(startTimes2,Samp2,isNotCensored2)~1) dA_est2 <- diff(c(0,aaMod2$cum[,2])) times2 <- aaMod2$cum[,1] times <- sort(unique(c(times1,times2))) hazMatrix <- matrix(0,nrow=2,ncol=length(times)) hazMatrix[1,match(times1,times)] <- dA_est1 hazMatrix[2,match(times2,times)] <- dA_est2
The relative survival function $RS$ between groups 1 and 2 solve the equation \begin{equation} RS_t = 1 + \int_0^t \begin{pmatrix} -RS_s & RS_s\end{pmatrix} d\begin{pmatrix} A_s^1 \ A_s^2 \end{pmatrix}, \end{equation}
i.e. $F(x) = (-x,x)$. The Jacobians of the columns of $F$ are $J_{F_1}(x) = -1$, and $J_{F_2}(x) = 1$. We specify these functions as follows:
F_relsurv <- function(X)matrix(c(-X,X),nrow=1) F_relsurv_JacobianList <- list(function(X)matrix(-1,nrow=1), function(X)matrix(1,nrow=1))
The estimates are then obtained by the call
param <- pluginEstimate(n1+n2, hazMatrix, F_relsurv, F_relsurv_JacobianList,matrix(1,nrow=1,ncol=1),matrix(0,nrow=1,ncol=1))
We may plot the results with approximate 95% confidence intervals
plot(times,param$X,type="s",xlim=c(0,4),ylim=c(-0.5,4),xlab="t",ylab="",main="Relative survival") lines(times,param$X + 1.96 * sqrt(param$covariance[1,1,]),type="s",lty=2) lines(times,param$X - 1.96 * sqrt(param$covariance[1,1,]),type="s",lty=2) lines(times,exp(-times*(1/1.3 - 1)),col=2) legend("topleft",c("estimated","true"),lty=1,col=c(1,2),bty="n")
We may be in a situation with two competing risks. The cumulative incidences $C^1,C^2$ solve the system \begin{equation} \begin{pmatrix} S_t \ C_t^1 \ C_t^2 \end{pmatrix} = \begin{pmatrix} 1 \ 0 \ 0 \end{pmatrix} + \int_0^t \begin{pmatrix} -S_s & -S_s \ S_s & 0 \ 0 & S_s \end{pmatrix} d \begin{pmatrix} A^1_s \ A^2_s \end{pmatrix}, \end{equation} where $S$ is the survival. The first columns of $F$ are
\begin{equation}F_1(x_1,x_2,x_2) = \begin{pmatrix} -x_1 \ x_1 \ 0 \end{pmatrix}, F_2(x_1,x_2,x_2) = \begin{pmatrix} -x_1 \ 0 \ x_1 \end{pmatrix} \end{equation}
The reader may verify that the Jacobian matrix of $F_1$ and $F_1$ are \begin{equation} J_{F_1}(x_1,x_2,x_3) = \begin{pmatrix} -1 & 0 & 0 \ 1 & 0 & 0 \ 0 & 0 & 0 \end{pmatrix} , J_{F_2}(x_1,x_2,x_3) = \begin{pmatrix} -1 & 0 & 0 \ 0 & 0 & 0 \ 1 & 0 & 0 \end{pmatrix} \end{equation}
We specify the function $F$ and the list of the Jacobian matrices below, along with the initial values:
F_cuminc <- function(X)matrix(c(-X[1],-X[1],X[1],0,0,X[1]),nrow=3,byrow=T) F_cuminc_JacobianList <- list(function(X)matrix(c(-1,0,0,1,0,0,0,0,0),nrow=3,byrow=T), function(X)matrix(c(-1,0,0,0,0,0,1,0,0),nrow=3,byrow=T)) X0_cuminc <- matrix(c(1,0,0),nrow=3) v0_cuminc <- matrix(0,nrow=3,ncol=3)
Now for obtaining hazard estimates. We generate survival times first, before sampling two causes of death (and censoring). We then estimate the cause-specific cumulative hazards, and transform the estimates:
# Generate the data dfr <- data.frame(from=0,to=rexp(n1+n2,1),from.state = 1) # Adding causes of death (to.state = 2 or 3) and censoring (to.state = 0) dfr$to.state <- sample(c(0,2,3),n1+n2,replace=T,prob=c(0.2,0.2,0.6)) # Obtaining cause-specific cumulative hazard estimates and extracting the increments aamod22 <- aalen(Surv(from,to,to.state %in% c(2)) ~1, data=dfr) aamod33 <- aalen(Surv(from,to,to.state %in% c(3)) ~1, data=dfr) tmss = sort(unique(c(aamod22$cum[,1],aamod33$cum[,1]))) hazMatrix = matrix(0,nrow=2,ncol=length(tmss)) mt1 = match(aamod22$cum[,1],tmss) mt2 = match(aamod33$cum[,1],tmss) hazMatrix[1,mt1] = diff(c(0, aamod22$cum[,2] )) hazMatrix[2,mt2] = diff(c(0, aamod33$cum[,2] )) # Transforming the estimates param <- pluginEstimate(n1+n2,hazMatrix,F_cuminc,F_cuminc_JacobianList,X0_cuminc,v0_cuminc)
Finally, we plot the estimates of the cumulative incidences $C^1$ and $C^2$ with confidence intervals:
plot(tmss,param$X[2,],type="s",xlim=c(0,4),ylim=c(0,1),xlab="t",ylab="",main="Cumulative incidence") lines(tmss,param$X[2,] + 1.96 * sqrt(param$covariance[2,2,]),type="s",lty=2) lines(tmss,param$X[2,] - 1.96 * sqrt(param$covariance[2,2,]),type="s",lty=2) lines(tmss,param$X[3,],type="s",col=3) lines(tmss,param$X[3,] + 1.96 * sqrt(param$covariance[3,3,]),type="s",lty=2,col=3) lines(tmss,param$X[3,] - 1.96 * sqrt(param$covariance[3,3,]),type="s",lty=2,col=3) legend("topleft",c(expression(C^1),expression(C^2)),lty=1,col=c(1,3),bty="n")
By re-specifying the ODE system we can compare two groups. We illustrate this in the restricted mean survival example. Consider the system
\begin{equation} \begin{pmatrix} R_t^1 \ R^2 \ RD \ S^1 \ S^2 \end{pmatrix} = \begin{pmatrix} 0 \ 0 \ 0 \ 1 \ 1 \end{pmatrix} + \int_0^t \begin{pmatrix} S^1_{s} & 0 & 0 \ S^2_{s} & 0 & 0 \ S^1_{s} - S^2_{s} & 0 & 0 \ 0 & -S_s^1 & 0 \ 0 & 0 & -S_s^2 \end{pmatrix}d \begin{pmatrix}s \ A_s^1 \ A_s^2 \end{pmatrix},\end{equation}
where $S^i$ is the survival function in group $i$. We specify matrix-valued functions and initial values as before:
F_restrict_diff <- function(X)matrix(c(X[4],0,0, X[5],0,0, X[4]-X[5],0,0, 0,-X[4],0, 0,0,-X[5]),nrow=5,byrow=T) F_restrict_diff_JacobianList <- list(function(X)matrix(c(0,0,0,1,0, 0,0,0,0,1, 0,0,0,1,-1, rep(0,10)),nrow=5,byrow=T), function(X)matrix(c(rep(0,18),-1,rep(0,6)),nrow=5,byrow=T), function(X)matrix(c(rep(0,24),-1),nrow=5,byrow=T)) X0_restrict_diff <- matrix(c(0,0,0,1,1),nrow=5) V0_restrict_diff <- matrix(0,nrow=5,ncol=5)
We compare RMS for males and females in the flchain data set in the survival package. Specifying hazmatrix:
aa_male = aalen(Surv(futime,death==1)~1, data = flchain[flchain$sex=="M",]) aa_female = aalen(Surv(futime,death==1)~1, data = flchain[flchain$sex=="F",]) tms <- sort(unique(c(fineTimes,aa_male$cum[,1], aa_female$cum[,1]))) mt_male = match(aa_male$cum[,1],tms) mt_female = match(aa_female$cum[,1],tms) hazMatrix <- matrix(0,nrow=3,ncol=length(tms)) hazMatrix[1,] <- diff(c(0,tms)) hazMatrix[2,mt_male] <- diff(c(0,aa_male$cum[,2])) hazMatrix[3,mt_female] <- diff(c(0,aa_female$cum[,2])) plugEst <- pluginEstimate(1, hazMatrix, F_restrict_diff, F_restrict_diff_JacobianList,X0_restrict_diff,V0_restrict_diff,isLebesgue = 1) param = list(plugEst=plugEst, tms=tms)
Plotting $\hat{RD} = \hat R^1 - \hat R^2$ with 95\% confidence intervals:
ylims = c(1.1*min(param$plugEst$X[3,] - 1.96 * sqrt(param$plugEst$covariance[3,3,])), 1.1*max(param$plugEst$X[3,] + 1.96 * sqrt(param$plugEst$covariance[3,3,]))) plot(param$tms,param$plugEst$X[3,],type="s",ylim=ylims,ylab="",main="RMS difference",xlab="years") lines(param$tms,param$plugEst$X[3,] + 1.96 * sqrt(param$plugEst$covariance[3,3,]),type="s",lty=2) lines(param$tms,param$plugEst$X[3,] - 1.96 * sqrt(param$plugEst$covariance[3,3,]),type="s",lty=2) abline(a=0,b=0,col=2)
Consider a similar example with differences of cumulative incidence functions
\begin{equation} \begin{pmatrix} S^a_t \ S_t^b \ C_t^{a,1} \ C_t^{a,2} \ C_t^{b,1} \ C_t^{b,2} \ CD_t^1 \ CD_t^2 \end{pmatrix} = \begin{pmatrix} 1 \ 1 \ 0 \ 0 \ 0 \ 0 \ 0 \ 0 \end{pmatrix} + \int_0^t \begin{pmatrix} -S_s^a & -S_s^a & 0 & 0 \ 0 & 0 & -S_s^b & -S_s^b \ S_s^a & 0 & 0 & 0 \ 0 & S_s^a & 0 & 0 \ 0 & 0 & S_s^b & 0 \ 0 & 0 & 0 & S_s^b \ S_s^a & 0 & -S_s^b & 0 \ 0 & S_s^a & 0 & -S_s^b \end{pmatrix} d \begin{pmatrix} A^{a,1}_s \ A^{a,2}_s \ A^{b,1}_s \ A^{b,2}_s \end{pmatrix}, \end{equation} where $S^a$ is the survival in group $a$ and $S^b$ is the survival in group $b$, and $C^{a,j}$ is the cumulative incidence for cause $j$ in group $a$ and similarly for group $b$. $CD^i = C^{a,i} - C^{b,i}$ is the cumulative incidence difference for cause $i$. We specify the ODE system.
F_cuminc_diff <- function(X)matrix( c(-X[1],-X[1],0,0, 0,0,-X[2],-X[2], X[1],0,0,0, 0,X[1],0,0, 0,0,X[2],0, 0,0,0,X[2], X[1],0,-X[2],0, 0,X[1],0,-X[2]) ,nrow=8,byrow=T) F_cuminc_diff_JacobianList <- list(function(X)matrix(c(-1, rep(0,7), rep(0,8), 1, rep(0,7), rep(0,8), rep(0,8), rep(0,8), 1, rep(0,7), rep(0,8)),nrow=8,byrow=T), function(X)matrix(c(-1, rep(0,7), rep(0,8), rep(0,8), 1, rep(0,7), rep(0,8), rep(0,8), rep(0,8), 1, rep(0,7)),nrow=8,byrow=T), function(X)matrix(c(rep(0,8), 0,-1, rep(0,6), rep(0,8), rep(0,8), 0,1, rep(0,6), rep(0,8), 0,-1, rep(0,6), rep(0,8)),nrow=8,byrow=T), function(X)matrix(c(rep(0,8), 0,-1, rep(0,6), rep(0,8), rep(0,8), rep(0,8), 0,1, rep(0,6), rep(0,8), 0,-1, rep(0,6)),nrow=8,byrow=T)) X0_cuminc_diff <- matrix(c(1,1,0,0,0,0,0,0),nrow=8) v0_cuminc_diff <- matrix(0,nrow=8,ncol=8)
We inspect cumulative incidences of the competing events PCM, and death without malignancy, in the mgus2 data set. We compare males and females in this data set. Specifying hazmatrix:
mgus2$etime <- with(mgus2, ifelse(pstat==0, futime, ptime)) event <- with(mgus2, ifelse(pstat==0, 2*death, 1)) mgus2$event <- factor(event, 0:2, labels=c("censor", "pcm", "death")) fr_M <- mgus2[mgus2$sex == "M",] fr_M$time <- fr_M$etime/12 fr_F <- mgus2[mgus2$sex == "F",] fr_F$time <- fr_F$etime/12 fit_PCM_M = aalen(Surv(time,event=="pcm")~1,data=fr_M) fit_death_M = aalen(Surv(time,event=="death")~1,data=fr_M) fit_PCM_F = aalen(Surv(time,event=="pcm")~1,data=fr_F) fit_death_F = aalen(Surv(time,event=="death")~1,data=fr_F) tms2 = sort(unique(c(fit_PCM_M$cum[,1],fit_PCM_F$cum[,1], fit_death_M$cum[,1],fit_death_F$cum[,1]))) dA_PCM_M = dA_death_M = dA_PCM_F = dA_death_F = rep(0, length(tms2)) dA_PCM_M[match(fit_PCM_M$cum[,1], tms2)] = diff(c(0,fit_PCM_M$cum[,2])) dA_PCM_F[match(fit_PCM_F$cum[,1], tms2)] = diff(c(0,fit_PCM_F$cum[,2])) dA_death_M[match(fit_death_M$cum[,1], tms2)] = diff(c(0,fit_death_M$cum[,2])) dA_death_F[match(fit_death_F$cum[,1], tms2)] = diff(c(0,fit_death_F$cum[,2])) hazMatrix <- matrix(0,nrow=4,ncol=length(tms2)) hazMatrix[1,] <- dA_PCM_M hazMatrix[2,] <- dA_death_M hazMatrix[3,] <- dA_PCM_F hazMatrix[4,] <- dA_death_F plugEst <- pluginEstimate(1, hazMatrix, F_cuminc_diff, F_cuminc_diff_JacobianList,X0_cuminc_diff,v0_cuminc_diff) param = list(plugEst=plugEst, tms=tms2)
Plotting $\hat{CD}^1$ and $\hat{CD}^2$ with 95\% confidence intervals:
# Colors male_color <- "red" female_color <- "blue" diff_color <- "gray" par(mfrow=c(1,2)) # Plot with colors and labels plot(param$tms, param$plugEst$X[3,],type="s",ylim=c(-0.2,0.9),main="Cum.inc. PCM", ylab="",xlab="Years", col=male_color) lines(param$tms, param$plugEst$X[3,] + 1.96 * sqrt(param$plugEst$covariance[3,3,]),type="s",lty=2, col=male_color) lines(param$tms, param$plugEst$X[3,] - 1.96 * sqrt(param$plugEst$covariance[3,3,]),type="s",lty=2, col=male_color) lines(param$tms, param$plugEst$X[5,],type="s", col=female_color) lines(param$tms, param$plugEst$X[5,] + 1.96 * sqrt(param$plugEst$covariance[5,5,]),type="s",lty=2, col=female_color) lines(param$tms, param$plugEst$X[5,] - 1.96 * sqrt(param$plugEst$covariance[5,5,]),type="s",lty=2, col=female_color) lines(param$tms, param$plugEst$X[7,],type="s", col=diff_color) lines(param$tms, param$plugEst$X[7,] + 1.96 * sqrt(param$plugEst$covariance[7,7,]),type="s",lty=2, col=diff_color) lines(param$tms, param$plugEst$X[7,] - 1.96 * sqrt(param$plugEst$covariance[7,7,]),type="s",lty=2, col=diff_color) abline(a=0,b=0,col=2) # Add legend legend("topleft", legend = c("Males", "Females", "Difference"), col = c(male_color, female_color, diff_color), lty = 1,bty="n") # Plot with colors and labels plot(param$tms, param$plugEst$X[4,],type="s",ylim=c(-0.2,0.9),main="Cum.inc. death", ylab="",xlab="Years", col=male_color) lines(param$tms, param$plugEst$X[4,] + 1.96 * sqrt(param$plugEst$covariance[4,4,]),type="s",lty=2, col=male_color) lines(param$tms, param$plugEst$X[4,] - 1.96 * sqrt(param$plugEst$covariance[4,4,]),type="s",lty=2, col=male_color) lines(param$tms, param$plugEst$X[6,],type="s", col=female_color) lines(param$tms, param$plugEst$X[6,] + 1.96 * sqrt(param$plugEst$covariance[6,6,]),type="s",lty=2, col=female_color) lines(param$tms, param$plugEst$X[6,] - 1.96 * sqrt(param$plugEst$covariance[6,6,]),type="s",lty=2, col=female_color) lines(param$tms, param$plugEst$X[8,],type="s", col=diff_color) lines(param$tms, param$plugEst$X[8,] + 1.96 * sqrt(param$plugEst$covariance[8,8,]),type="s",lty=2, col=diff_color) lines(param$tms, param$plugEst$X[8,] - 1.96 * sqrt(param$plugEst$covariance[8,8,]),type="s",lty=2, col=diff_color) abline(a=0,b=0,col=2) # Add legend legend("topleft", legend = c("Males", "Females", "Difference"), col = c(male_color, female_color, diff_color), lty = 1,bty="n")
[^fn1]: [Transforming cumulative hazard estimates] (Biometrika, 2018). [^fn2]: [On null hypothesis in survival analysis] (Biometrics, 2019)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.