library(knitr) opts_chunk$set( fig.align = "center", cache = TRUE, message = FALSE, fig.height = 5, fig.width = 5, dpi = 300 ) library(ggplot2) theme_set(theme_bw())
In this vignette, we show examples of how to estimate and visualize transition probabilities for multi-state data. First, we show that a baseline PAMMs model returns equal probabilities as the empirical transition probabilities stemming from the Aalen-Johannsen estimator. Second, we show that the estimated baseline model in a multi-state setting with backtransitions using PAMMs is equal to a Cox proportional hazard model. Third, we illustrate the application of PAMMs in the analysis of the non-linear effect of hemoglobin on the probabilities to transition to a plasma cell malignancy (PCM) and / or death.
The data transformation required to fit PAMMs to multi-state data is similar to the transformation in the single event case (see the data transformation vignette for details). In fact, internally the standard transformation is applied to each event type using as_ped
, however, some choices have to be made
prothr
Data - Time to abnormal prothrombin levels in liver cirrhosisFor illustration, we use the prothr
data set from the mstate
package.
- 488 liver cirrhosis patients
- endpoints: status
= 0/1 for censoring, event
- events of interest: status
= 1 and to
= 2/3 for abnormal prothrombin level (patient can transition between normal and abnormal prothrombin level back and forth), death, tstart
, tend
as time-to-event
- variable of interest treat
: A patient's treatment (Placebo, Prednisone)
library(survival) library(mgcv) library(ggplot2) library(pammtools) library(purrr) library(mstate) library(checkmate) library(dplyr) data(prothr, package = "mstate") prothr |> filter(id == 46) |> knitr::kable() # example patients
In general, one has to follow three steps to derive transition probabilities from multi-state survival data.
- First, we need to transform the survival data prothr
into piecewise exponential data.
- Second, we need to estimate the log hazard structure using PAM objects.
- Third, we need to post-process the data to include all relevant objects of interest in our data set.
Transforming the survival data prothr
into piecewise exponential data, we can use
# load function as it is not yet included in the package # devtools::load_all("C:/Users/ra63liw/Documents/98_git/pammtools-multi-state/pammtools") # devtools::install("C:/Users/ra63liw/Documents/98_git/pammtools-multi-state/pammtools") # source("C:/Users/ra63liw/Documents/98_git/pammtools-multi-state/pammtools/R/add-functions.R") # source("C:/Users/ra63liw/Documents/98_git/pammtools-multi-state/pammtools/tmp/add_transition_probabilities.R") library(checkmate)
# not necessary, prothr already contains all possible transitions # my.prothr <- prothr |> add_counterfactual_transitions() # add possible transitions data("prothr", package = "mstate") prothr <- prothr |> mutate(transition = as.factor(paste0(from, "->", to)) , treat = as.factor(treat)) |> rename(tstart = Tstart, tstop = Tstop) |> filter(tstart != tstop) |> select(-trans) ped <- as_ped( data = prothr, formula = Surv(tstart, tstop, status)~ ., transition = "transition", id = "id", timescale = "calendar", tdc_specials="concurrent" )
where add_counterfactual_transitions
is a helper function, which adds all possible transitions at each point in time.
Estimating the log hazard structure using PAM objects, we can use
# pam <- pamm(ped_status ~ s(tend, by=transition) + transition * treat, data = ped) pam <- bam(ped_status ~ s(tend, by=transition) + transition * treat , data = ped , family = poisson() , offset = offset , method = "fREML" , discrete = TRUE) summary(pam)
Post-processing the data to include all relevant objects of interest in our data set, we can use
ndf <- make_newdata(ped, tend = unique(tend), treat = unique(treat), transition = unique(transition)) ndf <- ndf |> group_by(treat, transition) |> # important! arrange(treat, transition, tend) |> add_trans_prob(pam, ci=TRUE) ndf <- ndf |> group_by(treat, transition) |> # important! add_cumu_hazard(pam, overwrite = T)
where make_newdata
creates a data set containing all covariates and all their combinations from the PAM object. The convenience function add_trans_prob
crates a new column trans_prob
, which can be visualized.
# visualization ggplot(ndf, aes(x=tend)) + geom_line(aes(y=trans_prob, col=treat)) + geom_ribbon(aes(ymin = trans_lower, ymax = trans_upper, fill=treat), alpha = .3) + scale_color_manual(values = c("firebrick2" , "steelblue") )+ scale_fill_manual(values = c("firebrick2" , "steelblue") )+ facet_wrap(~transition) + xlim(c(0, 4000)) + ylim(c(0,1))+ labs(y = "Transition Probability", x = "time", color = "Treatment", fill= "Treatment")
Comparing the pammtools
results with the mstate
results, we want to validate that the baselines are indeed correct. The following code shows the comparison between the mstate
.
First, we compare the cumulative hazards
# pammtools ndf <- ndf |> mutate(package = "pammtools")
library(mstate) library(msm) library(mvna) library(etm) # code from mstate documentation data(prothr, package = "mstate") tmat <- attr(prothr, "trans") pr0 <- subset(prothr, treat=="Placebo") attr(pr0, "trans") <- tmat pr1 <- subset(prothr, treat=="Prednisone") attr(pr1, "trans") <- tmat c0 <- coxph(Surv(Tstart, Tstop, status) ~ strata(trans), data=pr0) c1 <- coxph(Surv(Tstart, Tstop, status) ~ strata(trans), data=pr1) msf0 <- msfit(c0, trans=tmat) msf1 <- msfit(c1, trans=tmat) # plot hazards mstate_dat0 <- msf0$Haz %>% mutate(transition = case_when( trans == 1 ~ "1->2", trans == 2 ~ "1->3", trans == 3 ~ "2->1", trans == 4 ~ "2->3", .default = "cens" ) , treat = "Placebo") mstate_dat1 <- msf1$Haz %>% mutate(transition = case_when( trans == 1 ~ "1->2", trans == 2 ~ "1->3", trans == 3 ~ "2->1", trans == 4 ~ "2->3", .default = "cens" ) , treat = "Prednisone") mstate_dat <- rbind(mstate_dat0, mstate_dat1) long_mstate <- mstate_dat %>% rename(tend = time, cumu_hazard = Haz) %>% mutate(package = "mstate") %>% select(tend, treat, transition, cumu_hazard, package) long_msm <- ndf %>% mutate(package = "pammtools") %>% select(tend, treat, transition, cumu_hazard, package) long_haz_df <- rbind(long_mstate, long_msm) %>% mutate(transition = case_when(transition == "1->2" ~ "0->1" , transition == "1->3" ~ "0->2" , transition == "2->1" ~ "1->0" , transition == "2->3" ~ "1->2") ) # compare transition probabilities alpha <- 0.05 pt0 <- probtrans(msf0, predt=0) # changed predt from 1000 to 0 pt1 <- probtrans(msf1, predt=0) # changed predt from 1000 to 0 pt0_df_12 <- data.frame(cbind(tend = pt0[[1]]$time , trans_prob = pt0[[1]]$pstate2 , trans_lower = pt0[[1]]$pstate2 - pt0[[1]]$se2 * qnorm(1 - alpha / 2) , trans_upper = pt0[[1]]$pstate2 + pt0[[1]]$se2 * qnorm(1 - alpha / 2)) ) %>% mutate(transition = "1->2") pt0_df_13 <- data.frame(cbind(tend = pt0[[1]]$time , trans_prob = pt0[[1]]$pstate3 , trans_lower = pt0[[1]]$pstate3 - pt0[[1]]$se3 * qnorm(1 - alpha / 2) , trans_upper = pt0[[1]]$pstate3 + pt0[[1]]$se3 * qnorm(1 - alpha / 2) )) %>% mutate(transition = "1->3") pt0_df_21 <- data.frame(cbind(tend = pt0[[2]]$time , trans_prob = pt0[[2]]$pstate1 , trans_lower = pt0[[2]]$pstate1 - pt0[[2]]$se1 * qnorm(1 - alpha / 2) , trans_upper = pt0[[2]]$pstate1 + pt0[[2]]$se1 * qnorm(1 - alpha / 2) )) %>% mutate(transition = "2->1") pt0_df_23 <- data.frame(cbind(tend = pt0[[2]]$time , trans_prob = pt0[[2]]$pstate3 , trans_lower = pt0[[2]]$pstate3 - pt0[[2]]$se3 * qnorm(1 - alpha / 2) , trans_upper = pt0[[2]]$pstate3 + pt0[[2]]$se3 * qnorm(1 - alpha / 2) )) %>% mutate(transition = "2->3") pt0_df <- rbind(pt0_df_12, pt0_df_13, pt0_df_21, pt0_df_23) %>% mutate(treat = "Placebo", package = "mstate") pt1_df_12 <- data.frame(cbind(tend = pt1[[1]]$time , trans_prob = pt1[[1]]$pstate2 , trans_lower = pt1[[1]]$pstate2 - pt1[[1]]$se2 * qnorm(1 - alpha / 2) , trans_upper = pt1[[1]]$pstate2 + pt1[[1]]$se2 * qnorm(1 - alpha / 2)) ) %>% mutate(transition = "1->2") pt1_df_13 <- data.frame(cbind(tend = pt1[[1]]$time , trans_prob = pt1[[1]]$pstate3 , trans_lower = pt1[[1]]$pstate3 - pt1[[1]]$se3 * qnorm(1 - alpha / 2) , trans_upper = pt1[[1]]$pstate3 + pt1[[1]]$se3 * qnorm(1 - alpha / 2) )) %>% mutate(transition = "1->3") pt1_df_21 <- data.frame(cbind(tend = pt1[[2]]$time , trans_prob = pt1[[2]]$pstate1 , trans_lower = pt1[[2]]$pstate1 - pt1[[2]]$se1 * qnorm(1 - alpha / 2) , trans_upper = pt1[[2]]$pstate1 + pt1[[2]]$se1 * qnorm(1 - alpha / 2) )) %>% mutate(transition = "2->1") pt1_df_23 <- data.frame(cbind(tend = pt1[[2]]$time , trans_prob = pt1[[2]]$pstate3 , trans_lower = pt1[[2]]$pstate3 - pt1[[2]]$se3 * qnorm(1 - alpha / 2) , trans_upper = pt1[[2]]$pstate3 + pt1[[2]]$se3 * qnorm(1 - alpha / 2) )) %>% mutate(transition = "2->3") pt1_df <- rbind(pt1_df_12, pt1_df_13, pt1_df_21, pt1_df_23) %>% mutate(treat = "Prednisone", package = "mstate") pt_df_mstate <- rbind(pt0_df, pt1_df) pt_df_pamm <- ndf %>% mutate(package = "pammtools") %>% select(tend, treat, transition, trans_prob, trans_lower, trans_upper, package) pt_df <- rbind(pt_df_pamm, pt_df_mstate)
# # plot transitions # ggplot(test_msm, aes(x=tend, y=trans_prob)) + # geom_line(aes(col=as.factor(treat))) + # facet_wrap(~transition, ncol = 2, labeller = label_both) + # # scale_color_manual(values = c("#1f78b4", "#1f78b4", "#33a02c", "#33a02c"))+ # # scale_linetype_manual(values = c("solid", "dashed", "solid", "dashed")) + # ylim(c(0,0.8)) + # xlim(c(0, 4000)) + # ylab("Transition Probability") + # xlab("time") + # theme_bw() comparison_nelaal <- ggplot(long_haz_df, aes(x=tend, y=cumu_hazard, col=treat, linetype = package)) + geom_line() + facet_wrap(~transition, ncol = 4, labeller = label_both) + scale_color_manual(values = c("firebrick2" , "steelblue") )+ # scale_linetype_manual(values = c("solid", "dashed", "solid", "dashed")) +18: ylab("Cumulative Hazards") + xlab("time in days") + ylim(c(0,6)) + scale_linetype_manual(values=c("dotted", "solid")) + theme_bw() comparison_nelaal
Second, we compare the transition probabilities, which are calculated using the cumulative hazards
comparison_aaljoh <- ggplot(pt_df, aes(x=tend, y=trans_prob, col=treat, linetype = package)) + geom_line() + # geom_ribbon(aes(ymin = trans_lower, ymax = trans_upper, fill=treat), alpha = .3) + facet_wrap(~transition, ncol = 4, labeller = label_both) + scale_color_manual(values = c("firebrick2" , "steelblue") )+ # scale_linetype_manual(values = c("solid", "dashed", "solid", "dashed")) +18: ylab("Transition Probabilities") + xlab("time in days") + ylim(c(0,1)) + xlim(c(0, 4000)) + scale_linetype_manual(values=c("dotted", "solid")) + theme_bw() comparison_aaljoh
Third, we compare the width of the confidence bands
comparison_aaljoh_ci <- ggplot(pt_df, aes(x=tend, y=trans_prob, col=interaction(package, treat), linetype = package)) + geom_line() + geom_ribbon(aes(ymin = trans_lower, ymax = trans_upper, fill=interaction(package, treat), alpha = package)) + facet_wrap(~treat+transition, ncol = 4, labeller = label_both) + scale_color_manual(values = c("mstate.Placebo" = "grey" ,"mstate.Prednisone" = "grey" , "pammtools.Placebo" = "firebrick2" , "pammtools.Prednisone" = "steelblue") )+ scale_fill_manual(values = c("mstate.Placebo" = "grey" ,"mstate.Prednisone" = "grey" , "pammtools.Placebo" = "firebrick2" , "pammtools.Prednisone" = "steelblue") )+ scale_alpha_manual(values = c("mstate" = 0.3, "pammtools" = 0.05))+ # scale_linetype_manual(values = c("solid", "dashed", "solid", "dashed")) +18: ylab("Transition Probabilities") + xlab("time in days") + ylim(c(0,1)) + xlim(c(0, 4000)) + scale_linetype_manual(values=c("dotted", "solid")) + theme_bw()+ guides(color = "none", fill = "none", alpha = "none", linetype = "none") comparison_aaljoh_ci
comparison_aaljoh_ci <- ggplot(pt_df, aes(x=tend, y=trans_prob, col=interaction(package, treat), linetype = package)) + geom_line(linetype = "solid") + geom_line(aes(y=trans_lower), linetype = "dotted") + geom_line(aes(y=trans_upper), linetype = "dotted") + geom_ribbon(data = subset(pt_df, package == "mstate"), aes(ymin = trans_lower, ymax = trans_upper), fill = "grey", alpha = 0.3) + facet_wrap(~treat+transition, ncol = 4, labeller = label_both) + ylab("Transition Probabilities") + xlab("time in days") + ylim(c(0,1)) + xlim(c(0, 4000)) + scale_color_manual(values = c("mstate.Placebo" = "grey" ,"mstate.Prednisone" = "grey" , "pammtools.Placebo" = "firebrick2" , "pammtools.Prednisone" = "steelblue") )+ theme_bw()+ guides(color = "none", fill = "none", linetype = "none") comparison_aaljoh_ci
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.