knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
library(dplyr) library(ggplot2) library(survival) library(survParamSim) set.seed(12345)
The goal of survParamSim is to perform survival simulation with parametric survival model generated from 'survreg' function in 'survival' package. In each simulation, coefficients are resampled from variance-covariance matrix of parameter estimates, in order to capture uncertainty in model parameters.
survreg()
functionBefore running a parametric survival simulation, you need to fit a model to your data using survreg()
function of survival
package.
In this vignette, we will be using colon
dataset available in survival
package, where the treatment effect of adjuvant Levamisole+5-FU for colon cancer over placebo is evaluated.
First, we load the data and do some data wrangling.
# ref for dataset https://vincentarelbundock.github.io/Rdatasets/doc/survival/colon.html colon2 <- as_tibble(colon) %>% # recurrence only and not including Lev alone arm filter(rx != "Lev", etype == 1) %>% # Same definition as Lin et al, 1994 mutate(rx = factor(rx, levels = c("Obs", "Lev+5FU")), depth = as.numeric(extent <= 2))
Shown below are Kaplan-Meier curves for visually checking the data.
The second plot is looking at how many censoring we have over time.
Looks like we have a fairly uniform censoring between 1800 to 3000 days - this is attributable to a steady clinical study enrollment.
survfit.colon <- survfit(Surv(time, status) ~ rx, data = colon2) survminer::ggsurvplot(survfit.colon) survfit.colon.censor <- survfit(Surv(time, 1-status) ~ rx, data = colon2) survminer::ggsurvplot(survfit.colon.censor)
Next we fit a lognormal parametric model for the data.
Here we are using node4
and depth
as additional covariates in addition to treatment (rx
).
You can see that all of the factor has strong association with the outcome.
fit.colon <- survreg(Surv(time, status) ~ rx + node4 + depth, data = colon2, dist = "lognormal") summary(fit.colon)
surv_param_sim()
is the main function of the package that takes survreg
object as described above.
It also require you to supply newdata
, which is required even if it is not new - i.e. the same data was used for both survreg()
and surv_param_sim()
.
What it does is:
1. Re-sample all the coefficients in the parametric survival model from variance-covariance matrix for n.rep
times.
2. Perform survival time for all subjects in newdata
with the corresponding covariates, using one of the resampled coefficients. Also generate censoring time according to censor.dur
(if not NULL), and replace the simulated survival time above if censoring time is earlier.
4. Repeat the steps 2. for n.rep
times.
sim <- surv_param_sim(object = fit.colon, newdata = colon2, # Simulate censoring according to the plot above censor.dur = c(1800, 3000), # Simulate only 100 times to make the example go fast n.rep = 100)
After the simulation is performed, you can either extract raw simulation results or further calculate Kaplan-Meier estimates or hazard ratio of treatment effect, as you see when you type sim
in the console.
sim
To calculate survival curves for each simulated dataset, calc_km_pi()
can be used on the simulated object above.
km.pi <- calc_km_pi(sim, trt = "rx") km.pi
Similar to the raw simulated object, you can have a few options for further processing - one of them is plotting prediction intervals with plot_km_pi()
function.
plot_km_pi(km.pi) + theme(legend.position = "bottom") + labs(y = "Recurrence free rate") + expand_limits(y = 0)
Or providing median survival summary table with extract_medsurv_pi()
function.
extract_medsurv_pi(km.pi)
Plot can also be made for subgroups.
You can see that prediction interval is wide for (depth: 1 & nodes4: 1) group, mainly due to small number of subjects
km.pi <- calc_km_pi(sim, trt = "rx", group = c("node4", "depth")) plot_km_pi(km.pi) + theme(legend.position = "bottom") + labs(y = "Recurrence free rate") + expand_limits(y = 0)
To calculate prediction intervals of HRs, calc_hr_pi()
can be used on the simulated object above.
Here I only generated subgroups based on "depth", since the very small N in (depth: 1 & nodes4: 1) can cause issue with calculating HRs.
hr.pi <- calc_hr_pi(sim, trt = "rx", group = c("depth")) hr.pi plot_hr_pi(hr.pi)
You can also extract prediction intervals and observed HR with extract_hr_pi()
function.
extract_hr_pi(hr.pi)
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.