The goal of PiecewiseChangepoint is to estimate the number and locations of change-points in piecewise exponential models.
You can install the current version of PiecewiseChangepoint from GitHub with:
devtools::install_github("Anon19820/PiecewiseChangepoint")
In order to run some of the functions, JAGS and Stan are required, along with RTools.
First we load the package and simulate some piecewise exponential data.
library("PiecewiseChangepoint")
## simulated example
set.seed(123)
n_obs <- 300
n_events_req <- 300
max_time <- 24 # months
rate <- c(0.75, 0.25) / 12 # we want to report on months
t_change <- 12 # change-point at 12 months
df <- gen_piece_df(n_obs = n_obs, n_events_req = n_events_req,
                   num.breaks = length(t_change), rate = rate,
                   t_change = t_change, max_time = max_time)
The first rows of this dataframe are shown below:
## time_event status time
## 24 0.09194727 1 0.09194727
## 193 0.23141129 1 0.23141129
## 87 0.24251702 1 0.24251702
## 126 0.25450622 1 0.25450622
## 297 0.28833655 1 0.28833655
## 139 0.32615105 1 0.32615105
For this simulated dataset, time_event represents the time at which the event would occur in the absence of censoring, while time is the minimum of the censoring time and the event time. status is an indicator variable equal to 1 if the event occurred at the corresponding time and 0 if it was censored. Plotting the survival function, we see a potential change in the hazard at around year 1.
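The fit object used in the survival plots below is not created in the code above; a minimal sketch of its definition, assuming the survival and survminer packages are available:

library("survival")
library("survminer")
# Kaplan-Meier estimate of the simulated data (assumed definition of fit)
fit <- survfit(Surv(time, status) ~ 1, data = df)
ggsurvplot(fit, palette = "#2E9FDF") # plot of the survival function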
As noted in Bagust and Beale (2014), constant hazards are linear with respect to the cumulative hazard function; therefore, the change in hazards at approximately 12 months can be seen more clearly in the cumulative hazard plot below.
ggsurvplot(fit, palette = "#2E9FDF", fun = "cumhaz")
Next we fit the model, noting that only the time and status columns are required. The timescale argument changes the prior for the hazards $\lambda$ so that it is appropriate for the timescale. For example, if the timescale is years, then a vague prior centred around 1 is appropriate (i.e. approximately $63\%$ of the population having the event each year, since $1-\exp(-1) \approx 0.63$), while if the timescale is in months the equivalent prior should have an expected value of 1/12 (and 1/365 for days).
Collapsing_Model <- collapsing.model(df,
                                     n.iter = 20750,
                                     burn_in = 750,
                                     n.chains = 2,
                                     timescale = "months")
As we would expect the one change-point model has the highest posterior probability.
print(Collapsing_Model)
## Posterior Change-point Probabilities:
## 0 1 2 3 4 5
## 0.000575 0.809525 0.166000 0.020650 0.002850 0.000400
##
## Summary of 1 change-point model:
##
## changepoint_1 lambda_1 lambda_2
## Min. : 1.873 Min. :0.04206 Min. :0.01170
## 1st Qu.:11.701 1st Qu.:0.05614 1st Qu.:0.02355
## Median :11.900 Median :0.05940 Median :0.02652
## Mean :11.857 Mean :0.05954 Mean :0.02682
## 3rd Qu.:12.603 3rd Qu.:0.06277 3rd Qu.:0.02983
## Max. :19.457 Max. :0.08456 Max. :0.04832
Simulations from the posterior distribution of the change-point locations and associated hazards can be extracted from the returned object as shown below.
Collapsing_Model$changepoint
Collapsing_Model$lambda
Once we are satisfied that there is good mixing and that we have run the model for long enough (20,000 simulations over 2 chains should be more than enough), we may want to look at a plot of the survivor function. In health economics we are typically interested in the long-term survival implied by our parametric models. In this situation we want a plot of the first 5 years, which we can produce using the max_predict argument (in this case 60 months). The red lines show the individual posterior simulations and are a natural representation of the parameter uncertainty.
library("ggplot2")
plot(Collapsing_Model, max_predict = 60, chng.num = 1) + xlab("Time (Months)")
Similarly we may also want to look at the hazard function. In this situation we only present the hazard up to the maximum time observed in the data. This is because by definition the hazard from the final interval will be the one which is extrapolated throughout the time horizon.
plot(Collapsing_Model, type = "hazard") + xlab("Time (Months)") + ylab("Hazards") + ylim(c(0, 0.1))
## Scale for y is already present.
## Adding another scale for y, which will replace the existing scale.
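The message above appears because the plot method already sets a y scale; a way to zoom without replacing it (a suggestion, not package-specific behaviour) is coord_cartesian:

plot(Collapsing_Model, type = "hazard") + xlab("Time (Months)") + ylab("Hazards") +
  coord_cartesian(ylim = c(0, 0.1)) # zooms the axis without adding a new scale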
By default the plot methods described above use all the posterior simulations. If, for example, we were only interested in the 2 change-point model, we can specify this using the chng.num argument. The green points indicate the mean location of the change-points. When plotting “all” of the simulations there is no sensible mean location of the change-points, as the simulations contain different numbers of change-points.
plot(Collapsing_Model, max_predict = 60, chng.num = 2) + xlab("Time (Months)")
In practical health economic modelling, we require evaluation of the survival function. Assuming time $t$ falls within the $j^{\text{th}}$ interval, the cumulative hazard for this interval is $\lambda_j (t - \tau_{j-1})$, with $\tau_{j-1}$ being the $(j-1)^{\text{th}}$ change-point and $\lambda_j$ the $j^{\text{th}}$ hazard. We also require the cumulative hazard for all the previous intervals, which is $\sum_{g=1}^{j-1} \lambda_g(\tau_g - \tau_{g-1})$. The survival probability is the exponential of the negative of the total cumulative hazard, $S(t) = \exp(-H(t))$, written in full as:
$$S(t) = \exp\left(-\left[\lambda_j (t - \tau_{j-1}) + \sum_{g=1}^{j-1} \lambda_g(\tau_g - \tau_{g-1})\right]\right).$$
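A minimal sketch of this calculation for a single posterior draw (piecewise_surv is a hypothetical helper written for illustration, not a package function):

# Survival at time t given hazards lambda and interior change-points tau
piecewise_surv <- function(t, lambda, tau) {
  breaks <- c(0, tau)                    # interval start points
  j <- findInterval(t, breaks)           # interval containing t
  widths <- diff(breaks)                 # widths of the completed intervals
  H_prev <- sum(lambda[seq_len(j - 1)] * widths[seq_len(j - 1)])
  H_curr <- lambda[j] * (t - breaks[j])  # contribution of the current interval
  exp(-(H_prev + H_curr))                # S(t) = exp(-H(t))
}
piecewise_surv(18, lambda = c(0.06, 0.027), tau = 12) # survival at 18 months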
It is programmatically straightforward to calculate this, and the function get_Surv evaluates the survival at a vector of user-specified times. The user can supply an additional argument chng.num if they require survival probabilities from a particular change-point model.
St_all <- get_Surv(Collapsing_Model, time = 0:10)
St_1 <- get_Surv(Collapsing_Model, time = 0:10, chng.num = 1) # 1 change-point model only
In health economics we are typically interested in choosing between a number of alternative parametric models, although it is also possible to combine all models using model averaging (Jackson, Sharples, and Thompson 2010). Model fit statistics provide an assessment of fit to the observed data, although they do not guarantee that the best-fitting model will be appropriate for extrapolation. Nevertheless, we can compare our fitted model with six commonly used parametric models along with Royston-Parmar spline models. We fit the models using JAGS (Plummer 2003) and Stan (Stan Development Team 2023) and compare the model fit using the Widely Applicable Information Criterion (WAIC) (Watanabe 2010).
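For reference, WAIC on the deviance scale (lower values indicate better fit, as reported in the comparison table below) can be written as $$\text{WAIC} = -2\left(\text{lppd} - p_{\text{WAIC}}\right),$$ where lppd is the log pointwise predictive density and $p_{\text{WAIC}}$ is the effective number of parameters.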
Including General Population Mortality (GPM) is required to ensure that the extrapolated hazards are consistent with the increasing hazards associated with advanced ageing. Adjustment for GPM is typically done within the cost-effectiveness model; however, we can include it directly at the analysis stage so that we can see its impact on the extrapolated survival.
In this example we consider GPM from a UK data source which provides mortality rates, defined as “the probability that a person aged exactly $x$ will die before reaching age $x+1$”. This data source therefore provides the conditional probability of death within a year at each age.
Assuming our population is $50\%$ male and $50\%$ female and the age at baseline is 55 years, we have the following conditional probabilities of death at each age:
library("openxlsx") # for read.xlsx
library("dplyr")

age_baseline_example <- 55 # age at baseline
prop_male <- 0.5           # proportion of males
time_horizon <- 100        # maximum age in years

Conditional_Death_df <- read.xlsx("Examples/Conditional_Death_UK.xlsx", 1) %>%
  filter(age >= age_baseline_example)
head(Conditional_Death_df)
## age Males..2018.2020. Females.2018.2020
## 1 55 0.005046 0.003283
## 2 56 0.005593 0.003637
## 3 57 0.006060 0.003928
## 4 58 0.006695 0.004367
## 5 59 0.007239 0.004707
## 6 60 0.007912 0.005247
Our timescale is months, so we need to convert this annual probability to a monthly rate, which is done using the following formula (assuming a constant rate of mortality) (Fleurence and Hollenbeak 2007):
$$r = -\frac{1}{t}\ln(1-p).$$ Because there are 12 months in a year, $t = 12$, and $p$ is the timescale-specific (in our case annual) probability of death. With the below R code we now have the monthly rate of death for ages 55 (our assumed starting age of the cohort) up to 100 years of age, adjusted for the distribution of males and females.
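As a quick check of this conversion (simple arithmetic using the male probability at age 55 from the table above):

r <- -log(1 - 0.005046) / 12 # monthly rate from the annual probability
1 - exp(-12 * r)             # recovers the annual probability 0.005046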
time_factor <- 12 # months in a year
df_temp <- Conditional_Death_df
# Weight the male and female death probabilities by the sex distribution
df_temp[, "mix_prob"] <- df_temp[, 2] * prop_male + df_temp[, 3] * (1 - prop_male)
df_temp <- df_temp %>% filter(age >= age_baseline_example & age <= time_horizon)
# Convert the annual probability to a monthly rate: r = -ln(1 - p)/t
df_temp$mix_haz <- -log(1 - df_temp$mix_prob) / time_factor
# Repeat each annual rate for the 12 months of that year
gmp_haz_vec_example <- rep(df_temp$mix_haz, each = time_factor)
# We now have the hazard at each (monthly) timepoint
gmp_haz_df_example <- data.frame(time = 1:length(gmp_haz_vec_example),
                                 hazard = gmp_haz_vec_example)
Within the compare.surv.mods function the cumulative hazard of death (associated with GPM) and the cumulative hazard of an event (from the parametric model) are added to obtain the overall cumulative hazard $H(t)$: the sum of the individual hazards (in the case of discrete hazards as above) plus the integral of the parametric hazards. As noted in the previous section, survival probabilities are obtained through the relation $S(t) = \exp(-H(t))$. By default the compare.surv.mods function only applies GPM hazards after the end of follow-up, as we observe survival from all causes up until then (although GPM hazards can be applied from the start of follow-up by setting gpm_post_data = FALSE).
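A rough sketch of this combination, reusing the piecewise_surv helper from above with illustrative values close to the posterior means (not the package's internal code):

t_grid <- gmp_haz_df_example$time          # monthly time grid
H_gpm <- cumsum(gmp_haz_df_example$hazard) # discrete GPM cumulative hazard
lambda_hat <- c(0.06, 0.027)               # illustrative hazards
tau_hat <- 12                              # illustrative change-point
H_mod <- -log(sapply(t_grid, piecewise_surv, lambda = lambda_hat, tau = tau_hat))
S_all <- exp(-(H_mod + H_gpm))             # all-cause survival probabilities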
We see in the plot below that including the GPM hazard ensures that the extrapolated hazard exhibits the characteristic increase associated with ageing.
Extrapolated hazards from the Piecewise Exponential model including GPM
The fitting of other parametric models is accomplished by the compare.surv.mods function, and general population mortality is adjusted for by including a gmp_haz_df dataset as described above. Fitted models include the Exponential, Weibull, Gamma, Gompertz, Generalized Gamma, Log-Logistic, Log-Normal and Royston-Parmar spline models (see the WAIC table below).
Model fit to the observed data and a plot of the extrapolated survival are available from within the mod_comp object, along with the posterior samples from all of the fitted models.
# This can take a number of minutes
set.seed(123)
mod_comp <- compare.surv.mods(Collapsing_Model,
                              max_predict = 100,    # 100 months
                              n.iter.jags = 5000,   # run JAGS/Stan for 5000 samples
                              n.thin.jags = 1,
                              n.burnin.jags = 500,
                              chng.num = 1,         # use results from the 1 change-point PEM
                              gmp_haz_df = gmp_haz_df_example) # GPM dataset
# Returns a dataframe with the model fit results, ordered by WAIC
mod_comp$mod.comp[, c(1, 3)] %>% arrange(WAIC)
## Model WAIC
## 1 Piecewise Exponential 1547.599
## 2 Log-Normal 1552.492
## 3 Log-Logistic 1553.213
## 4 Gompertz 1553.266
## 5 Royston-Parmar 2 knot 1553.743
## 6 Generalized Gamma 1556.875
## 7 Weibull 1561.850
## 8 Gamma 1564.098
## 9 Exponential 1568.053
mod_comp$plot_Surv_all
Extrapolated Survival Probability
Because this is simulated data we know the actual event times, and for the purpose of illustration we plot them so that we can compare how the predictions match the data. As expected, the piecewise exponential model provides an accurate fit to the data across the time horizon.
# We have the actual event times contained within df
df_true <- df
df_true$time <- df_true$time_event
df_true$status <- 1
# Administratively censor the true event times at time 10 for plotting
df_true <- df_true %>% mutate(time = ifelse(time > 10, 10, time))
df_true <- df_true %>% mutate(status = ifelse(time >= 10, 0, status))

# Overlay the Kaplan-Meier estimate of the true event times
add_km(mod_comp$plot_Surv_all, df_true, colour = "black")
One criticism of methods which employ the marginal likelihood for model selection is that the marginal likelihood is sensitive to the prior. This is in contrast to the posterior distribution of the parameters, which, in the presence of sufficient data, will be dominated by the likelihood rather than by a (suitably vague) prior.
The prior on the hazard is $\lambda \sim \mathcal{G}(\alpha, \beta)$, where $\alpha$ and $\beta$ are the shape and rate of the Gamma distribution. To improve the robustness of the results and incorporate the uncertainty in this parameter, we can introduce a hyperprior on $\beta$, which is also assumed to follow a gamma distribution, $\beta \sim \mathcal{G}(\xi, \delta)$.
However, to ensure that there is no discrepancy between model selection using two different timescales, we need to take care when selecting $\alpha, \xi \text{ and } \delta$. We strongly recommend keeping $\alpha$ and $\xi$ at their default values of 1, which results in an exponential prior distribution for both $\lambda$ and $\beta$; the only value which then needs to change is $\delta$. When changing timescales from years to days, the rate $\delta$ of the gamma prior for $\beta$ needs to be scaled by 1/365 (365 being the number of days in a year). Owing to the properties of the gamma distribution, the equivalent scaled distribution is a $\mathcal{G}(1, 1/365)$. When this prior is used we obtain very similar (differences due to Monte-Carlo error) posterior change-point probabilities. Although we suggest that $\xi$ (and $\alpha$) should be set to 1, we see from the plot below that a $\mathcal{G}(1,1)$ distribution is similar to a $\mathcal{G}(2,2)$ distribution: both have an expected value of 1, with variances 1 and 0.5 respectively, and will give very similar inferences. However, both distributions are clearly different from the $\mathcal{G}(0.2,0.2)$ distribution. Even in the presence of these different priors, the posterior probability of the one change-point model is $\approx 70\%$ for the simulated dataset introduced in the first section.
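A quick numerical check of this scaling property (a sketch; shape-rate parameterization as used by rgamma):

set.seed(1)
beta_yearly <- rgamma(1e5, shape = 1, rate = 1)
beta_daily <- beta_yearly * 365 # multiplying a gamma variate by c divides its rate by c
c(mean(beta_daily), mean(rgamma(1e5, shape = 1, rate = 1 / 365))) # both approximately 365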
These choices are automatically implemented in the collapsing.model function by specifying the timescale as days, months or years.