Analysing data with psborrow

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

This vignette provides additional details of the analysis model that is fitted as well as a demonstration of how to use the package to fit said models to an existing dataset. For examples of how to fit the analysis model to the simulated datasets generated by the package itself, please see the user-guide vignette.

The "Analysis Model" section of this vignette is a modified excerpt from Lu, Y. et al. (2021).

Analysis Model

The package uses Markov chain Monte Carlo (MCMC) through JAGS to obtain the posterior distribution for the hazard ratio between treatment and control arms. The package relies on the Bayesian commensurate prior approach (Lewis, et al., 2019). The probability density function for the survival time of patient $i$ following a Weibull model is as below:

$$ t_i \sim Weibull( \nu, \lambda_i) $$

for $i = 1, 2, \dots,n$, where:

The probability density function is:

$$ f(t_i, x_i) = \nu\lambda_i t^{\nu-1} e^{-\lambda_it^\nu} $$

with the rate taking the format:

$$ \lambda_i = exp(\alpha_i + \beta_{trt} \times I_{trt} + \beta^T \times X_{i}) $$

Where:

A hierarchical model is defined in which the log hazard rate for the concurrent controls is centered on the external controls, i.e. $\alpha_{cc} | \alpha_{ec} \sim N(\alpha_{ec}, \tau)$. The hyperprior $\tau$, also referred to as the commensurability parameter, determines the extent of borrowing (see Lewis, et al., 2019).

Four borrowing methods have been implemented:

  1. Not borrowing any external control information
  2. Fully incorporating external control arm patients to the concurrent control group
  3. Dynamic borrowing with the commensurability hyperparameter $\tau$ following a Gamma distribution
  4. Dynamic borrowing with the commensurability hyperparameter $\tau$ following a half-Cauchy distribution

For methods (1) and (2), a conventional Bayesian model without dynamic borrowing is deployed with all relevant parameters assigned a non-informative prior. For method (1) no external control arm information is used and as such $\alpha_{cc} = \alpha \sim N(0, 0.0001)$ For method (2) the external control arm is treated as part of the concurrent control and as such $\alpha_{cc} = \alpha_{ec} = \alpha \sim N(0, 0.0001)$.

In comparison, methods (3) and (4) have the commensurability hyper-parameter $\tau$, which acts as the precision variable assessing the similarity between the concurrent and external control groups. The prior of $tau$ follows a Gamma distribution (default: shape = 1, rate = 0.001) and a half-Cauchy distribution (default: location = 0, scale = 0.2), respectively, in methods 3 and 4. Users can update the default values for the parameter if they wish to customize the hyper-prior.

After users state the prior choices and the simulated data is ready for use, the Markov chain Monte Carlo (MCMC) algorithm is deployed to obtain samples from the posterior distribution.

Example of use

To demonstrate how to use the package to fit the analysis model we first generate a dataset where the time to event is drawn from the following distribution:

$$ t_i \sim Weibull(0.95, \lambda_i) $$ Where:

$$ \lambda_i = \frac{1}{150}I_{ec} \times \frac{1}{200}I_{cc} \times \frac{1}{400}I_{trt} \times exp(0.2\times x_{age} - 0.1\times I_{female}) $$

Patients are then censored by drawing a censoring time from the exponential distribution with a rate parameter of $\lambda = 1/400$

suppressPackageStartupMessages({
    library(psborrow)
    library(dplyr)
    library(flexsurv)
})

set.seed(401)

n <- 1000

grp_lvl <- c("CC", "TRT", "EC")
sex_lvl <- c("Male", "Female")

log_hr_grp <- c(
    "TRT" = log(1 / 400),
    "CC" = log(1 / 200),
    "EC" = log(1 / 150)
)

log_hr_sex <- c(
    "Male" = 0,
    "Female" = -0.1
)

log_hr_age <- 0.2

Shape <- 0.95

dat <- tibble(
    pt = 1:n,
    grp = factor(
        sample(grp_lvl, size = n, replace = TRUE, prob = c(0.3, 0.3, 0.4)),
        levels = grp_lvl
    ),
    sex = factor(
        sample(sex_lvl, size = n, replace = TRUE, prob = c(0.4, 0.6)),
        levels = sex_lvl
    ),
    age = rnorm(n, mean = 0, sd = 1),
    lambda = exp(
        log_hr_age * age +
        log_hr_sex[as.character(sex)] +
        log_hr_grp[as.character(grp)]
    ),
    time = rweibullPH(n, scale = lambda, shape = Shape),
    centime = rexp(n, 1 / 400)
) %>%
    mutate(event = ifelse(time <= centime, 1, 0)) %>%
    mutate(time = ifelse(time <= centime, time, centime)) %>% 
    select(pt, grp, age, sex, time, event)

dat

To conduct an analysis we need to use the apply_mcmc() function. However to use this function we first need to convert our data into the required format. Full details of the required format can be found in help(apply_mcmc) however an example of this is shown below:

dat2 <- dat %>%
    mutate(trt = ifelse(grp == "TRT", 1, 0)) %>%
    mutate(ext = ifelse(grp == "EC", 1, 0)) %>%
    mutate(cnsr = 1 - event) %>%
    select(pt, trt, ext, time, cnsr, sex, age)

dat2

We can now pass our data to apply_mcmc() to produce our posterior samples. Prior distributions are specified using the set_prior() function, details for which can be found in help(set_prior) as well as the user-guide vignette. Covariates for the analysis model need to be specified by a 1-sided formula provided to the formula_cov argument. Note that the treatment indicator is automatically added to the model and should not be specified in the covariate formula; if you require treatment interactions these should be specified explicitly via trt:cov rather than by trt*cov. The intercept term must always be included in this formula. You are advised to leave the pred parameter as either "all" or "ps" depending on your requirements. If you do wish to restrict the covariates included in the model then this should be done by omitting them from the formula_cov rather than the pred argument in set_prior().

postsamp <- apply_mcmc(
    dt = dat2,
    formula_cov = ~ 1 + age + sex,
    priorObj = set_prior(pred = "all", prior = "gamma", r0 = 0.85, alpha = c(0, 0)),
    n.chains = 1,
    n.adapt = 100,
    n.burn = 100,
    n.iter = 200,
    seed = 47
)

From here we can extract our posterior samples as follows:

head(extract_samples(postsamp))

For reference the returned parameters are:

If prior = "no_ext" or prior = "full_ext" then alpha[1] and alpha[2] will be replaced with a single parameter alpha representing the shape parameter for the single baseline distribution for the trail.

Additionally general summary statistics of our samples can be generated by the summary() function:

summary(postsamp)

References

  1. Lu, Y., Lin, A., Pang, H., Zhu, J. (2021). PSBORROW: BAYESIAN DYNAMIC BORROWING R TOOL FOR COMPLEX INNOVATIVE TRIAL DESIGNS. ASA Biopharmaceutical Report, 28 (3), 11-19.

  2. Lewis, C. J., Sarkar, S., Zhu, J., & Carlin, B. P. (2019). Borrowing from historical control data in cancer drug development: a cautionary tale and practical guidelines. Statistics in biopharmaceutical research, 11(1), 67-78.



Try the psborrow package in your browser

Any scripts or data that you put into this service are public.

psborrow documentation built on March 7, 2023, 8:32 p.m.