knitr::opts_chunk$set(echo = TRUE)
This vignette describes how retrieved dropout models which include time-varying intercurrent event (ICE) indicators can be
implemented in the rbmi
package.
Retrieved dropout models have been proposed for the analysis of estimands using the treatment policy strategy for addressing an ICE. In these models, missing outcomes are multiply imputed conditional upon whether they occur pre- or post-ICE. Retrieved dropout models typically rely on an extended missing-at-random (MAR) assumption, i.e., they assume that missing outcome data is similar to observed data from subjects in the same treatment group with the same observed outcome history, and the same ICE status. For a more comprehensive description and evaluation of retrieved dropout models, we refer to @Guizzaro2021, @PolverejanDragalin2020, @Noci2021, @Drury2024, and @Bell2024. Broadly, these publications find that retrieved dropout models reduce bias compared to alternative analysis approaches based on imputation under a basic MAR assumption or a reference-based missing data assumption. However, several issues of retrieved dropout models have also been highlighted. Retrieved dropout models require that enough post-ICE data is collected to inform the imputation model. Even with relatively small amounts of missingness, complex retrieved dropout models may face identifiability issues. Another drawback to these models in general is the loss of power relative to reference-based imputation methods, which becomes meaningful for post-ICE observation percentages below 50% and increases at an accelerating rate as this percentage decreases [@Bell2024].
simulate_data()
{#sec:dataSimul}For the purposes of this vignette we will first create a simulated dataset with the rbmi
function simulate_data()
.
The simulate_data()
function generates data from a randomized clinical trial with longitudinal continuous outcomes and up to two
different types of ICEs.
Specifically, we simulate a 1:1 randomized trial of an active drug (intervention) versus placebo (control) with 100 subjects per group and 4 post-baseline assessments (3-monthly visits until 12 months):
The function simulate_data()
requires 3 arguments (see the function documentation help(simulate_data)
for more details):
pars_c
: The simulation parameters of the control group.pars_t
: The simulation parameters of the intervention group.post_ice1_traj
: Specifies how observed outcomes after ICE1 are simulated.Below, we report how data according to the specifications above can be simulated with function simulate_data()
:
library(rbmi) library(dplyr) set.seed(1392) time <- c(0, 3, 6, 9, 12) # Mean trajectory control muC <- c(50.0, 52.5, 55.0, 57.5, 60.0) # Mean trajectory intervention muT <- c(50.0, 52.5, 55.0, 56.25, 57.50) # Create Sigma sd_error <- 2.5 covRE <- rbind( c(25.0, 6.25), c(6.25, 25.0) ) Sigma <- cbind(1, time / 12) %*% covRE %*% rbind(1, time / 12) + diag(sd_error^2, nrow = length(time)) # Set simulation parameters of the control group parsC <- set_simul_pars( mu = muC, sigma = Sigma, n = 100, # sample size prob_ice1 = 0.03, # prob of discontinuation for outcome equal to 50 or_outcome_ice1 = 1.10, # +1 point increase => +10% odds of discontinuation prob_post_ice1_dropout = 0.5 # dropout rate following discontinuation ) # Set simulation parameters of the intervention group parsT <- parsC parsT$mu <- muT parsT$prob_ice1 <- 0.04 # Simulate data data <- simulate_data( pars_c = parsC, pars_t = parsT, post_ice1_traj = "CIR" # Assumption about post-ice trajectory ) %>% select(-c(outcome_noICE, ind_ice2)) # remove unncessary columns head(data)
The frequency of the ICE and proportion of data collected after the ICE impacts the variance of the treatment effect for retrieved dropout models. For example, a large proportion of ICE combined with a small proportion of data collected after the ICE might result in substantial variance inflation, especially for more complex retrieved dropout models.
The proportion of subjects with an ICE and the proportion of subjects who withdrew from the simulated study is summarized below:
# Compute endpoint of interest: change from baseline data <- data %>% filter(visit != "0") %>% mutate( change = outcome - outcome_bl, visit = factor(visit, levels = unique(visit)) ) data %>% group_by(visit) %>% summarise( freq_disc_ctrl = mean(ind_ice1[group == "Control"] == 1), freq_dropout_ctrl = mean(dropout_ice1[group == "Control"] == 1), freq_disc_interv = mean(ind_ice1[group == "Intervention"] == 1), freq_dropout_interv = mean(dropout_ice1[group == "Intervention"] == 1) )
For this study 23\% of the study participants discontinued from study treatment in the control arm and 24\% in the intervention arm. Approximately half of the participants who discontinued from treatment dropped-out from the study at the discontinuation visit leading to missing outcomes at subsequent visits.
We consider retrieved dropout methods which model pre- and post-ICE outcomes jointly by including time-varying ICE indicators in the imputation model, i.e. we allow the occurrence of the ICE to impact the mean structure but not the covariance matrix. Imputation of missing outcomes is then performed under a MAR assumption including all observed data. For the analysis of the completed data, we use a standard ANCOVA model of the outcome at each follow-up visit, respectively, with treatment assignment as the main covariate and adjustment for the baseline outcome.
Specifically, we consider the following imputation models:
rbmi
is not based on sequential imputation but
rather, all missing outcomes are imputed simultaneously based on a MMRM-type imputation model. We include baseline outcome by
visit and treatment group by visit interaction terms in the imputation model which is of the form:
change ~ outcome_bl*visit + group*visit
.change ~ outcome_bl*visit + group*visit + time_since_ice1*group
, where time_since_ice1
is set to 0 up to
the treatment discontinuation and to the time from treatment
discontinuation (in months) at subsequent visits. This implies a change in the slope of the outcome trajectories after the ICE, which
is modeled separately for each treatment arm. This model is similar to the "TV2-MAR" estimator in @Noci2021. Compared to the basic
MAR model, this model requires estimation of 2 additional parameters.change ~ outcome_bl*visit + group*visit +
ind_ice1*group*visit
. This assumes a constant shift in outcomes after the ICE, which is modeled separately for each treatment arm
and each visit. This model is analogous to the "MI2" model in @Bell2024. Compared to the basic MAR model, this model requires
estimation of 2 times "number of visits" additional parameters. It makes different though rather weaker assumptions than the RD1
model but might also be harder to fit if post-ICE data collection is sparse at some visits. rbmi
rbmi
supports the inclusion of time-varying covariates in the imputation model. The only requirement is that the time-varying
covariate is non-missing at all visits including those where the outcome might be missing.
Imputation is performed under a (extended) MAR assumption. Therefore, all imputation approaches implemented in rbmi
are valid and
should yield comparable estimators and standard errors.
For this vignette, we used the conditional mean imputation approach combined with the jackknife.
# Define key variables for the imputation and analysis models vars <- set_vars( subjid = "id", visit = "visit", outcome = "change", group = "group", covariates = c("outcome_bl*visit", "group*visit") ) vars_an <- vars vars_an$covariates <- "outcome_bl" # Define imputation method method <- method_condmean(type = "jackknife") draw_obj <- draws( data = data, data_ice = NULL, vars = vars, method = method, quiet = TRUE ) impute_obj <- impute( draw_obj ) ana_obj <- analyse( impute_obj, vars = vars_an ) pool_obj_basicMAR <- pool(ana_obj) pool_obj_basicMAR
# derive variable "time_since_ice1" (time since ICE in months) data <- data %>% group_by(id) %>% mutate(time_since_ice1 = cumsum(ind_ice1)*3) vars$covariates <- c("outcome_bl*visit", "group*visit", "time_since_ice1*group") draw_obj <- draws( data = data, data_ice = NULL, vars = vars, method = method, quiet = TRUE ) impute_obj <- impute( draw_obj ) ana_obj <- analyse( impute_obj, vars = vars_an ) pool_obj_RD1 <- pool(ana_obj) pool_obj_RD1
vars$covariates <- c("outcome_bl*visit", "group*visit", "ind_ice1*group*visit") draw_obj <- draws( data = data, data_ice = NULL, vars = vars, method = method, quiet = TRUE ) impute_obj <- impute( draw_obj ) ana_obj <- analyse( impute_obj, vars = vars_an ) pool_obj_RD2 <- pool(ana_obj) pool_obj_RD2
The point estimators of the treatment effect at the last visit were
r formatC(pool_obj_basicMAR$pars$trt_4$est,format="f",digits=3)
,
r formatC(pool_obj_RD1$pars$trt_4$est,format="f",digits=3)
, and
r formatC(pool_obj_RD2$pars$trt_4$est,format="f",digits=3)
for the basic MAR, RD1, and RD2 estimators, respectively, i.e. slightly
smaller for the retrieved dropout models compared to the basic MAR model.
The corresponding standard errors of the 3 estimators were
r formatC(pool_obj_basicMAR$pars$trt_4$se,format="f",digits=3)
,
r formatC(pool_obj_RD1$pars$trt_4$se,format="f",digits=3)
, and
r formatC(pool_obj_RD2$pars$trt_4$se,format="f",digits=3)
, i.e. slightly larger for the retrieved dropout models compared to the
basic MAR model.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.