knitr::opts_chunk$set(echo = TRUE)

This vignette describes how retrieved dropout models which include time-varying intercurrent event (ICE) indicators can be implemented in the rbmi package.

Retrieved dropout models in a nutshell

Retrieved dropout models have been proposed for the analysis of estimands using the treatment policy strategy for addressing an ICE. In these models, missing outcomes are multiply imputed conditional upon whether they occur pre- or post-ICE. Retrieved dropout models typically rely on an extended missing-at-random (MAR) assumption, i.e., they assume that missing outcome data is similar to observed data from subjects in the same treatment group with the same observed outcome history, and the same ICE status. For a more comprehensive description and evaluation of retrieved dropout models, we refer to @Guizzaro2021, @PolverejanDragalin2020, @Noci2021, @Drury2024, and @Bell2024. Broadly, these publications find that retrieved dropout models reduce bias compared to alternative analysis approaches based on imputation under a basic MAR assumption or a reference-based missing data assumption. However, several issues of retrieved dropout models have also been highlighted. Retrieved dropout models require that enough post-ICE data is collected to inform the imputation model. Even with relatively small amounts of missingness, complex retrieved dropout models may face identifiability issues. Another drawback to these models in general is the loss of power relative to reference-based imputation methods, which becomes meaningful for post-ICE observation percentages below 50% and increases at an accelerating rate as this percentage decreases [@Bell2024].

Data simulation using function simulate_data() {#sec:dataSimul}

For the purposes of this vignette we will first create a simulated dataset with the rbmi function simulate_data(). The simulate_data() function generates data from a randomized clinical trial with longitudinal continuous outcomes and up to two different types of ICEs.

Specifically, we simulate a 1:1 randomized trial of an active drug (intervention) versus placebo (control) with 100 subjects per group and 4 post-baseline assessments (3-monthly visits until 12 months):

The function simulate_data() requires 3 arguments (see the function documentation help(simulate_data) for more details):

Below, we report how data according to the specifications above can be simulated with function simulate_data():

library(rbmi)
library(dplyr)

set.seed(1392)

time <- c(0, 3, 6, 9, 12)

# Mean trajectory control
muC <- c(50.0, 52.5, 55.0, 57.5, 60.0)

# Mean trajectory intervention
muT <- c(50.0, 52.5, 55.0, 56.25, 57.50)

# Create Sigma
sd_error <- 2.5
covRE <- rbind(
  c(25.0, 6.25),
  c(6.25, 25.0)
)

Sigma <- cbind(1, time / 12) %*%
    covRE %*% rbind(1, time / 12) +
    diag(sd_error^2, nrow = length(time))

# Set simulation parameters of the control group
parsC <- set_simul_pars(
    mu = muC,
    sigma = Sigma,
    n = 100, # sample size
    prob_ice1 = 0.03, # prob of discontinuation for outcome equal to 50
    or_outcome_ice1 = 1.10,  # +1 point increase => +10% odds of discontinuation
    prob_post_ice1_dropout = 0.5 # dropout rate following discontinuation
)

# Set simulation parameters of the intervention group
parsT <- parsC
parsT$mu <- muT
parsT$prob_ice1 <- 0.04

# Simulate data
data <- simulate_data(
    pars_c = parsC,
    pars_t = parsT,
    post_ice1_traj = "CIR" # Assumption about post-ice trajectory
) %>%
  select(-c(outcome_noICE, ind_ice2)) # remove unncessary columns


head(data)

The frequency of the ICE and proportion of data collected after the ICE impacts the variance of the treatment effect for retrieved dropout models. For example, a large proportion of ICE combined with a small proportion of data collected after the ICE might result in substantial variance inflation, especially for more complex retrieved dropout models.

The proportion of subjects with an ICE and the proportion of subjects who withdrew from the simulated study is summarized below:

# Compute endpoint of interest: change from baseline
data <- data %>% 
  filter(visit != "0") %>%
  mutate(
    change = outcome - outcome_bl,
    visit = factor(visit, levels = unique(visit))
  )


data %>%
  group_by(visit) %>% 
  summarise(
    freq_disc_ctrl = mean(ind_ice1[group == "Control"] == 1),
    freq_dropout_ctrl = mean(dropout_ice1[group == "Control"] == 1),
    freq_disc_interv = mean(ind_ice1[group == "Intervention"] == 1),
    freq_dropout_interv = mean(dropout_ice1[group == "Intervention"] == 1)
  )

For this study 23\% of the study participants discontinued from study treatment in the control arm and 24\% in the intervention arm. Approximately half of the participants who discontinued from treatment dropped-out from the study at the discontinuation visit leading to missing outcomes at subsequent visits.

Estimators based on retrieved dropout models

We consider retrieved dropout methods which model pre- and post-ICE outcomes jointly by including time-varying ICE indicators in the imputation model, i.e. we allow the occurrence of the ICE to impact the mean structure but not the covariance matrix. Imputation of missing outcomes is then performed under a MAR assumption including all observed data. For the analysis of the completed data, we use a standard ANCOVA model of the outcome at each follow-up visit, respectively, with treatment assignment as the main covariate and adjustment for the baseline outcome.

Specifically, we consider the following imputation models:

Implementation of the defined retrieved dropout models in rbmi

rbmi supports the inclusion of time-varying covariates in the imputation model. The only requirement is that the time-varying covariate is non-missing at all visits including those where the outcome might be missing. Imputation is performed under a (extended) MAR assumption. Therefore, all imputation approaches implemented in rbmi are valid and should yield comparable estimators and standard errors. For this vignette, we used the conditional mean imputation approach combined with the jackknife.

Basic MAR model

# Define key variables for the imputation and analysis models
vars <- set_vars(
  subjid = "id",
  visit = "visit",
  outcome = "change",
  group = "group",
  covariates = c("outcome_bl*visit", "group*visit")
)

vars_an <- vars
vars_an$covariates <- "outcome_bl"

# Define imputation method
method <- method_condmean(type = "jackknife")

draw_obj <- draws(
  data = data,
  data_ice = NULL,
  vars = vars,
  method = method,
  quiet = TRUE
)

impute_obj <- impute(
  draw_obj
)

ana_obj <- analyse(
  impute_obj,
  vars = vars_an
)

pool_obj_basicMAR <- pool(ana_obj)
pool_obj_basicMAR

Retrieved dropout model 1 (RD1)

# derive variable "time_since_ice1" (time since ICE in months)
data <- data %>% 
  group_by(id) %>% 
  mutate(time_since_ice1 = cumsum(ind_ice1)*3)

vars$covariates <- c("outcome_bl*visit", "group*visit", "time_since_ice1*group")

draw_obj <- draws(
  data = data,
  data_ice = NULL,
  vars = vars,
  method = method,
  quiet = TRUE
)

impute_obj <- impute(
  draw_obj
)

ana_obj <- analyse(
  impute_obj,
  vars = vars_an
)

pool_obj_RD1 <- pool(ana_obj)
pool_obj_RD1

Retrieved dropout model 2 (RD2)

vars$covariates <- c("outcome_bl*visit", "group*visit", "ind_ice1*group*visit")

draw_obj <- draws(
  data = data,
  data_ice = NULL,
  vars = vars,
  method = method,
  quiet = TRUE
)

impute_obj <- impute(
  draw_obj
)

ana_obj <- analyse(
  impute_obj,
  vars = vars_an
)

pool_obj_RD2 <- pool(ana_obj)
pool_obj_RD2

Brief summary of results

The point estimators of the treatment effect at the last visit were r formatC(pool_obj_basicMAR$pars$trt_4$est,format="f",digits=3), r formatC(pool_obj_RD1$pars$trt_4$est,format="f",digits=3), and r formatC(pool_obj_RD2$pars$trt_4$est,format="f",digits=3) for the basic MAR, RD1, and RD2 estimators, respectively, i.e. slightly smaller for the retrieved dropout models compared to the basic MAR model. The corresponding standard errors of the 3 estimators were r formatC(pool_obj_basicMAR$pars$trt_4$se,format="f",digits=3), r formatC(pool_obj_RD1$pars$trt_4$se,format="f",digits=3), and r formatC(pool_obj_RD2$pars$trt_4$se,format="f",digits=3), i.e. slightly larger for the retrieved dropout models compared to the basic MAR model.

References {.unlisted .unnumbered}



insightsengineering/rbmi documentation built on Feb. 28, 2025, 3:34 a.m.