Time-Varying Causal Excursion Effect Mediation in MRT: Continuous Distal Outcomes

knitr::opts_chunk$set(
    collapse = TRUE,
    comment = "#>"
)
options(digits = 4)

library(MRTAnalysis)

The MRTAnalysis package now supports analysis of mediated causal excursion effect of time-varying treatments, time-varying mediators, and a continuous distal outcome in micro-randomized trials (MRTs), using the function mcee().
Distal outcomes are measured once at the end of the study (e.g., weight loss, cognitive score), in contrast to proximal outcomes which are repeatedly measured after each treatment decision point. Time-varying mediators can often be the proximal outcomes in MRTs.

1. Introduction

Micro-randomized trials (MRTs) are experimental designs to evaluate and optimize quickly-adapting interventions, where each participant is randomized among treatment options many times throughout the study. This vignette shows how to use the mcee (stands for "mediated causal excursion effect") function to assess how treatment effects on a distal outcome are mediated through intermediate variables (e.g., the activity prompt -> near-term activity -> long-term cardiovascular health pathway).

This package implements estimation of the natural direct excursion effect (NDEE) and the natural indirect excursion effect (NIEE) in MRTs and related longitudinal designs:

The modeling of the direct and indirect effects is allowed to vary with the decision point index, specified through the argument time_varying_effect_form. For example, one may model NIEE and NDEE as constants over time (using ~1) or as quadratic functions of the decision point (using ~ dp + I(dp^2)). The parameter that parameterizes NIEE is denoted by $\beta$ and the one for NDEE is $\alpha$. At present, the software does not support mediation effects conditional on baseline or other time-varying covariates. For full details of the estimands, see @qian2025dynamic.

Estimation framework

For applied users: if analyzing an MRT, you typically know the randomization probability, and need to provide a control_formula_with_mediator describing covariates and mediators to include in nuisance models. By default, all nuisance regressions are estimated with GLMs, but you can substitute other machine learning methods.

For statistically inclined readers: the estimator is multiply robust and generalizes the semiparametric estimators of Tchetgen Tchetgen & Shpitser (2012) to longitudinal settings. It's a two-stage estimation process, where the first stage fits five nuisance functions and the second stage estimates the parameterized NIEE and NDEE. The five nuisance functions are (see @qian2025dynamic for the precise definition):

Three user-facing wrappers

The package provides three entry points depending on how much control you want:

The remainder of this vignette will first focus on mcee for a quick introduction, then show how to use the more flexible wrappers.


2. Installation

You can install the MRTAnalysis package from CRAN:

# install.packages("MRTAnalysis")

3. Quick Start Example

We illustrate the most basic usage of mcee on a small simulated dataset. Note that the dataset is in long format, one row per subject-by-decision point. The distal outcome is repeated on each row as a constant within subject.

set.seed(123)

# Simulate a toy dataset
n <- 20
T_val <- 5
id <- rep(1:n, each = T_val)
dp <- rep(1:T_val, times = n)
A <- rbinom(n * T_val, 1, 0.5)
M <- rbinom(n * T_val, 1, plogis(-0.2 + 0.3 * A + 0.1 * dp))
Y <- ave(0.5 * A + 0.7 * M + 0.2 * dp + rnorm(n * T_val), id) # constant within id

dat <- data.frame(id, dp, A, M, Y)

# Minimal mcee call
fit <- mcee(
    data = dat,
    id = "id", dp = "dp",
    outcome = "Y", treatment = "A", mediator = "M",
    time_varying_effect_form = ~1, # constant-over-time NDEE and NIEE
    control_formula_with_mediator = ~ dp + M, # nuisance adjustment
    control_reg_method = "glm", # default method
    rand_prob = 0.5, # known randomization prob
    verbose = FALSE
)

summary(fit)

The summary output provides estimates of Natural Direct Excursion Effect (NDEE; $\alpha$) and Natural Indirect Excursion Effect (NIEE; $\beta$), with standard errors, 95\% confidence intervals, and p-values. The only row in the output Natural Direct Excursion Effect (alpha) and Natural Indirect Excursion Effect (beta) is named Intercept, indicating that these effects are modeled as constants over time (like intercepts in the effect models). In particular, the estimated NDEE is 0.17 (95\% CI: -0.08, 0.42; p-value: 0.17) and the estimated NIEE is 0.025 (95\% CI: -0.002, 0.054; p-value: 0.06). The confidence intervals and p-values are based on t-quantiles.

4. A More Complex Data Example

This package ships a small example dataset, data_time_varying_mediator_distal_outcome. It is in long format, one row per subject-by-decision point.

Columns (Variables)

Peek at the included dataset

data(data_time_varying_mediator_distal_outcome)

dat <- data_time_varying_mediator_distal_outcome

dplyr::glimpse(dat)

dplyr::count(dat, id, name = "Ti") |>
    dplyr::summarise(mean_T = mean(Ti), min_T = min(Ti), max_T = max(Ti))

# Delete some decision points for certain individuals to mimic scenarios
# where not all individuals have the same number of decision points.
dat <- dat[!((dat$id == 1 & dat$dp == 10) | (dat$id == 2 & dat$dp %in% c(9, 10))), ]
dplyr::count(dat, id, name = "Ti") |>
    dplyr::summarise(mean_T = mean(Ti), min_T = min(Ti), max_T = max(Ti))

5. Detailed Walkthrough of mcee

We focus first on the streamlined MRT workflow with known randomization probability. Throughout, we'll use the included dataset.

5.1 Basic usage (GLM; constant effects)

set.seed(1)
fit1 <- mcee(
    data = dat,
    id = "id",
    dp = "dp",
    outcome = "Y",
    treatment = "A",
    mediator = "M",
    availability = "I",
    rand_prob = "p_A",
    time_varying_effect_form = ~1, # NDEE and NIEE are constant over time
    control_formula_with_mediator = ~ dp + M + X, # covariate adjustment
    control_reg_method = "glm", # nuisance learners for q, eta, mu, nu
    verbose = FALSE
)
summary(fit1)

5.2 Time-varying effects (linear in dp)

fit2 <- mcee(
    data = dat,
    id = "id", dp = "dp", outcome = "Y", treatment = "A", mediator = "M",
    availability = "I", rand_prob = "p_A",
    time_varying_effect_form = ~dp, # NDEE, NIEE vary linearly in time
    control_formula_with_mediator = ~ dp + M + X,
    control_reg_method = "glm",
    verbose = FALSE
)
summary(fit2) # rows now labeled (Intercept) and dp

Interpretation: the rows for dp report how the natural direct/indirect excursion effects change per unit increase in decision point.

Tip. If your dataset already contains a basis of time (e.g., spline columns), you can reference them in time_varying_effect_form. The package will warn when variables other than dp appear there, but this is allowed for precomputed time bases.

5.3 Other learners for nuisance fitting

You can switch the learner used for fitting the nuisance functions via control_reg_method:

# Example: GAM (generalized additive model)
set.seed(2)
fit3 <- mcee(
    data = dat,
    id = "id", dp = "dp", outcome = "Y", treatment = "A", mediator = "M",
    availability = "I", rand_prob = "p_A",
    time_varying_effect_form = ~dp,
    control_formula_with_mediator = ~ s(dp) + s(M) + s(X), # spline formula for mgcv::gam
    control_reg_method = "gam",
    verbose = FALSE
)
summary(fit3)

5.4 Estimating effects at specific decision points

If interested in estimating direct and indirect effects of intervention at a specific decision point (or a few decision points), you can use the specific_dp_only argument. For example, to estimate effects at the first two decision points (1 and 2) only, you can either:

fit4 <- mcee(
    data = dat,
    id = "id", dp = "dp", outcome = "Y", treatment = "A", mediator = "M",
    availability = "I", rand_prob = "p_A",
    time_varying_effect_form = ~1,
    control_formula_with_mediator = ~ dp + M + X,
    control_reg_method = "glm",
    specific_dp_only = c(1, 2),
    verbose = FALSE
)
summary(fit4)

The effect estimates are now an average over decision points 1 and 2.

In the next section of the vignette, we'll show how to use the more flexible wrappers, mcee_general and mcee_userfit_nuisance.

6. Testing for Linear Combinations of Coefficients

Every wrapper in this package returns an object of class mcee_fit. This object is a list with several components.

6.1 Estimates $mcee_fit

fit1$mcee_fit

6.2 Linear combinations

The lincomb_joint, lincomb_alpha, and lincomb_beta arguments to summary() lets you compute linear combinations of the estimated coefficients. For example:

Example 1: Difference $\alpha - \beta$ for constant‑effect model (fit1)

# difference between direct and indirect excursion effects
summary(fit1, lincomb_joint = c(1, -1))

This requests the linear combination $\alpha - \beta$, with standard error and t-statistic computed from the covariance matrix.

Example 2: Effects at decision point 10 for linear‑trend model (fit2)

Suppose you fit a model with a linear time basis, e.g.:

fit2 <- mcee(
    data = dat,
    id = "id", dp = "dp", outcome = "Y", treatment = "A", mediator = "M",
    availability = "I", rand_prob = "p_A",
    time_varying_effect_form = ~dp,
    control_formula_with_mediator = ~ dp + M + X,
    control_reg_method = "glm",
    verbose = FALSE
)

To get the effect at decision point 10 (the last decision point), you need the intercept plus 9 times the slope:

summary(fit2, lincomb_alpha = c(1, 9), lincomb_beta = c(1, 9))

If you prefer a joint contrast (e.g., alpha(t=10) − beta(t=10)), stack the two contrasts into a single row over c(alpha, beta):

L_joint_t10 <- matrix(c(
    1, 9, # alpha part
    -1, -9 # beta part
), nrow = 1)
summary(fit2, lincomb_joint = L_joint_t10)

6.3 Inspecting nuisance models

For diagnostics, the returned object also includes:

One can use summary(..., show_nuisance = TRUE) to get a compact summary of the nuisance fits.

summary(fit1, show_nuisance = TRUE)

head(fit1$nuisance_fitted$mu1)

Tip. Because varcov is the joint covariance of c(alpha, beta), you can build any custom Wald test by forming your own contrast matrix L and using L %*% c(alpha, beta) with variance L %*% varcov %*% t(L). This is equivalent to using lincomb_joint = L in summary().

7. Other Wrappers: mcee_general and mcee_userfit_nuisance

While mcee covers the common MRT use case (known randomization, one control formula for all nuisance functions), two additional wrappers provide more flexibility in controlling how the nuisance functions are fitted:

7.1 mcee_general

This interface lets you configure each nuisance function separately.

Helper functions can be used to build these objects: mcee_config_maker, mcee_config_known, mcee_config_glm, mcee_config_gam, mcee_config_lm, mcee_config_rf, mcee_config_ranger, mcee_config_sl, mcee_config_sl_user.

Example: flexible nuisance configs

# Families (binomial vs Gaussian) are chosen automatically when omitted; for `p` and `q` the default is binomial, for the outcome regressions it is Gaussian.

cfg <- list(
    p   = mcee_config_known("p", dat$p_A), # known randomization prob in MRT
    q   = mcee_config_glm("q", ~ dp + X + M),
    eta = mcee_config_glm("eta", ~ dp + X),
    mu  = mcee_config_glm("mu", ~ dp + X + M),
    nu  = mcee_config_glm("nu", ~ dp + X)
)

fit_gen <- mcee_general(
    data = dat,
    id = "id", dp = "dp", outcome = "Y",
    treatment = "A", mediator = "M",
    availability = "I",
    time_varying_effect_form = ~dp,
    config_p = cfg$p, config_q = cfg$q,
    config_eta = cfg$eta, config_mu = cfg$mu, config_nu = cfg$nu,
    verbose = FALSE
)
summary(fit_gen)

Tip. This interface is particularly useful for observational studies where the treatment mechanism p is unknown and must be estimated. You can set config_p to a regression formula and method of your choice.

7.2 mcee_userfit_nuisance

Sometimes you fit nuisance models externally, perhaps with custom ML workflows. In that case, you can bypass Stage‑1 nuisance model fitting and supply predictions directly.

Example: manual nuisance fits with GLM

# Fit nuisance regressions manually: p, q, eta, mu, nu
p1_hat <- dat$p_A # known randomization prob in MRT
p1_hat[dat$I == 0] <- 1 # manually set this to avoid wrapper message
q1_hat <- predict(glm(A ~ dp + X + M, family = binomial(), data = dat[dat$I == 1, ]),
    type = "response", newdata = dat
)
q1_hat[dat$I == 0] <- 1 # manually set this to avoid wrapper message
eta1_hat <- predict(lm(Y ~ dp + X, data = dat[dat$A == dat$I, ]), newdata = dat)
eta0_hat <- predict(lm(Y ~ dp + X, data = dat[dat$A == 0, ]), newdata = dat)
mu1_hat <- predict(lm(Y ~ dp + X + M, data = dat[dat$A == dat$I, ]), newdata = dat)
mu0_hat <- predict(lm(Y ~ dp + X + M, data = dat[dat$A == 0, ]), newdata = dat)
nu1_hat <- predict(lm(mu1 ~ dp + X, data = cbind(dat, mu1 = mu1_hat)[dat$A == 0, ]), newdata = dat)
nu0_hat <- predict(lm(mu0 ~ dp + X, data = cbind(dat, mu0 = mu0_hat)[dat$A == dat$I, ]), newdata = dat)

fit_usr <- mcee_userfit_nuisance(
    data = dat,
    id = "id", dp = "dp", outcome = "Y",
    treatment = "A", mediator = "M",
    availability = "I",
    time_varying_effect_form = ~dp,
    p1 = p1_hat,
    q1 = q1_hat,
    eta1 = eta1_hat, eta0 = eta0_hat,
    mu1 = mu1_hat, mu0 = mu0_hat,
    nu1 = nu1_hat, nu0 = nu0_hat,
    verbose = FALSE
)
summary(fit_usr)

7.3 Comparing results

Different wrappers will produce the exact same result if the same nuisance models are used. Here we compare fit2 (from mcee) and fit_gen (from mcee_general) which use the same GLM nuisance models and the same control variables.

fit2$mcee_fit[c("alpha_hat", "beta_hat", "alpha_se", "beta_se")]
fit_gen$mcee_fit[c("alpha_hat", "beta_hat", "alpha_se", "beta_se")]

all.equal(fit2$mcee_fit, fit_gen$mcee_fit, tolerance = 1e-6)
all.equal(fit2$mcee_fit, fit_usr$mcee_fit, tolerance = 1e-6)

8. Best Practices


9. Conclusion

The mcee package provides a multiply robust estimator of natural direct and indirect excursion effects in micro-randomized trials and observational longitudinal studies with time-varying treatments, time-varying mediators and a distal outcome. Three wrapper functions serve different needs:

Together, these tools allow both applied researchers and methodologists to estimate mediated excursion effects with transparent assumptions and flexible modeling options.

Appendix: Estimating Equations and Asymptotic Variance

This appendix documents how the core calculation happens in .mcee_core_rows(), which is a detailed derivation of Eq. (12) and the asymptotic variance of Theorem 4 in @qian2025dynamic.

The proposed estimating function is

[ \psi(\gamma; \zeta) := \sum_{t=1}^T \omega(t) \begin{bmatrix} { \phi_t^{10}(p_t, q_t, \mu_t, \nu_t) - \phi_t^{00}(p_t, \eta_t) - f(t)^\top \alpha } f(t) \ { \phi_t^{11}(p_t, \eta_t) - \phi_t^{10}(p_t, q_t, \mu_t, \nu_t) - f(t)^\top \beta } f(t) \end{bmatrix}. ]

The estimating equation (EE), omitting the nuisance parameters and allowing different (T_i) for each subject (i), is

[ \frac{1}{n}\sum_{i=1}^n \sum_{t=1}^{T_i} \omega(i,t) \begin{bmatrix} { \phi_{it}^{10} - \phi_{it}^{00} - f(t)^\top \alpha } f(t) \ { \phi_{it}^{11} - \phi_{it}^{10} - f(t)^\top \beta } f(t) \end{bmatrix} = 0. ]

The estimators are computed by

[ \hat\alpha = \left[ \frac{1}{n}\sum_{i=1}^n \sum_{t=1}^{T_i} \omega(i,t) f(t)f(t)^\top \right]^{-1} \left[ \frac{1}{n}\sum_{i=1}^n \sum_{t=1}^{T_i} \omega(i,t) { \phi_{it}^{10} - \phi_{it}^{00} } f(t) \right], ]

[ \hat\beta = \left[ \frac{1}{n}\sum_{i=1}^n \sum_{t=1}^{T_i} \omega(i,t) f(t)f(t)^\top \right]^{-1} \left[ \frac{1}{n}\sum_{i=1}^n \sum_{t=1}^{T_i} \omega(i,t) { \phi_{it}^{11} - \phi_{it}^{10} } f(t) \right]. ]

The asymptotic variance of ((\alpha^\top, \beta^\top)^\top) can be estimated by

[ \text{Var}(\hat\gamma) \approx \text{Bread}^{-1} \, \text{Meat} \, (\text{Bread}^{-1})^\top, ]

where

[ \text{Bread} = \frac{1}{n}\sum_{i=1}^n \sum_{t=1}^{T_i} \omega(i,t) \begin{bmatrix} f(t)f(t)^\top & 0 \ 0 & f(t)f(t)^\top \end{bmatrix}, ]

[ \text{Meat} = \frac{1}{n}\sum_{i=1}^n \left( \sum_{t=1}^{T_i} \omega(i,t) \begin{bmatrix} { \phi_{it}^{10} - \phi_{it}^{00} - f(t)^\top \alpha } f(t) \ { \phi_{it}^{11} - \phi_{it}^{10} - f(t)^\top \beta } f(t) \end{bmatrix} \right)^{\otimes 2}. ]

Reference



Try the MRTAnalysis package in your browser

Any scripts or data that you put into this service are public.

MRTAnalysis documentation built on Sept. 9, 2025, 5:41 p.m.