lmtp_sub: LMTP Substitution Estimator

View source: R/estimators.R

lmtp_subR Documentation

LMTP Substitution Estimator

Description

G-computation estimator for the effects of traditional causal effects and modified treatment policies for both point treatment and longitudinal data with binary, continuous, or time-to-event outcomes. Supports binary, categorical, and continuous exposures.

Usage

lmtp_sub(
  data,
  trt,
  outcome,
  baseline = NULL,
  time_vary = NULL,
  cens = NULL,
  shift = NULL,
  shifted = NULL,
  k = Inf,
  outcome_type = c("binomial", "continuous", "survival"),
  id = NULL,
  bounds = NULL,
  learners = "SL.glm",
  folds = 10,
  weights = NULL,
  .bound = 1e-05,
  .learners_folds = 10,
  .return_full_fits = FALSE
)

Arguments

data

[data.frame]
A data.frame in wide format containing all necessary variables for the estimation problem. Must not be a data.table.

trt

[character]
A vector containing the column names of treatment variables ordered by time.

outcome

[character]
The column name of the outcome variable. In the case of time-to-event analysis, a vector containing the columns names of intermediate outcome variables and the final outcome variable ordered by time. Only numeric values are allowed. If the outcome type is binary, data should be coded as 0 and 1.

baseline

[character]
An optional vector containing the column names of baseline covariates to be included for adjustment at every time point.

time_vary

[list]
A list the same length as the number of time points of observation with the column names for new time-varying covariates introduced at each time point. The list should be ordered following the time ordering of the model.

cens

[character]
An optional vector of column names of censoring indicators the same length as the number of time points of observation. If missingness in the outcome is present or if time-to-event outcome, must be provided.

shift

[closure]
A two argument function that specifies how treatment variables should be shifted. See examples for how to specify shift functions for continuous, binary, and categorical exposures.

shifted

[data.frame]
An optional data frame, the same as in data, but modified according to the treatment policy of interest. If specified, shift is ignored.

k

[integer(1)]
An integer specifying how previous time points should be used for estimation at the given time point. Default is Inf, all time points.

outcome_type

[character(1)]
Outcome variable type (i.e., continuous, binomial, survival).

id

[character(1)]
An optional column name containing cluster level identifiers.

bounds

[numeric(2)]
An optional, ordered vector of the bounds for a continuous outcomes. If NULL, the bounds will be taken as the minimum and maximum of the observed data. Should be left as NULL if the outcome type is binary.

learners

[character]
A vector of SuperLearner algorithms for estimation of the outcome regression. Default is "SL.glm", a main effects GLM.

folds

[integer(1)]
The number of folds to be used for cross-fitting.

weights

[numeric(nrow(data))]
An optional vector containing sampling weights.

.bound

[numeric(1)]
Determines that maximum and minimum values (scaled) predictions will be bounded by. The default is 1e-5, bounding predictions by 1e-5 and 0.9999.

.learners_folds

[integer(1)]
The number of cross-validation folds for learners.

.return_full_fits

[logical(1)]
Return full SuperLearner fits? Default is FALSE, return only SuperLearner weights.

Value

A list of class lmtp containing the following components:

estimator

The estimator used, in this case "substitution".

theta

The estimated population LMTP effect.

standard_error

NA

low

NA

high

NA

shift

The shift function specifying the treatment policy of interest.

outcome_reg

An n x Tau + 1 matrix of outcome regression predictions. The mean of the first column is used for calculating theta.

fits_m

A list the same length as folds, containing the fits at each time-point for each fold for the outcome regression.

outcome_type

The outcome variable type.

Examples


  set.seed(56)
  n <- 1000
  W <- rnorm(n, 10, 5)
  A <- 23 + 5*W + rnorm(n)
  Y <- 7.2*A + 3*W + rnorm(n)
  ex1_dat <- data.frame(W, A, Y)

  # Example 1.1
  # Point treatment, continuous exposure, continuous outcome, no loss-to-follow-up
  # Interested in the effect of a modified treatment policy where A is decreased by 15
  # units only among observations whose observed A was above 80.
  # The true value under this intervention is about 513.
  policy <- function(data, x) {
    (data[[x]] > 80)*(data[[x]] - 15) + (data[[x]] <= 80)*data[[x]]
  }

  lmtp_sub(ex1_dat, "A", "Y", "W", shift = policy,
           outcome_type = "continuous", folds = 2)

  # Example 2.1
  # Longitudinal setting, time-varying continuous exposure bounded by 0,
  # time-varying covariates, and a binary outcome with no loss-to-follow-up.
  # Interested in the effect of a treatment policy where exposure decreases by
  # one unit at every time point if an observations observed exposure is greater
  # than or equal to 2. The true value under this intervention is about 0.305.
  head(sim_t4)

  A <- c("A_1", "A_2", "A_3", "A_4")
  L <- list(c("L_1"), c("L_2"), c("L_3"), c("L_4"))

  policy <- function(data, trt) {
    a <- data[[trt]]
    (a - 1) * (a - 1 >= 1) + a * (a - 1 < 1)
  }

  # BONUS: progressr progress bars!
  progressr::handlers(global = TRUE)

  lmtp_sub(sim_t4, A, "Y", time_vary = L, shift = policy, folds = 2)

  # Example 2.2
  # The previous example assumed that the outcome (as well as the treatment variables)
  # were directly affected by all other nodes in the past. In certain situations,
  # domain specific knowledge may suggest otherwise.
  # This can be controlled using the k argument.
  lmtp_sub(sim_t4, A, "Y", time_vary = L, shift = policy, k = 0, folds = 2)

  # Example 2.3
  # Using the same data as examples 2.1 and 2.2.
  # Now estimating the effect of a dynamic modified treatment policy.

  # creating a dynamic mtp that applies the shift function
  # but also depends on history and the current time
  policy <- function(data, trt) {
    mtp <- function(data, trt) {
      (data[[trt]] - 1) * (data[[trt]] - 1 >= 1) + data[[trt]] * (data[[trt]] - 1 < 1)
    }

    # if its the first time point, follow the same mtp as before
    if (trt == "A_1") return(mtp(data, trt))

    # otherwise check if the time varying covariate equals 1
    ifelse(
      data[[sub("A", "L", trt)]] == 1,
      mtp(data, trt), # if yes continue with the policy
      data[[trt]]     # otherwise do nothing
    )
  }

  lmtp_sub(sim_t4, A, "Y", time_vary = L, k = 0, shift = policy, folds = 2)

  # Example 2.4
  # Using the same data as examples 2.1, 2.2, and 2.3, but now treating the exposure
  # as an ordered categorical variable. To account for the exposure being a
  # factor we just need to modify the shift function (and the original data)
  # so as to respect this.
  tmp <- sim_t4
  for (i in A) {
    tmp[[i]] <- factor(tmp[[i]], levels = 0:5, ordered = TRUE)
  }

  policy <- function(data, trt) {
    out <- list()
    a <- data[[trt]]
    for (i in 1:length(a)) {
      if (as.character(a[i]) %in% c("0", "1")) {
        out[[i]] <- as.character(a[i])
      } else {
        out[[i]] <- as.numeric(as.character(a[i])) - 1
      }
    }
    factor(unlist(out), levels = 0:5, ordered = TRUE)
  }

  lmtp_sub(tmp, A, "Y", time_vary = L, shift = policy, k = 0, folds = 2)

  # Example 3.1
  # Longitudinal setting, time-varying binary treatment, time-varying covariates
  # and baseline covariates with no loss-to-follow-up. Interested in a "traditional"
  # causal effect where treatment is set to 1 at all time points for all observations.
  if (require("twang")) {
    data("iptwExWide", package = "twang")

    A <- paste0("tx", 1:3)
    W <- c("gender", "age")
    L <- list(c("use0"), c("use1"), c("use2"))

    lmtp_sub(iptwExWide, A, "outcome", baseline = W, time_vary = L,
             shift = static_binary_on, outcome_type = "continuous")
  }

  # Example 4.1
  # Longitudinal setting, time-varying continuous treatment, time-varying covariates,
  # binary outcome with right censoring. Interested in the mean population outcome under
  # the observed exposures in a hypothetical population with no loss-to-follow-up.
  head(sim_cens)

  A <- c("A1", "A2")
  L <- list(c("L1"), c("L2"))
  C <- c("C1", "C2")
  Y <- "Y"

  lmtp_sub(sim_cens, A, Y, time_vary = L, cens = C, shift = NULL, folds = 2)

  # Example 5.1
  # Time-to-event analysis with a binary time-invariant exposure. Interested in
  # the effect of treatment being given to all observations on the cumulative
  # incidence of the outcome.
  # For a survival problem, the outcome argument now takes a vector of outcomes
  # if an observation experiences the event prior to the end of follow-up, all future
  # outcome nodes should be set to 1 (i.e., last observation carried forward).
  A <- "trt"
  Y <- paste0("Y.", 1:6)
  C <- paste0("C.", 0:5)
  W <- c("W1", "W2")

  lmtp_sub(sim_point_surv, A, Y, W, cens = C, folds = 2,
           shift = static_binary_on, outcome_type = "survival")


lmtp documentation built on July 26, 2023, 5:33 p.m.