tmle: Targeted maximum likelihood estimators (TMLE)

View source: R/tmle.R

tmle    R Documentation

Targeted maximum likelihood estimators (TMLE)

Description

Functions to estimate the parallel trends g-formula using doubly robust targeted maximum likelihood estimators.

Usage

tmle(
  df_obs,
  df_interv,
  den_formula,
  inside_formula_t,
  inside_formula_tmin1,
  outside_formula,
  Tt,
  t_col,
  id,
  n_nested = Tt,
  den_family = "binomial",
  inside_family = "gaussian",
  pt_link_fun = NULL,
  binomial_n = NULL,
  tibble = TRUE,
  models = TRUE,
  long = FALSE,
  suppress_rank_warnings = FALSE
)

Arguments

df_obs

Data frame with one row per individual-period (if long=TRUE) or one row per individual (if long=FALSE)

df_interv

Data frame with same dimensions as df_obs, but with exposure variables set to intervened values

den_formula

chr. glue-style formula for denominator model(s)

inside_formula_t

chr. glue-style right-hand-side formula for inside model for Yt

inside_formula_tmin1

chr. glue-style right-hand-side formula for inside model for Yt-1

outside_formula

chr. glue-style right-hand-side formula for outside models

Tt

int. Maximum number of periods.

t_col

integer vector of length equal to nrow(df_obs). Column denoting periods; only needed if long=TRUE.

id

chr. Name of the column in df_obs corresponding to the unit identifier (only needed if long=TRUE).

n_nested

int. How many nested expectations should be estimated, starting from the innermost (=0) to the outermost (=Tt)?

den_family

stats::family object or string referring to one, as in glm, for denominator model(s).

inside_family

stats::family object or string referring to one, as in glm, for innermost model(s).

pt_link_fun

function. The scale on which parallel trends is assumed (e.g., qlogis for logit scale). Default NULL for untransformed (identity) scale.

binomial_n

int length nrow(data). Group sizes for binomial aggregate data.

tibble

logical. return results as a tibble (TRUE) or vector (FALSE)?

models

lgl. Return all models as an attribute?

long

lgl. Is df_obs wide (FALSE, default) or long (TRUE) format?

suppress_rank_warnings

lgl. Rank-deficient models are often expected in this setting. Option to turn off the warning 'prediction from a rank-deficient fit may be misleading'.
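The formula arguments above are described as glue-style templates. As a rough illustration of what that could mean, the sketch below expands one template for a single period with glue::glue() and converts the result to an ordinary formula. That tmle expands its templates in exactly this way is an assumption, not something this page confirms; den_template, den_string and den_formula_t2 are names made up for the illustration.

```r
# Sketch only: assumes the templates are expanded with glue::glue(),
# which this page does not confirm.
library(glue)

den_template <- "A{t}~A{t-1}*(W1{t}+W2{t}+I(W2{t}^2))"

# Expand the template for period t = 2. Each {...} is evaluated as R code,
# so {t-1} becomes 1 and W1{t} becomes W12 (W1 measured at period 2).
den_string <- as.character(glue(den_template, t = 2))
den_string
# "A2~A1*(W12+W22+I(W22^2))"

# Convert the expanded string into an ordinary R formula.
den_formula_t2 <- as.formula(den_string)
all.vars(den_formula_t2)
```

Under this reading, wide-format columns are expected to carry the period as a suffix (A1, A2, W12, ...), which matches the A1, A2, A3 columns set in the Examples section.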

Value

tibble with Tt rows and 2 columns. Column estimate contains estimates of counterfactual trends from t-1 to t. I.e., these are estimates of g\{E[Y_t(\bar a^*)]\} - g\{E[Y_{t-1}(\bar a^*)]\}, where g\{\cdot\} is the parallel trends link function specified by pt_link_fun.
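To make the estimand concrete, the sketch below computes the trend g\{E[Y_t]\} - g\{E[Y_{t-1}]\} by hand on the identity scale and on the logit scale (g = qlogis). The two means are made-up numbers, not output of tmle, and all variable names are hypothetical.

```r
# Hypothetical counterfactual means at periods t-1 and t (made-up numbers).
mean_tmin1 <- 0.20
mean_t     <- 0.30

# Identity scale (pt_link_fun = NULL): a plain difference in means.
trend_identity <- mean_t - mean_tmin1

# Logit scale (pt_link_fun = qlogis): a difference in log-odds.
trend_logit <- qlogis(mean_t) - qlogis(mean_tmin1)

# The logit-scale trend is the log odds ratio between the two periods.
odds <- function(p) p / (1 - p)
all.equal(trend_logit, log(odds(mean_t) / odds(mean_tmin1)))
```

The same pair of means therefore yields different trend values depending on the link, so estimates are only comparable when they were produced with the same pt_link_fun.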

Examples

library(magrittr)  # provides the %>% pipe used below
Tt = 3
N = 100
Beta = generate_parameters(Tt=Tt)
df_obs = generate_data(N, Tt, Beta)
df_interv = df_obs %>% dplyr::mutate(A1=0, A2=0, A3=0)
tmle(df_obs=df_obs,
    df_interv = df_interv,
    den_formula = 'A{t}~A{t-1}*(W1{t}+W2{t}+I(W2{t}^2))',
    inside_formula_t = '~A{t}*(W1{t}+W2{t}+I(W2{t}^2))',
    inside_formula_tmin1 = '~A{t}*(W1{t-1}+W2{t-1}+I(W2{t-1}^2))',
    outside_formula = '~A{k}*(W1{k}+W2{k}+I(W2{k}^2))',
    Tt=Tt)
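The call above relies on the default den_family and inside_family. Both are documented as accepting a stats::family object or a string "as in glm", so the sketch below shows that convention with stats::glm directly on simulated data. It does not call tmle, and whether tmle forwards these arguments to glm unchanged is an assumption; x, a and the fit_* objects are hypothetical names.

```r
# Simulated binary exposure with one covariate (illustration only).
set.seed(1)
n <- 200
x <- rnorm(n)
a <- rbinom(n, size = 1, prob = plogis(0.5 * x))

# A family given as a string and as a family object fit the same model.
fit_string <- glm(a ~ x, family = "binomial")
fit_object <- glm(a ~ x, family = binomial(link = "logit"))
all.equal(coef(fit_string), coef(fit_object))

# A family object additionally allows a non-default link.
fit_probit <- glm(a ~ x, family = binomial(link = "probit"))
fit_probit$family$link
```

Passing a string is the shorter form; a family object is only needed when the default link for that family is not the one wanted.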

audreyrenson/paralleltrends documentation built on May 4, 2022, 2:53 a.m.