estADMG: Estimate the average causal effect (ACE) under an ADMG.

View source: R/ADMGtmle.R

estADMG    R Documentation

Description

The main user-facing function of the package. Given a causal graph specified as an acyclic directed mixed graph (ADMG), this function automatically determines the identifiability status of the treatment effect and dispatches to the appropriate estimator:

  • If the treatment is fixable (i.e., backdoor-adjustable), estimation proceeds via .call_backdoor, returning G-computation, IPW, one-step (AIPW), and TMLE estimators.

  • If the treatment is primal fixable (extended front-door functional), estimation proceeds via .call_nps, returning one-step and TMLE estimators.

  • If the treatment is neither fixable nor primal fixable, the function stops with an error.

A message is also printed indicating whether the graph is nonparametrically saturated, in which case the returned estimators are semiparametrically efficient.
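
For intuition about the estimator types named in the fixable case, the following base-R sketch computes G-computation, IPW, and one-step (AIPW) estimates of E{Y(1)} - E{Y(0)} on simulated data. The data-generating model, the variable names, and the parametric lm/glm nuisance fits are assumptions made for illustration; this is not the code behind .call_backdoor.

```r
set.seed(1)
n <- 5000
X <- rnorm(n)                               # single confounder (assumption)
A <- rbinom(n, 1, plogis(0.5 * X))          # binary 0/1 treatment
Y <- 1 + 2 * A + X + rnorm(n)               # simulated so that the true ACE is 2
dat <- data.frame(X, A, Y)

# Nuisance models: outcome regression and propensity score
fitY <- lm(Y ~ A + X, data = dat)
fitA <- glm(A ~ X, family = binomial(), data = dat)

Q1 <- predict(fitY, newdata = transform(dat, A = 1))   # predicted Y under A = 1
Q0 <- predict(fitY, newdata = transform(dat, A = 0))   # predicted Y under A = 0
ps <- predict(fitA, type = "response")                 # estimated P(A = 1 | X)

# G-computation: average the difference in predicted outcomes
gcomp <- mean(Q1 - Q0)

# IPW: weight observed outcomes by inverse treatment probabilities
ipw <- mean(A * Y / ps - (1 - A) * Y / (1 - ps))

# One-step (AIPW): G-computation plus an inverse-weighted residual correction
aipw <- mean(Q1 - Q0 + A * (Y - Q1) / ps - (1 - A) * (Y - Q0) / (1 - ps))

round(c(Gcomp = gcomp, IPW = ipw, Onestep = aipw), 3)
```

All three estimates should land near the simulated effect of 2; TMLE, which updates the outcome fit before plugging in, is not shown here.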

Usage

estADMG(
  a = NULL,
  data = NULL,
  vertices = NULL,
  di_edges = NULL,
  bi_edges = NULL,
  treatment = NULL,
  outcome = NULL,
  multivariate.variables = NULL,
  graph = NULL,
  superlearner.seq = FALSE,
  superlearner.Y = FALSE,
  superlearner.A = FALSE,
  superlearner.M = FALSE,
  superlearner.L = FALSE,
  crossfit = FALSE,
  K = 5,
  ratio.method.L = "bayes",
  ratio.method.M = "bayes",
  dnorm.formula.L = NULL,
  dnorm.formula.M = NULL,
  lib.seq = c("SL.glm", "SL.earth", "SL.ranger", "SL.mean"),
  lib.L = c("SL.glm", "SL.earth", "SL.ranger", "SL.mean"),
  lib.M = c("SL.glm", "SL.earth", "SL.ranger", "SL.mean"),
  lib.Y = c("SL.glm", "SL.earth", "SL.ranger", "SL.mean"),
  lib.A = c("SL.glm", "SL.earth", "SL.ranger", "SL.mean"),
  formulaY = "Y ~ .",
  formulaA = "A ~ .",
  linkY_binary = "logit",
  linkA = "logit",
  n.iter = 500,
  cvg.criteria = 0.01,
  truncate_lower = 0,
  truncate_upper = 1,
  zerodiv.avoid = 0
)

Arguments

a

Numeric scalar or length-two numeric vector specifying the treatment level(s) of interest. The treatment must be coded as 0/1. If a scalar, the function returns E{Y(a)}. If a length-two vector c(a1, a0), the function returns the contrast E{Y(a1)} - E{Y(a0)}.

data

A data frame containing all variables listed in vertices.

vertices

A character vector of variable names in the causal graph. Ignored if graph is provided.

di_edges

A list of length-two character vectors specifying directed edges. For example, list(c('A', 'B')) encodes A -> B. Ignored if graph is provided.

bi_edges

A list of length-two character vectors specifying bidirected edges. For example, list(c('A', 'B')) encodes A <-> B. Ignored if graph is provided.
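
To make the edge-list encoding concrete, the sketch below converts di_edges and bi_edges lists into adjacency matrices using base R. The helper edges_to_adj is hypothetical and written only for this illustration; it is not a function of the package, and the package may represent graphs differently internally.

```r
vertices <- c("A", "M", "L", "Y", "X")
di_edges <- list(c("X", "A"), c("X", "M"), c("X", "L"),
                 c("X", "Y"), c("M", "Y"), c("A", "M"),
                 c("A", "L"), c("M", "L"), c("L", "Y"))
bi_edges <- list(c("A", "Y"))

# Hypothetical helper: turn a list of length-two vectors into an adjacency matrix.
# Rows are edge origins and columns are edge targets.
edges_to_adj <- function(edges, vertices, symmetric = FALSE) {
  adj <- matrix(0L, length(vertices), length(vertices),
                dimnames = list(vertices, vertices))
  for (e in edges) {
    stopifnot(length(e) == 2, all(e %in% vertices))
    adj[e[1], e[2]] <- 1L
    if (symmetric) adj[e[2], e[1]] <- 1L   # bidirected edges have no orientation
  }
  adj
}

D <- edges_to_adj(di_edges, vertices)
B <- edges_to_adj(bi_edges, vertices, symmetric = TRUE)

D["X", "A"]   # 1: c('X', 'A') encodes X -> A
D["A", "X"]   # 0: order matters for directed edges
B["Y", "A"]   # 1: c('A', 'Y') encodes A <-> Y, so order does not matter
```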

treatment

A character string naming the binary (0/1) treatment variable in data.

outcome

A character string naming the outcome variable in data.

multivariate.variables

A named list mapping compound vertex names to their column names in data. For example, list(M = c('M.1', 'M.2')) indicates M is bivariate with columns M.1 and M.2. Ignored if graph is provided.
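
The mapping from compound vertex names to data columns can be pictured with the base-R sketch below, which expands each vertex to its column names and checks that they exist in the data. The simulated data frame and the helper vertex_columns are assumptions for illustration, not package code.

```r
set.seed(2)
n <- 100
dat <- data.frame(X = rnorm(n), A = rbinom(n, 1, 0.5),
                  M.1 = rnorm(n), M.2 = rnorm(n),
                  L = rnorm(n), Y = rnorm(n))

vertices <- c("A", "M", "L", "Y", "X")
multivariate.variables <- list(M = c("M.1", "M.2"))

# Hypothetical helper: a vertex maps to its listed columns, or to itself
vertex_columns <- function(v, mv) if (v %in% names(mv)) mv[[v]] else v

cols <- unlist(lapply(vertices, vertex_columns, mv = multivariate.variables))
cols                                       # "A" "M.1" "M.2" "L" "Y" "X"

missing_cols <- setdiff(cols, names(dat))  # columns the data frame lacks
length(missing_cols)                       # 0
```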

graph

A graph object created by make.graph. If supplied, vertices, di_edges, bi_edges, and multivariate.variables are ignored.

superlearner.seq

Logical. If TRUE, SuperLearner is used for sequential regression of intermediate variables (primal fixable case only). Default is FALSE.

superlearner.Y

Logical. If TRUE, SuperLearner is used for outcome regression. Default is FALSE.

superlearner.A

Logical. If TRUE, SuperLearner is used for propensity score estimation. Default is FALSE.

superlearner.M

Logical. If TRUE, SuperLearner is used for density ratio estimation for variables in M via the Bayes method (primal fixable case only). Default is FALSE.

superlearner.L

Logical. If TRUE, SuperLearner is used for density ratio estimation for variables in L via the Bayes method (primal fixable case only). Default is FALSE.

crossfit

Logical. If TRUE, cross-fitting with K folds is applied to all SuperLearner fits. Default is FALSE.

K

A positive integer specifying the number of cross-fitting folds. Used only when crossfit = TRUE. Default is 5.
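
As a rough picture of what K-fold cross-fitting means, the sketch below fits a propensity model on K - 1 folds and predicts on the held-out fold, so every prediction comes from a model that did not see that observation. It uses glm in place of SuperLearner and a random fold assignment; both are assumptions, and the way the package forms its folds is not documented here.

```r
set.seed(3)
n <- 1000
K <- 5
X <- rnorm(n)
A <- rbinom(n, 1, plogis(X))
dat <- data.frame(X, A)

# Randomly assign each observation to one of K equally sized folds
fold <- sample(rep(seq_len(K), length.out = n))

ps_cf <- numeric(n)
for (k in seq_len(K)) {
  train <- dat[fold != k, ]                 # fit on the other K - 1 folds
  test  <- dat[fold == k, ]                 # predict on the held-out fold
  fit <- glm(A ~ X, family = binomial(), data = train)
  ps_cf[fold == k] <- predict(fit, newdata = test, type = "response")
}

summary(ps_cf)   # out-of-fold propensity score estimates
```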

ratio.method.L

A character string specifying the method for estimating density ratios for variables in L (primal fixable case only). Options are:

"bayes"

(Default) Rewrites the ratio via Bayes' rule as [p(A=a_0|L,mp(L))/p(A=a_1|L,mp(L))] / [p(A=a_0|mp(L))/p(A=a_1|mp(L))] and estimates each factor via logistic regression or SuperLearner.

"dnorm"

Assumes L | mp(L), A is Gaussian (continuous L) or Bernoulli (binary L), estimated via linear or logistic regression with linear terms only.

"densratio"

Uses the densratio package. Supports only numeric/integer variables and is computationally expensive; not recommended for graphs with many variables.
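
The "bayes" and "dnorm" options can be compared on simulated data where the true density ratio is known. In the sketch below, a0 = 0, a1 = 1, a single covariate X stands in for mp(L) without A, and plain glm/lm fits replace SuperLearner; all of these are assumptions for illustration and not the package's implementation.

```r
set.seed(4)
n <- 5000
X <- rnorm(n)                          # stands in for mp(L) without A (assumption)
A <- rbinom(n, 1, plogis(0.5 * X))
L <- A + 0.5 * X + rnorm(n)            # L | X, A is Gaussian with unit variance
dat <- data.frame(X, A, L)

# Truth: p(L | X, A = 0) / p(L | X, A = 1)
r_true <- dnorm(L, 0.5 * X, 1) / dnorm(L, 1 + 0.5 * X, 1)

# "bayes"-style: two logistic regressions for A, with and without L
p1_LX <- predict(glm(A ~ L + X, family = binomial(), data = dat), type = "response")
p1_X  <- predict(glm(A ~ X, family = binomial(), data = dat), type = "response")
r_bayes <- ((1 - p1_LX) / p1_LX) / ((1 - p1_X) / p1_X)

# "dnorm"-style: linear regression of L on A and X, then a ratio of Gaussian densities
fitL <- lm(L ~ A + X, data = dat)
s   <- summary(fitL)$sigma
mu0 <- predict(fitL, newdata = transform(dat, A = 0))
mu1 <- predict(fitL, newdata = transform(dat, A = 1))
r_dnorm <- dnorm(L, mu0, s) / dnorm(L, mu1, s)

# Agreement with the truth on the log scale
round(c(bayes = cor(log(r_bayes), log(r_true)),
        dnorm = cor(log(r_dnorm), log(r_true))), 3)
```

Both approaches recover the ratio well here because the simulation satisfies their modelling assumptions; with non-Gaussian L or nonlinear dependence they can diverge.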

ratio.method.M

A character string specifying the method for estimating density ratios for variables in M (primal fixable case only). Same options as ratio.method.L. Default is "bayes".

dnorm.formula.L

An optional named list of regression formulas for variables in L, used when ratio.method.L = "dnorm". Names are variable names; values are formula strings. Variables omitted from the list are regressed on all Markov pillow variables. For multivariate L, specify one formula per component, e.g. list(L.1 = "L.1 ~ A + X", L.2 = "L.2 ~ A + X + I(M^2)").

dnorm.formula.M

An optional named list of regression formulas for variables in M, used when ratio.method.M = "dnorm". Same structure as dnorm.formula.L.
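
A named list of formula strings like the one described above can be turned into fitted regressions with as.formula, as the base-R sketch below shows. The simulated data and the use of lm are assumptions for illustration; how the package consumes these formulas internally is not shown.

```r
set.seed(5)
n <- 500
dat <- data.frame(X = rnorm(n), A = rbinom(n, 1, 0.5), M = rnorm(n))
dat$L.1 <- 0.5 * dat$A + dat$X + rnorm(n)
dat$L.2 <- dat$A + dat$X + 0.3 * dat$M^2 + rnorm(n)

# One formula string per component of a bivariate L
dnorm.formula.L <- list(L.1 = "L.1 ~ A + X",
                        L.2 = "L.2 ~ A + X + I(M^2)")

# Fit one linear regression per entry; list names carry over to the fits
fits <- lapply(dnorm.formula.L, function(f) lm(as.formula(f), data = dat))

names(fits)          # "L.1" "L.2"
coef(fits$L.2)       # includes a coefficient for I(M^2)
```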

lib.seq

SuperLearner library for sequential regression. Default is c("SL.glm", "SL.earth", "SL.ranger", "SL.mean").

lib.L

SuperLearner library for density ratio estimation for L. Default is c("SL.glm", "SL.earth", "SL.ranger", "SL.mean").

lib.M

SuperLearner library for density ratio estimation for M. Default is c("SL.glm", "SL.earth", "SL.ranger", "SL.mean").

lib.Y

SuperLearner library for outcome regression. Default is c("SL.glm", "SL.earth", "SL.ranger", "SL.mean").

lib.A

SuperLearner library for propensity score estimation. Default is c("SL.glm", "SL.earth", "SL.ranger", "SL.mean").

formulaY

A formula or character string for outcome regression of Y on its Markov pillow. Used only when superlearner.Y = FALSE. Default is "Y ~ .".

formulaA

A formula or character string for propensity score regression of A on its Markov pillow. Used only when superlearner.A = FALSE. Default is "A ~ .".
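
The default "Y ~ ." relies on R's dot expansion, which regresses the response on every other column of the data passed to the fitting function. The sketch below assumes the data are first restricted to the response and its Markov pillow (an assumption about the workflow, not documented package behaviour) and contrasts the default with a custom formula string.

```r
set.seed(7)
n <- 300
dat <- data.frame(X = rnorm(n), A = rbinom(n, 1, 0.5), Z = rnorm(n))
dat$Y <- 1 + dat$A + dat$X + rnorm(n)

pillow_Y <- c("A", "X")              # assumed Markov pillow of Y; Z is left out on purpose
dat_Y <- dat[, c("Y", pillow_Y)]

# Default: the dot expands to all remaining columns, here A and X
fit_dot <- lm(as.formula("Y ~ ."), data = dat_Y)

# Custom formula string adding an interaction
fit_custom <- lm(as.formula("Y ~ A * X"), data = dat_Y)

names(coef(fit_dot))      # "(Intercept)" "A" "X"
names(coef(fit_custom))   # "(Intercept)" "A" "X" "A:X"
```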

linkY_binary

A character string specifying the link function for outcome regression when Y is binary and superlearner.Y = FALSE. Default is "logit".

linkA

A character string specifying the link function for propensity score regression when superlearner.A = FALSE. Default is "logit".

n.iter

Maximum number of TMLE iterations. Default is 500.

cvg.criteria

Numeric. TMLE convergence threshold. The iterative update stops when |mean(D*)| < cvg.criteria, where D* is the estimated efficient influence function. Default is 0.01.
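
The stopping rule can be illustrated with a minimal TMLE loop for E{Y(1)} in a simple backdoor setting with a binary outcome. Everything below (the simulated data, the deliberately crude initial outcome fit, the logistic fluctuation model) is an assumption chosen for illustration; the package's targeting steps, especially for the front-door functional, involve more nuisance components than shown here.

```r
set.seed(6)
n <- 5000
X <- rnorm(n)
A <- rbinom(n, 1, plogis(X))
Y <- rbinom(n, 1, plogis(-0.5 + A + 1.5 * X))
dat <- data.frame(X, A, Y)

ps <- predict(glm(A ~ X, family = binomial(), data = dat), type = "response")
Q_fit <- glm(Y ~ A, family = binomial(), data = dat)   # crude initial fit, omits X on purpose
QA <- predict(Q_fit, type = "response")                                   # fit at observed A
Q1 <- predict(Q_fit, newdata = transform(dat, A = 1), type = "response")  # fit at A = 1
H  <- A / ps                                           # clever covariate for E{Y(1)}

# Sample mean of the estimated efficient influence function D*
eif_mean <- function(QA, Q1) mean(H * (Y - QA) + Q1 - mean(Q1))

n.iter <- 500
cvg.criteria <- 0.01
iter <- 0
while (abs(eif_mean(QA, Q1)) >= cvg.criteria && iter < n.iter) {
  # Logistic fluctuation of the current fit along the clever covariate
  fluct <- glm(Y ~ -1 + H + offset(qlogis(QA)), family = binomial())
  eps <- unname(coef(fluct))
  QA <- plogis(qlogis(QA) + eps * H)
  Q1 <- plogis(qlogis(Q1) + eps / ps)
  iter <- iter + 1
}

psi <- mean(Q1)                      # targeted plug-in estimate of E{Y(1)}
c(iterations = iter, estimate = round(psi, 3))
```

In this simple setting the loop tends to stop after a single update, because the logistic fluctuation solves the relevant score equation directly; n.iter acts only as a safeguard.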

truncate_lower

Numeric. Propensity score values below this threshold are clipped. Default is 0 (no clipping).

truncate_upper

Numeric. Propensity score values above this threshold are clipped. Default is 1 (no clipping).

zerodiv.avoid

Numeric. Density ratio or propensity score values below this threshold are clipped to prevent division by zero. Default is 0 (no clipping).
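
The effect of the three clipping arguments can be sketched with pmin and pmax. The example values and the exact clipping rule (values outside the bounds are set to the bounds) are assumptions for illustration.

```r
# Propensity scores with a few extreme values
ps <- c(0.001, 0.02, 0.5, 0.97, 0.9995)

truncate_lower <- 0.01
truncate_upper <- 0.99
ps_clipped <- pmin(pmax(ps, truncate_lower), truncate_upper)
ps_clipped                  # 0.01 0.02 0.50 0.97 0.99

# Clipping bounds the largest inverse-probability weight
max(1 / ps)                 # about 1000 before clipping
max(1 / ps_clipped)         # about 100 after clipping

# With the defaults (0 and 1) nothing changes
ps_default <- pmin(pmax(ps, 0), 1)

# A small positive floor keeps reciprocals of density ratios finite
ratio <- c(0, 1e-12, 0.3, 2)
zerodiv.avoid <- 1e-8
ratio_safe <- pmax(ratio, zerodiv.avoid)
all(is.finite(1 / ratio_safe))   # TRUE
```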

Value

The return structure depends on the identifiability path:

Fixable (backdoor)

A named list with components TMLE, Onestep, IPW, and Gcomp, plus per-treatment-level sub-lists.

Primal fixable (front-door)

A named list with components TMLE and Onestep.

Examples

# Fixable graph: simple backdoor adjustment
test <- estADMG(
  a = 1,
  data = data_backdoor,
  vertices = c('A', 'Y', 'X'),
  di_edges = list(c('X', 'A'), c('X', 'Y'), c('A', 'Y')),
  treatment = 'A',
  outcome = 'Y'
)

# Primal fixable graph: extended front-door functional
test <- estADMG(
  a = 1,
  data = data_example_a,
  vertices = c('A', 'M', 'L', 'Y', 'X'),
  bi_edges = list(c('A', 'Y')),
  di_edges = list(c('X', 'A'), c('X', 'M'), c('X', 'L'),
                  c('X', 'Y'), c('M', 'Y'), c('A', 'M'),
                  c('A', 'L'), c('M', 'L'), c('L', 'Y')),
  treatment = 'A',
  outcome = 'Y',
  multivariate.variables = list(M = c('M.1', 'M.2'))
)

# ACE on the same primal fixable graph: contrast E{Y(1)} - E{Y(0)} via a = c(1, 0)
test <- estADMG(
  a = c(1, 0),
  data = data_example_a,
  vertices = c('A', 'M', 'L', 'Y', 'X'),
  bi_edges = list(c('A', 'Y')),
  di_edges = list(c('X', 'A'), c('X', 'M'), c('X', 'L'),
                  c('X', 'Y'), c('M', 'Y'), c('A', 'M'),
                  c('A', 'L'), c('M', 'L'), c('L', 'Y')),
  treatment = 'A',
  outcome = 'Y',
  multivariate.variables = list(M = c('M.1', 'M.2'))
)

flexCausal documentation built on March 29, 2026, 5:08 p.m.