knitr::opts_chunk$set(echo = TRUE)

causalsandwich for estimating causal effects with empirical sandwich variance estimator

Currently implemented as of version 0.0.0.9000: - Estimators of the average treament effect - IPTW - G-formula - Doubly Robust estimator

Why use causalsandwich? Easy implementation with valid inference!

What's geex?

geex is a package designed for easy implementation of estimating equations. The causalsandwich package is powered by geex

What are estimating equations?

See Stefanski and Boos (2002)

How to install

The package can be installed from Github:

devtools::install_github("BarkleyBG/causalsandwich")

Examples

Whenver possible, the causal effect estimate is listed last.

Estimating the Average Treatment Effect

First, some data:

library(causalsandwich)

n <- 200
data <- data.frame(
  Covar1 = rnorm(n),
  Covar2 = rnorm(n)
)
trtprobs <- plogis(0.2  + 0.5*data$Covar1 + 0.1*data$Covar2 + 0.2*data$Covar1 * data$Covar2)
data$BinaryTrt <- rbinom(n, 1, trtprobs)

outprobs <- plogis(0.5  + 0.2*data$Covar1 - 0.2*data$Covar2 + 0.2*data$Covar1 * data$Covar2 -0.2*data$BinaryTrt)
data$BinaryOutcome <- rbinom(n, 1, outprobs)

Defensive Programming

Until rigorous tests and defensive programming are added, users are recommended to use the following coding schemes:

Inverse Probability (of treament) Weighting

Estimating the Average Treatment Effect only takes specifying a few arguments. The below code will estimate propensity scores with the logistic GLM model and estimate the ATE with the Horwitz-Thompson IPTW estimator.

The formula specified is a mixture of two formulas. That is, BinaryTrt ~ Covar1*Covar2 will be passed into the logistic model to fit a model for treatment with main effects and interaction terms with the two covariates. The first section of the multi-part formula indicates that the column named BinaryOutcome in data corresponds to the outcome variable of interest.

formula <- BinaryOutcome | BinaryTrt ~ Covar1 * Covar2

IPTW <- estimateIPTW(
  data = data, 
  formula = formula 
) 

Output includes point estimates and the sandwich variance estimator:

(ests <- IPTW@estimates)
(vcov <- IPTW@vcov) 

The first four parameters correspond to the treatment model parameters. It is the fifth parameter that corresponds to the ATE. Thus, a point estimate and 95\% confidence interval for this parameter are:

param_num <- 5
ests[param_num]
ests[param_num] + stats::qnorm(c(0.025,0.975))*vcov[param_num,param_num]

Since the treatment model is correctly specified, the estimator is consistent (as is the variance estimator).

G-formula

outcome_regression_formula <- BinaryOutcome ~ BinaryTrt + Covar1 * Covar2

GF <- estimateGF(
  data = data,
  treatment_var_name = "BinaryTrt",
  formula = outcome_regression_formula, 
  model_method = "logistic" 
) 

(ests <- GF@estimates)
(vcov <- GF@vcov) 

An estimate and 95\% confidence interval is then

param_num <- 6
ests[param_num]
ests[param_num] + stats::qnorm(c(0.025,0.975))*vcov[param_num,param_num]

Doubly-Robust Estimation

outcome_regression_formula <- BinaryOutcome ~ BinaryTrt + Covar1 * Covar2
treatment_model_formula <- BinaryTrt ~ Covar1 * Covar2

DRIPTW <- estimateDRIPTW(
  data = data,
  outcome_formula = outcome_regression_formula,
  treatment_formula = treatment_model_formula,
  outcome_model_method = "logistic",
  treatment_model_method = "logistic",
  deriv_control = geex::setup_deriv_control(method="simple")
)

(ests <- DRIPTW@estimates)
(vcov <- DRIPTW@vcov)

An estimate and 95\% confidence interval is then

param_num <- 10
ests[param_num]
ests[param_num] + stats::qnorm(c(0.025,0.975))*vcov[param_num,param_num]


BarkleyBG/causalsandwich documentation built on May 28, 2019, 11:36 a.m.