exclude: true class: center, middle

FLAME: Fast, interpretable, and accurate methods for estimating causal effects from observational data


class: inverse, middle, center, hide-logo

The Methodology

library(RefManageR)
BibOptions(check.entries = FALSE,
           bib.style = "authoryear",
           cite.style = "authoryear",
           style = "markdown",
           hyperlink = FALSE,
           dashed = FALSE)
myBib <- ReadBib("./biblio.bibtex", check = FALSE)

Causal Inference

Goal: quantify effect of treatment $T \in {0, 1}$ on outcome $Y$

r NoCite(myBib, 'DAME', 'FLAME')

--

Two potential outcomes for each unit: ${Y_i(1), Y_i(0)}$, denoting response under treatment, control

--

--

--

In observational data, covariates $\mathbf{X} \in \mathbb{R}^p$ might be confounders

??? Can't naively compare the average outcome of control units to estimate a control counterfactual


(Exact) Matching

One approach to estimating counterfactuals under confounding: matching

--

If, for treated unit $i$, there existed control unit $k$ such that $\mathbf{x}_k = \mathbf{x}_i$, then:

-- - $Y_k$ (observed) is a good estimate of $Y_i(0)$ (unobserved)

--

But exact matches are unlikely in high dimensional settings

--


Almost Matching Exactly

Given a unit $i$, covariate weights $\mathbf{w}$, and a covariate selection vector $\boldsymbol{\theta}$, define the AME problem:

$$\overbrace{\text{argmax}{\boldsymbol{\theta} \in {0, 1}^p}\;\boldsymbol{\theta}^T\mathbf{w}}^{\text{most important covariate set}}\quad\text{s.t.}\\quad \exists k\;\:\text{with}\;\: \color{blue}{\underbrace{\mathbf{x}{k} \circ \boldsymbol{\theta} = \mathbf{x}{i} \circ \boldsymbol{\theta}}{\text{exact matching on }\boldsymbol{\theta}}} \;\:\text{and}\;\: \underbrace{T_{k} = 1 -T_i}_{\text{opposite treatment}}$$

Almost Matching Exactly

Given a unit $i$, covariate weights $\mathbf{w}$, and a covariate selection vector $\boldsymbol{\theta}$, define the AME problem:

$$\overbrace{\text{argmax}{\boldsymbol{\theta} \in {0, 1}^p}\;\boldsymbol{\theta}^T\mathbf{w}}^{\text{most important covariate set}}\quad\text{s.t.}\\quad \exists k\;\:\text{with}\;\: \underbrace{\mathbf{x}{k} \circ \boldsymbol{\theta} = \mathbf{x}{i} \circ \boldsymbol{\theta}}{\text{exact matching on }\boldsymbol{\theta}} \;\:\text{and}\;\: \color{blue}{\underbrace{T_{k} = 1 -T_i}_{\text{opposite treatment}}}$$


Almost Matching Exactly

Given a unit $i$, covariate weights $\mathbf{w}$, and a covariate selection vector $\boldsymbol{\theta}$, define the AME problem:

$$\color{blue}{\overbrace{\text{argmax}{\boldsymbol{\theta} \in {0, 1}^p}\;\boldsymbol{\theta}^T\mathbf{w}}^{\text{most important covariate set}}}\quad\text{s.t.}\\quad \exists k\;\:\text{with}\;\: \underbrace{\mathbf{x}{k} \circ \boldsymbol{\theta} = \mathbf{x}{i} \circ \boldsymbol{\theta}}{\text{exact matching on }\boldsymbol{\theta}} \;\:\text{and}\;\: \underbrace{T_{k} = 1 -T_i}_{\text{opposite treatment}}$$

--

Implicitly defines a distance metric that:

  1. Prioritizes matches on relevant covariates

-- 2. Matches exactly when possible

--

Iterate over covariate sets, starting with more important ones

-- exclude: true In practice, don't have $\mathbf{w}$; run ML algorithm on separate holdout set

??? Going to try and solve the AME problem for units. Way this is going to work in practice is that we're going to pick a theta, starting with a theta of all 1s, which corresponds to exact matching -- the best possible thing we can do -- and match all possible units. Then we're going to choose another theta, and match those units. Bc in practice we don't have fixed covariate weights, for each of these thetas, ..


Almost Matching Exactly: The Algorithms

--

DAME (Dynamic Almost Matching Exactly)

--

Solves the AME problem exactly for each unit

--

Efficient solution via downward closure property

--

FLAME (Fast, Large-scale Almost Matching Exactly)

--

Approximates the exact solution via backwards stepwise selection.

--

At each iteration, eliminate an entire covariate

??? Given all this background, it's now very natural and easy to explain two of our methods


Almost Matching Exactly: Dynamic Weights

In practice, don't have $\mathbf{w}$; run ML algorithm on separate holdout set

--

Compute Predictive Error ( $\mathtt{PE}$ ): error using covariate set to predict outcome

--

Determines next covariate set to match on


exclude: true

Almost Matching Exactly: Dynamic Weights

Oftentimes don't have a priori measures of covariate importance

-- exclude: true At every iteration, run ML algorithm on separate holdout set to model how well a covariate set predicts the outcome

-- exclude: true The Predictive Error ( $\mathtt{PE}$ ) measures the error in doing so and determines what covariate set next to match on.


exclude: true

Other Distance Metrics


class: inverse, middle, center, hide-logo

The Package


Overview of FLAME

FLAME and DAME are the workhorses of the package

Match input data under a wide variety of specifications

Efficient bit-vectors routine for making matches

Return S3 objects of class ame with print, plot, and summary methods


Installation

CRAN

install.packages('FLAME')

GitHub

library(devtools)

install_github('https://github.com/vittorioorlandi/FLAME')
# Or (mirror of the above)
install_github('https://github.com/almost-matching-exactly/R-FLAME')

Natality Data

library(FLAME)
natality_out <- readRDS('../natality/natality_out_500k_lm.rds')

US 2010 Natality Data r Citep(myBib, 'natality2010').

Data on neonatal health outcomes in Neonatal Intensive Care Unit (NICU)

Effect of "extreme smoking" ( $\geq 10$ cigarettes a day during pregnancy) on birth weight r Citep(myBib, 'kondracki2020').

Subset of ~500k observations with 16 covariates including sex of infant, races of parents, previous Cesarean deliveries, and others.


Missing Data

missing_data: how missing values in data to be matched are handled

-- - drop: effectively drop units with missingness from the data

-- - impute: impute missing values and match on complete dataset

-- - keep: keep missing values but do not match on them

--

missing_holdout is analogous, with impute and keep options


Computing Predictive Error

Two implemented options for computing $\mathtt{PE}$ - glmnet::cv.glmnet with 5-fold cross-validation (default) - xgboost::xgb.cv with 5-fold cross-validation

Supply your own function:

my_PE_lm <- function(X, Y) {
  df <- as.data.frame(cbind(X, Y = Y))
  return(lm(Y ~ ., df)$fitted.values)
}

Calling FLAME and DAME

Full call to use FLAME to match natality data:

my_PE_lm <- function(X, Y) {
  df <- as.data.frame(cbind(X, Y = Y))
  return(lm(Y ~ ., df)$fitted.values)
}

natality_out <-
  FLAME(data = natality, holdout = 0.25, replace = FALSE,
        treated_column_name = 'smokes10', 
        outcome_column_name = 'dbwt'
        missing_data = 'drop', missing_holdout = 'drop',
        PE_method = my_PE_lm,
        estimate_CATEs = TRUE)

Calling FLAME and DAME

Full call to use FLAME to match natality data:

my_PE_lm <- function(X, Y) {
  df <- as.data.frame(cbind(X, Y = Y))
  return(lm(Y ~ ., df)$fitted.values)
}

natality_out <-
  FLAME(data = natality, holdout = 0.25, replace = FALSE, #<<
        treated_column_name = 'smokes10', 
        outcome_column_name = 'dbwt'
        missing_data = 'drop', missing_holdout = 'drop',
        PE_method = my_PE_lm,
        estimate_CATEs = TRUE)

Calling FLAME and DAME

Full call to use FLAME to match natality data:

my_PE_lm <- function(X, Y) {
  df <- as.data.frame(cbind(X, Y = Y))
  return(lm(Y ~ ., df)$fitted.values)
}

natality_out <-
  FLAME(data = natality, holdout = 0.25, replace = FALSE,
        treated_column_name = 'smokes10', #<<
        outcome_column_name = 'dbwt' #<<
        missing_data = 'drop', missing_holdout = 'drop',
        PE_method = my_PE_lm,
        estimate_CATEs = TRUE)

Calling FLAME and DAME

Full call to use FLAME to match natality data:

my_PE_lm <- function(X, Y) {
  df <- as.data.frame(cbind(X, Y = Y))
  return(lm(Y ~ ., df)$fitted.values)
}

natality_out <-
  FLAME(data = natality, holdout = 0.25, replace = FALSE,
        treated_column_name = 'smokes10', 
        outcome_column_name = 'dbwt' 
        missing_data = 'drop', missing_holdout = 'drop', #<<
        PE_method = my_PE_lm,
        estimate_CATEs = TRUE)

Calling FLAME and DAME

Full call to use FLAME to match natality data:

my_PE_lm <- function(X, Y) {
  df <- as.data.frame(cbind(X, Y = Y))
  return(lm(Y ~ ., df)$fitted.values)
}

natality_out <-
  FLAME(data = natality, holdout = 0.25, replace = FALSE,
        treated_column_name = 'smokes10', 
        outcome_column_name = 'dbwt' 
        missing_data = 'drop', missing_holdout = 'drop',
        PE_method = my_PE_lm #<<
        estimate_CATEs = TRUE) 

Calling FLAME and DAME

Full call to use FLAME to match natality data:

my_PE_lm <- function(X, Y) {
  df <- as.data.frame(cbind(X, Y = Y))
  return(lm(Y ~ ., df)$fitted.values)
}

natality_out <-
  FLAME(data = natality, holdout = 0.25, replace = FALSE,
        treated_column_name = 'smokes10', 
        outcome_column_name = 'dbwt' 
        missing_data = 'drop', missing_holdout = 'drop',
        PE_method = my_PE_lm
        estimate_CATEs = TRUE) #<<

??? Estimate of the treatment effect for units that share certain covariate values


Methods: Print

print(natality_out, linewidth = 60, digits = 1)

Methods: Plot

plot(natality_out, which_plots = 1)

Methods: Plot

plot(natality_out, which_plots = 2)

Methods: Plot

plot(natality_out, which_plots = 3)

Methods: Plot

plot(natality_out, which_plots = 4)

Methods: Summary

# Just bc takes a while; nothing fishy :)
natality_summ <- readRDS('../natality/natality_summary.rds')
(natality_summ <- summary(natality_out))
print(natality_summ)

Examining Matched Groups

```{css, echo=FALSE} pre { background: #FFFFFF; max-width: 100%; overflow-x: scroll; }

```r
op <- options('width' = 250)
high_quality_MG <- 
  MG(natality_summ$MG$highest_quality[1], natality_out)[[1]]

head(high_quality_MG, n = 14)
options(op)

Conclusion

FLAME and DAME are scalable algorithms for observational causal inference

Use ML on a holdout set to learn a distance metric that prioritizes matches on more important covariates

Resulting matched groups are interpretable and high quality

Future work: - Database implementation - Algorithms for mixed data


class: middle, center

Thank You!

.center[] almost-matching-exactly.github.io ???

Documentation, links to papers, Python package


References

PrintBibliography(myBib)


almost-matching-exactly/R-FLAME documentation built on Jan. 4, 2022, 8:31 a.m.