knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
  )
library(kableExtra)

augsynth: The Augmented Synthetic Control Method

Installation

You can install augsynth from github using devtools.

## Install devtools if noy already installed
install.packages("devtools", repos='http://cran.us.r-project.org')
## Install augsynth from github
devtools::install_github("ebenmichael/augsynth")

Example: Effects of the 2012 Kansas Tax Cuts

The data

To show the usage and features of augsynth, we'll use data on the impact of personal income tax cuts in Kansas that comes with the AugSynth package. Our interest is in estimating the effect of income tax cuts on gross state product (GSP) per capita.

library(magrittr)
library(dplyr)
library(augsynth)
data(kansas)

The kansas dataset contains the GSP per capita (the outcome measure) lngdpcapita for all 50 states from the first quarter of 1990 to the first quarter of 2016.

To run augsynth, we need to include a treatment status column that indicates which region was treated and at what time. The table in kansas contains the column treated to denote this. In the original study, the second quarter of 2012 was the implementation of the tax cut in Kansas.

kansas %>% 
  select(year, qtr, year_qtr, state, treated, gdp, lngdpcapita) %>% 
  filter(state == "Kansas" & year_qtr >= 2012 & year_qtr < 2013) 

Synth

Now to find a synthetic control using the entire series of pre-intervention outcomes (and no auxiliary covariates), we can use augsynth. To do so we just need to give augsynth a formula like outcome ~ treatment, tell it what the unit and time variables are, optionally provide when intervention took place (the code will automatically determine this if t_int is not provided), and specify that we don't want to fit an outcome model

library(augsynth)
syn <- augsynth(lngdpcapita ~ treated, fips, year_qtr, kansas,
                progfunc = "None", scm = T)

We can then look at the ATT estimates for each post-intervention time period and overall. We'll also see the quality of the synthetic control fit measured by the L2 distance between Kansas and its synthetic control, and the percent improvement over uniform weights. By default, we'll also see pointwise confidence intervals using a conformal inference procedure.

summary(syn)

The default test statistic is the sum of the absolute treatment efects function(x) sum(abs(x)). We can change the test statistic via the stat_func argument. For instance, if we want to perform a one-way test against postive effects, we can set the test stastic to be the negative sum function(x) -sum(x):

summary(syn, stat_func = function(x) -sum(x))

Or if we want to priotize testing the average post-treatment effect, we can set it to be the absolute sum:

summary(syn, stat_func = function(x) abs(sum(x)))

It's easier to see this information visually. Below we plot the difference between Kansas and it's synthetic control. Before the tax cuts (to the left of the dashed line) we expect these to be close, and after the tax cuts we measure the effect (with point-wise confidence intervals).

plot(syn)

We can also compute point-wise confidence intervals using the Jackknife+ procedure by changing the inf_type argument, although this requires additional assumptions.

plot(syn, inf_type = "jackknife+")

Augmenting synth with an outcome model

In this example the pre-intervention synthetic control fit has an L2 imbalance of 0.083, about 20% of the imbalance between Kansas and the average of the other states. We can reduce this by augmenting synth with ridge regression. To do this we change progfunc to "Ridge". We can also choose the ridge hyper-parameter by setting lambda, while not specifying lambda will determine one through cross validation:

asyn <- augsynth(lngdpcapita ~ treated, fips, year_qtr, kansas,
                progfunc = "Ridge", scm = T)

We can plot the cross-validation MSE when dropping pre-treatment time periods by setting cv = T in the plot function:

plot(asyn, cv = T)

By default, the CV procedure chooses the maximal value of lambda with MSE within one standard deviation of the minimal MSE. To instead choose the lambda that minizes the cross validation MSE, set min_1se = FALSE.

We can look at the summary and plot the results. Now in the summary output we see an estimate of the overall bias of synth; we measure this with the average amount that augmentation changes the synth estimate. Notice that the estimates become somewhat larger in magnitude, and the standard errors are tighter.

summary(asyn)
plot(asyn)

There are also several auxiliary covariates. We can include these in the augmentation by fitting an outcome model using the auxiliary covariates. To do this we simply add the covariates into the formula after |. By default this will create time invariant covariates by averaging the auxiliary covariates over the pre-intervention period, dropping NA values. We can use a custom aggregation function by setting the cov_agg argument. Then the lagged outcomes and the auxiliary covariates are jointly balanced by SCM and the ridge outcome model includes both.

covsyn <- augsynth(lngdpcapita ~ treated | lngdpcapita + log(revstatecapita) +
                                           log(revlocalcapita) + log(avgwklywagecapita) +
                                           estabscapita + emplvlcapita,
                   fips, year_qtr, kansas,
                   progfunc = "ridge", scm = T)

Again we can look at the summary and plot the results.

summary(covsyn)
plot(covsyn)

Now we can additionally fit ridge ASCM on the residuals, look at the summary, and plot the results.

covsyn_resid <- augsynth(lngdpcapita ~ treated | lngdpcapita + log(revstatecapita) +
                                           log(revlocalcapita) + log(avgwklywagecapita) +
                                           estabscapita + emplvlcapita,
                   fips, year_qtr, kansas,
                   progfunc = "ridge", scm = T, lambda = asyn$lambda,
                   residualize = T)
summary(covsyn_resid)
plot(covsyn_resid)

Finally, we can augment synth with many different outcome models. The simplest outcome model is a unit fixed effect model, which we can include by setting fixedeff = T.

desyn <- augsynth(lngdpcapita ~ treated,
                   fips, year_qtr, kansas,
                   progfunc = "none", scm = T,
                   fixedeff = T)
summary(desyn)
plot(desyn)

We can incorproate other outcome models by changing the progfunc. Several outcome models are available, including, fitting the factor model directly with gsynth, general elastic net regression, bayesian structural time series estimation with CausalImpact, and matrix completion with MCPanel. For each outcome model you can supply an optional set of parameters, see documentation for details.



ebenmichael/augsynth documentation built on March 20, 2024, 5:20 a.m.