This chapter, authored by Ziyi Liu and Yiqing Xu, complements @CLLX2025 (paper, slides). Rivka Lipkovitz also contributed to this tutorial.
Download the R code used in this chapter here.
In recent years, researchers have proposed various heterogeneous treatment effect (HTE) robust estimators for causal panel analysis under parallel trends (PT) as alternatives to traditional two-way fixed effects (TWFE) models. Examples include those proposed by @CDLZ, @sun2021-event, @callaway2021-did, @CDH2020, @IKW2023, @BJS2024, and @LWX2022. These methods are closely connected to the classic difference-in-differences (DID) estimator.
This chapter will guide you through implementing these HTE-robust estimators, as well as TWFE, in R. It will also provide instructions on creating event study plots to display estimated dynamic treatment effects. In the process, we will present a recommended pipeline for analyzing panel data, covering data exploration, estimation, result visualization, and diagnostic tests.
We first illustrate these methods with two empirical examples: @HH2019 (without treatment reversals) and @GS2020 (with treatment reversals). Then, we demonstrate how to implement the sensitivity analysis proposed by @rambachan2023more using the imputation estimator and data from the first example.
To begin, you will need to install the necessary packages from CRAN and GitHub.
```r
# install packages from CRAN
packages <- c("dplyr", "fixest", "did", "didimputation", "panelView",
              "ggplot2", "bacondecomp", "HonestDiD", "DIDmultiplegtDYN",
              "PanelMatch")
install.packages(setdiff(packages, rownames(installed.packages())))

# install most up-to-date "fect" from Github
if ("fect" %in% rownames(installed.packages()) == FALSE) {
  devtools::install_github("xuyiqing/fect")
}

# install forked "HonestDiD" package compatible with "fect"
if ("HonestDiDFEct" %in% rownames(installed.packages()) == FALSE) {
  devtools::install_github("lzy318/HonestDiDFEct")
}
```
Load libraries:
```r
library(dplyr)
library(readstata13)
library(fixest)
library(did)
library(fect)
library(panelView)
library(PanelMatch)
library(ggplot2)
library(bacondecomp)
library(didimputation)
library(doParallel)
library(HonestDiD)
library(HonestDiDFEct)
library(DIDmultiplegtDYN) # may require XQuartz
```
We begin with an empirical example from @HH2019, who investigate the effects of indirect democracy versus direct democracy on naturalization rates in Switzerland using municipality-year panel data from 1991 to 2009. The study finds that switching from direct to indirect democracy increased naturalization rates by an average of 1.22 percentage points (Model 1, Table 1).
```r
data(fect)
data <- hh2019
head(data)
```
First, we examine the evolution of treatment status using the panelView package. The variables `bfs` and `year` represent the unit and time indicators, respectively. The treatment variable is `indirect`, and the outcome is `nat_rate_ord`. In the baseline analysis, we exclude covariates. We set `by.timing = TRUE` to sort units by the timing of treatment receipt.
```r
panelview(nat_rate_ord ~ indirect, data = data, index = c("bfs", "year"),
          xlab = "Year", ylab = "Unit", display.all = TRUE,
          gridOff = TRUE, by.timing = TRUE)
```
The treatment status plot shows that treatment is introduced in different years across municipalities, with no treatment reversals (a staggered adoption setting). Following @sun2021-event, we define units that receive treatment in the same year as a cohort. We can then use panelView to plot the average outcomes for each cohort. Because there are many cohorts, the figure is not as informative as desired.
```r
panelview(data = data, Y = 'nat_rate_ord', D = 'indirect',
          index = c("bfs", "year"), by.timing = TRUE, display.all = TRUE,
          type = "outcome", by.cohort = TRUE)
```
The staggered adoption setup allows us to implement several estimators to obtain treatment effect estimates. We first estimate a two-way fixed effects (TWFE) model, as the authors do in the paper:

$$Y_{it} = \alpha_i + \xi_t + \delta^{TWFE}D_{it} + \epsilon_{it}$$

We can implement this estimator using the `feols` function in the fixest package. The estimated coefficient using TWFE is 1.339, with a standard error of 0.187.
```r
# remember to cluster standard errors
model.twfe.0 <- feols(nat_rate_ord ~ indirect | bfs + year,
                      data = data, cluster = "bfs")
print(model.twfe.0)
```
@goodman2021difference shows that the TWFE estimator in a staggered adoption setting can be expressed as a weighted average of all possible 2×2 DID estimates across different cohorts. However, when treatment effects vary over time across cohorts, forbidden comparisons—where post-treatment data from early adopters serve as controls for late adopters—can introduce bias into the TWFE estimator. To address this, we follow the procedure in @goodman2021difference and decompose the TWFE estimate using the `bacon` function from the bacondecomp package.
```r
data.complete <- data[which(!is.na(data$nat_rate_ord)), ] # bacon requires no missingness in the data
df_bacon <- bacon(nat_rate_ord ~ indirect, data = data.complete,
                  id_var = "bfs", time_var = "year")
ggplot(df_bacon) +
  aes(x = weight, y = estimate, shape = factor(type), color = factor(type)) +
  labs(x = "Weight", y = "Estimate", shape = "Type", color = "Type") +
  geom_point()
print(aggregate(df_bacon$estimate * df_bacon$weight,
                list(df_bacon$type), FUN = sum))
```
The decomposition of the TWFE estimator in a staggered adoption setting, as shown by @goodman2021difference, reveals that the largest contribution to the TWFE estimate comes from DIDs comparing ever-treated cohorts switching into treatment with the never-treated (purple crosses labeled "Treated vs Untreated"). The second-largest contribution comes from DIDs comparing ever-treated cohorts switching into treatment with other ever-treated cohorts still in their pretreatment periods (red dots labeled "Earlier vs Later Treated"), followed by forbidden comparisons between ever-treated cohorts and the always-treated (green triangles labeled "Later vs Always Treated"). The smallest contribution comes from "forbidden" DIDs comparing ever-treated cohorts switching into treatment with other ever-treated cohorts already treated (blue squares labeled "Later vs Earlier Treated").
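In notation, the decomposition expresses the TWFE coefficient as a weighted average of all two-by-two DID estimates; the weighted sums printed above correspond to grouping the terms of this identity by comparison type:

$$\hat{\delta}^{TWFE} \;=\; \sum_{k} s_k\,\hat{\delta}^{2\times 2}_{k}, \qquad s_k \geq 0,\quad \sum_{k}s_k = 1,$$

where each $\hat{\delta}^{2\times 2}_{k}$ is a two-group, two-period DID estimate and the weights $s_k$ depend on group sizes and the within-pair variance of treatment [@goodman2021difference].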
Before applying HTE-robust estimators, we first remove the always-treated units for easier comparison. Then, we create an event study plot using a dynamic specification of TWFE.
```r
# drop always-treated units
df <- as.data.frame(data %>%
                      group_by(bfs) %>%
                      mutate(treatment_mean = mean(indirect, na.rm = TRUE)))
df.use <- df[which(df$treatment_mean < 1), ]

# re-estimate TWFE on this subsample
model.twfe.1 <- feols(nat_rate_ord ~ indirect | bfs + year,
                      data = df.use, cluster = "bfs")
print(model.twfe.1)
```
After removing the always-treated units, the estimated treatment effect increases to 1.609, with a standard error of 0.195.
To create an event study plot, we first generate the cohort index and the relative time period for each observation in relation to treatment. These details are essential for various HTE-robust estimators, such as those in @sun2021-event and @callaway2021-did. We use the `get.cohort` function in fect to generate these indices. To maintain consistency with canonical TWFE regressions, we set `start0 = TRUE`, designating period 0 as the first post-treatment period and period -1 as the last pre-treatment period.
```r
df.use <- get.cohort(df.use, D = "indirect", index = c("bfs", "year"),
                     start0 = TRUE)
head(df.use[, -5], 19)
```
The column `FirstTreat` indicates the first period when a unit receives treatment (set to `NA` for never-treated units). The column `Time_to_Treatment` represents the relative period to treatment onset (set to `NA` for always-treated and never-treated units). The period `-1` corresponds to the last pre-treatment period.
Now, we run a dynamic TWFE regression, also known as a model with "leads and lags." This is the standard approach for estimating dynamic treatment effects in event studies. The model includes a series of interaction terms between a dummy indicating whether a unit is treated and each lead and lag indicator relative to treatment in a TWFE regression. This specification allows effects to vary over time, typically using the period immediately before treatment as the reference period. If the regression is fully saturated and there is no HTE across cohorts, the dynamic TWFE model can consistently estimate the dynamic treatment effect.
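Schematically, the dynamic specification takes the form

$$Y_{it} = \alpha_i + \xi_t + \sum_{k\neq -1}\delta_k \,\mathbf{1}\{K_{it}=k\}\cdot Treat_i + \epsilon_{it},$$

where $Treat_i$ indicates whether unit $i$ is ever treated, $K_{it}$ is the period relative to treatment onset, and $k=-1$ (the last pre-treatment period) serves as the omitted reference category.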
We use `feols` to implement this estimator. The column `Time_to_Treatment` can serve as the lead/lag indicator (we need to replace the `NA` values in `Time_to_Treatment` with an arbitrary number). Note that the period `-1` in `Time_to_Treatment` corresponds to the last pre-treatment period and is the reference period.
```r
# dynamic TWFE
df.twfe <- df.use # always-treated units already dropped
df.twfe$treat <- as.numeric(df.twfe$treatment_mean > 0)
df.twfe[which(is.na(df.twfe$Time_to_Treatment)), 'Time_to_Treatment'] <- 0 # can be an arbitrary value
twfe.est <- feols(nat_rate_ord ~ i(Time_to_Treatment, treat, ref = -1) | bfs + year,
                  data = df.twfe, cluster = "bfs")
twfe.output <- as.matrix(twfe.est$coeftable)
print(round(twfe.output, 3))
```
fect includes a plotting function, `esplot()`, to create an event study plot based on a vector of point estimates and estimated standard errors. Here, we use this function to examine dynamic effects over a 13-period pre-treatment span and a 10-period post-treatment span. The data reveal a significant treatment effect, along with a subtle pre-trend discrepancy in several periods preceding treatment. Notably, `esplot()` treats period 0 as the last pre-treatment period by default, so the period index must be shifted by one.
```r
twfe.output <- as.data.frame(twfe.output)
twfe.output$Time <- c(c(-18:-2), c(0:17)) + 1
p.twfe <- esplot(twfe.output, Period = 'Time', Estimate = 'Estimate',
                 SE = 'Std. Error', xlim = c(-12, 10))
p.twfe
```
Alternatively, by setting `start0 = TRUE` in `esplot()`, we designate period 0 as the first post-treatment period, making period -1 the last pre-treatment period. As a result, there is no need to shift the period index, as it already aligns with the conventional TWFE framework.
```r
twfe.output <- as.data.frame(twfe.est$coeftable)
twfe.output$Time <- c(c(-18:-2), c(0:17))
p.twfe <- esplot(twfe.output, Period = 'Time', Estimate = 'Estimate',
                 SE = 'Std. Error', xlim = c(-12, 10), start0 = TRUE)
p.twfe
```
@CDLZ examine the effect of variations in minimum wage on low-wage employment across 138 state-level minimum wage changes in the United States between 1979 and 2016, using a stacked DID approach. A similar estimator is also implemented in the Stata package STACKEDEV [@STACKEDEV]. See a simple illustration below.
The stacked DID method removes the bias caused by forbidden comparisons identified in the Goodman-Bacon decomposition. However, it returns a variance-weighted average of cohort-specific effects, with implicit weights chosen by OLS, rather than the ATT itself.
```r
df.st <- NULL
target.cohorts <- setdiff(unique(df.use$Cohort), "Control")
k <- 1
for (cohort in target.cohorts) {
  df.sub <- df.use[which(df.use$Cohort %in% c(cohort, "Control")), ]
  df.sub$stack <- k
  df.st <- rbind(df.st, df.sub)
  k <- k + 1
}
df.st$st_unit <- as.numeric(factor(paste0(df.st$stack, '-', df.st$bfs)))
df.st$st_year <- as.numeric(factor(paste0(df.st$stack, '-', df.st$year)))
model.st <- feols(nat_rate_ord ~ indirect | st_unit + st_year,
                  data = df.st, cluster = "st_unit")
print(model.st)
```
We then make an event study plot using a dynamic specification.
```r
df.st$treat <- as.numeric(df.st$treatment_mean > 0)
df.st[which(is.na(df.st$Time_to_Treatment)), 'Time_to_Treatment'] <- 1000 # this value can be arbitrary
st.est <- feols(nat_rate_ord ~ i(Time_to_Treatment, treat, ref = -1) | st_unit + st_year,
                data = df.st, cluster = "st_unit")
# make plot
st.output <- as.data.frame(st.est$coeftable)
st.output$Time <- c(c(-18:-2), c(0:17)) + 1
p.st <- esplot(st.output, Period = 'Time', Estimate = 'Estimate',
               SE = 'Std. Error', xlim = c(-12, 10))
p.st
```
The stacked DID plot closely resembles that of the TWFE model, suggesting that potential bias from forbidden comparisons is minimal.
@sun2021-event propose an interaction-weighted (IW) estimator for the ATT. The IW estimator computes a weighted average of ATT estimates for each cohort, obtained from a TWFE regression where cohort dummies are fully interacted with indicators of relative time to treatment onset. See a simple illustration below.
```r
knitr::include_graphics("fig/fig_iw.png")
```
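Schematically, the IW estimator for relative period $\ell$ aggregates cohort-specific effects, weighting each cohort by its share among the cohorts observed at that relative period (a sketch of the aggregation in @sun2021-event):

$$\widehat{ATT}^{IW}_{\ell} \;=\; \sum_{e}\widehat{\Pr}\bigl(E_i = e \mid E_i \in \mathcal{E}_\ell\bigr)\,\widehat{CATT}_{e,\ell},$$

where $E_i$ is unit $i$'s treatment-onset cohort, $\widehat{CATT}_{e,\ell}$ is the cohort-$e$ effect at relative period $\ell$ from the fully interacted regression, and $\mathcal{E}_\ell$ is the set of cohorts observed at relative period $\ell$.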
The IW estimator is HTE-robust and can be implemented using the `sunab` function in the fixest package. The column `FirstTreat` serves as the cohort indicator, with any missing values replaced by an arbitrary number. The estimated ATT using the IW estimator is 1.331, with a standard error of 0.288.
```r
df.sa <- df.use
df.sa[which(is.na(df.sa$FirstTreat)), "FirstTreat"] <- 1000 # replace NA with an arbitrary number
model.sa.1 <- feols(nat_rate_ord ~ sunab(FirstTreat, year) | bfs + year,
                    data = df.sa, cluster = "bfs")
summary(model.sa.1, agg = "ATT")
```
The estimation results are stored in `coeftable` for plotting. We can make an event study plot as before. The results are quite similar to those from TWFE and stacked DID.
```r
sa.output <- as.data.frame(as.matrix(model.sa.1$coeftable))
sa.output$Time <- c(c(-18:-2), c(0:17)) + 1
p.sa <- esplot(sa.output, Period = 'Time', Estimate = 'Estimate',
               SE = 'Std. Error', xlim = c(-12, 10))
p.sa
```
@callaway2021-did propose a doubly robust estimator, known as CSDID, which incorporates pre-treatment covariates by using either never-treated or not-yet-treated units as the comparison group. Because there are no time-invariant covariates in this application, we use only the outcome model (rather than the full doubly robust estimator) to estimate the ATT and dynamic treatment effects.
First, we use the never-treated units as the comparison group. The column `FirstTreat` serves as the cohort indicator, with missing values replaced by 0. The estimated ATT is numerically identical to the IW estimator. Additionally, we set `bstrap = FALSE` to report the analytical standard error, which is 0.302.
```r
df.cs <- df.use
df.cs[which(is.na(df.cs$FirstTreat)), "FirstTreat"] <- 0 # replace NA with 0
cs.est.1 <- att_gt(yname = "nat_rate_ord", gname = "FirstTreat",
                   idname = "bfs", tname = "year", xformla = ~1,
                   control_group = "nevertreated",
                   allow_unbalanced_panel = TRUE,
                   data = df.cs, est_method = "reg")
cs.est.att.1 <- aggte(cs.est.1, type = "simple", na.rm = TRUE, bstrap = FALSE)
print(cs.est.att.1)
```
Using the did package, we can use the `aggte` function to aggregate cohort-period average treatment effects. We set `bstrap = FALSE` to report analytical standard errors and `cband = FALSE` to display period-wise confidence intervals, allowing for easy comparison with other estimators. While `aggte` can compute uniform confidence bands by setting both `cband` and `bstrap` to `TRUE`, we opt to report period-wise intervals in this case.
```r
cs.att.1 <- aggte(cs.est.1, type = "dynamic", bstrap = FALSE,
                  cband = FALSE, na.rm = TRUE)
print(cs.att.1)
```
The estimated dynamic treatment effects in the post-treatment period are numerically identical to those from the IW estimator. However, pre-treatment estimates may differ because csdid defaults to using the preceding period as the reference point in pre-treatment periods.
```r
cs.output <- cbind.data.frame(Estimate = cs.att.1$att.egt,
                              SE = cs.att.1$se.egt,
                              time = cs.att.1$egt + 1)
p.cs.1 <- esplot(cs.output, Period = 'time', Estimate = 'Estimate',
                 SE = 'SE', xlim = c(-12, 10))
p.cs.1
```
We prefer to set `base_period = "universal"`, which designates the last pre-treatment period as the reference period. This produces results that closely align with the IW estimator.
```r
cs.est.1.u <- att_gt(yname = "nat_rate_ord", gname = "FirstTreat",
                     idname = "bfs", tname = "year", xformla = ~1,
                     control_group = "nevertreated",
                     allow_unbalanced_panel = TRUE,
                     data = df.cs, est_method = "reg",
                     base_period = "universal")
cs.att.1.u <- aggte(cs.est.1.u, type = "dynamic", bstrap = FALSE,
                    cband = FALSE, na.rm = TRUE)
cs.output.u <- cbind.data.frame(Estimate = cs.att.1.u$att.egt,
                                SE = cs.att.1.u$se.egt,
                                time = cs.att.1.u$egt + 1)
p.cs.1.u <- esplot(cs.output.u, Period = 'time', Estimate = 'Estimate',
                   SE = 'SE', xlim = c(-12, 10))
p.cs.1.u
```
One advantage of CSDID is the ability to use not-yet-treated units as the comparison group by setting `control_group = "notyettreated"`. See a simple illustration below.
```r
knitr::include_graphics("fig/fig_cs.png")
```
The choice of the comparison group depends on the specific version of the parallel trends assumption that researchers aim to justify. The estimated ATT is 1.292, with a standard error of 0.308.
```r
cs.est.2 <- att_gt(yname = "nat_rate_ord", gname = "FirstTreat",
                   idname = "bfs", tname = "year", xformla = ~1,
                   control_group = "notyettreated",
                   allow_unbalanced_panel = TRUE,
                   data = df.cs, est_method = "reg")
cs.est.att.2 <- aggte(cs.est.2, type = "simple", na.rm = TRUE, bstrap = FALSE)
print(cs.est.att.2)
```
We can calculate the dynamic treatment effects using the same method and then use `esplot` to make the event study plot. Note that we set the reference period to be the last pre-treatment period.
```r
cs.est.2.u <- att_gt(yname = "nat_rate_ord", gname = "FirstTreat",
                     idname = "bfs", tname = "year", xformla = ~1,
                     control_group = "notyettreated",
                     allow_unbalanced_panel = TRUE,
                     data = df.cs, est_method = "reg",
                     base_period = "universal")
cs.att.2.u <- aggte(cs.est.2.u, type = "dynamic", bstrap = FALSE,
                    cband = FALSE, na.rm = TRUE)
# plot
cs.output.u <- cbind.data.frame(Estimate = cs.att.2.u$att.egt,
                                SE = cs.att.2.u$se.egt,
                                time = cs.att.2.u$egt + 1)
p.cs.2.u <- esplot(cs.output.u, Period = 'time', Estimate = 'Estimate',
                   SE = 'SE', xlim = c(-12, 10))
p.cs.2.u
```
@de2020two and @de2024difference introduce a method for multi-group, multi-period designs that accommodate a potentially non-binary treatment, which can change (increase or decrease) multiple times. For details, see @de2024event. Here, we demonstrate only the basic usage for a binary, staggered treatment.
When calling `did_multiplegt_dyn`, specify the outcome, treatment, unit index (`group`), and time index (`time`). The `effects` option determines how many event-study coefficients to estimate. For each period $l$, the event-study effect represents the average change in the outcome for all switchers after receiving the treatment for $l$ periods, compared to remaining at their initial treatment level. The `placebo` option specifies how many placebo coefficients to compute. For each placebo period $k$, the command compares the outcome evolution of switchers and their controls $k$ periods before the switchers' first treatment change.
See a simple illustration below with $l = 2$ (number of lags, capturing post-treatment effects) and $k = 2$ (number of leads, capturing pre-treatment placebo effects).
```r
knitr::include_graphics("fig/fig_pm.png")
```
In the following example, we set $l = 12$ and $k = 9$.
```r
didm.results <- did_multiplegt_dyn(
  df = df.use,
  outcome = "nat_rate_ord",
  group = "bfs",
  controls = NULL,
  time = "year",
  treatment = "indirect",
  effects = 12,
  placebo = 9,
  cluster = "bfs",
  graph_off = TRUE
)
print(didm.results)
```
Again, we make an event study plot using `esplot`.
```r
T.post <- dim(didm.results$results$Effects)[1]
T.pre <- dim(didm.results$results$Placebos)[1]
didm.vis <- rbind(didm.results$results$Placebos, didm.results$results$Effects)
didm.vis <- as.data.frame(didm.vis)
didm.vis[, 'Time'] <- c(c(-1:-(T.pre)), c(1:T.post))
est.dynamic <- didm.vis[, c(9, 1, 2, 3, 4)]
colnames(est.dynamic) <- c("T", "estimate", "se", "lb", "ub")
p.didm <- esplot(est.dynamic, Period = 'T', Estimate = 'estimate',
                 SE = 'se', xlim = c(-12, 9))
p.didm
```
@IKW2023 propose a matching method, similar in spirit to DIDm, for panel data analysis. It allows researchers to match each treated observation for a given unit in a particular period with untreated observations from other units in the same period that have similar treatment, outcome, or covariate histories. Their package is called PanelMatch.
The `PanelMatch` function in PanelMatch creates matched sets. The option `lag = 3` specifies that the pre-treatment history should span three periods, while `lead = c(0:3)` sets the lead window to include four post-treatment periods. Note that the terms lead and lag here are used in the opposite sense of their traditional usage in a dynamic TWFE specification, where lags correspond to post-treatment effects and leads correspond to pre-treatment placebo effects.

To assign equal weights to all control units in each matched set, we set `refinement.method = "none"` (without matching on controls). Notably, by matching on treatment history and defining the lead window, `PanelMatch` uses only the subset of treated units with three pre-treatment periods and four post-treatment periods to compute the average treatment effects.
The PanelMatch estimator is equivalent to the DIDmultiple estimator proposed by @CDH2020 if we match only on the last pre-treatment period and estimate only the treatment effect at the first post-treatment period (a sketch of this equivalence appears after the ATT estimate below).
```r
df.pm <- df.use
# we need to convert the unit and time indicators to integers
df.pm[, "bfs"] <- as.integer(as.factor(df.pm[, "bfs"]))
df.pm[, "year"] <- as.integer(as.factor(df.pm[, "year"]))
df.pm <- df.pm[, c("bfs", "year", "nat_rate_ord", "indirect")]

# pre-process and balance panel data
df.pm <- PanelData(panel.data = df.pm, unit.id = "bfs", time.id = "year",
                   treatment = "indirect", outcome = "nat_rate_ord")

PM.results <- PanelMatch(lag = 3, refinement.method = "none",
                         panel.data = df.pm, qoi = "att", lead = c(0:3),
                         match.missing = TRUE)

## for pre-treatment dynamic effects
PM.results.placebo <- PanelMatch(lag = 3, refinement.method = "none",
                                 panel.data = df.pm, qoi = "att", lead = c(0:3),
                                 match.missing = TRUE, placebo.test = TRUE)
```
We can estimate the ATT and dynamic treatment effects using the function `PanelEstimate`. To obtain the ATT, we set the option `pooled = TRUE`. The standard error is calculated using a block bootstrap, which resamples entire units; the standard bootstrap is not valid for matching estimators [@abadie2008failure].
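For intuition, the sketch below illustrates the idea of a block (unit-level) bootstrap in a generic form; `est_att()` is a hypothetical placeholder for a function that returns a scalar estimate, and the resampling scheme mirrors the general idea rather than PanelMatch's internal implementation.

```r
# conceptual sketch: unit-level (block) bootstrap standard error
# `dat` is a long panel, `id` names the unit identifier column, and
# est_att() is a hypothetical user-supplied estimator of the ATT
block_boot_se <- function(dat, id, est_att, B = 200) {
  units <- unique(dat[[id]])
  boot.est <- replicate(B, {
    draw <- sample(units, length(units), replace = TRUE)
    boot.dat <- do.call(rbind, lapply(seq_along(draw), function(i) {
      d <- dat[dat[[id]] == draw[i], ]
      d[[id]] <- i # relabel so repeated draws count as distinct units
      d
    }))
    est_att(boot.dat)
  })
  sd(boot.est)
}
```

We now compute the pooled ATT with `PanelEstimate`: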
```r
# ATT
PE.results.pool <- PanelEstimate(PM.results, panel.data = df.pm,
                                 pooled = TRUE)
summary(PE.results.pool)
```
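As a quick check of the equivalence with the DIDmultiple estimator noted above (a sketch, not part of the original analysis), we can match only on the last pre-treatment period and estimate the effect only at the first post-treatment period:

```r
# match on a single pre-treatment period (lag = 1) and estimate only
# the first post-treatment effect (lead = 0)
PM.didm <- PanelMatch(lag = 1, refinement.method = "none",
                      panel.data = df.pm, qoi = "att", lead = 0,
                      match.missing = TRUE)
PE.didm <- PanelEstimate(PM.didm, panel.data = df.pm)
summary(PE.didm)
```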
We can also use the function `PanelEstimate` to estimate the dynamic treatment effects in post-treatment periods. Additionally, for pre-treatment periods, PanelMatch offers the function `placebo_test` to compute the change in outcomes for each pre-treatment period in the lag window compared to the last pre-treatment period.
```r
# dynamic treatment effects
PE.results <- PanelEstimate(PM.results, panel.data = df.pm)
PE.results.placebo <- placebo_test(PM.results.placebo, panel.data = df.pm,
                                   plot = FALSE)

# obtain lead and lag (placebo) estimates
est_lead <- as.vector(PE.results$estimate)
est_lag <- as.vector(PE.results.placebo$estimates)
sd_lead <- apply(PE.results$bootstrapped.estimates, 2, sd)
sd_lag <- apply(PE.results.placebo$bootstrapped.estimates, 2, sd)
coef <- c(est_lag, 0, est_lead)
sd <- c(sd_lag, 0, sd_lead)
pm.output <- cbind.data.frame(ATT = coef, se = sd, t = c(-2:4))

# plot
p.pm <- esplot(data = pm.output, Period = 't', Estimate = 'ATT', SE = 'se')
p.pm
```
Now, we return to the imputation method, which fect is originally designed to implement. See a simple illustration below. For details, see @sec-fect.
```r
knitr::include_graphics("fig/fig_fect.png")
```
For a fair comparison, we implement this estimator—the fixed effects counterfactual (fect) estimator—based on a TWFE model. The estimated ATT is 1.506, with a standard error of 0.197.
```r
out.fect <- fect(nat_rate_ord ~ indirect, data = df, index = c("bfs", "year"),
                 method = 'fe', se = TRUE)
print(out.fect$est.avg)
```
The `fect` function stores the estimated dynamic treatment effects obtained using the counterfactual method in the object `est.att`. Uncertainty estimates are derived through a nonparametric clustered bootstrap or jackknife procedure.
```r
fect.output <- as.matrix(out.fect$est.att)
head(fect.output)
```
@BJS2024 also provide the didimputation package to estimate the ATT using the same approach as @LWX2022.
```r
df.impute <- df.use
df.impute[which(is.na(df.impute$FirstTreat)), "FirstTreat"] <- 0 # replace NA with 0
out.impute <- did_imputation(data = df.impute, yname = "nat_rate_ord",
                             gname = "FirstTreat", tname = "year",
                             idname = "bfs", cluster_var = "bfs")
out.impute
```
We can use `esplot()` to create the event study plot as before. Additionally, the fect package offers more advanced visualization tools specifically designed for imputation estimators (see the previous chapters and the next chapter).
```r
fect.output <- as.data.frame(fect.output)
fect.output$Time <- c(-17:18)
p.fect <- esplot(fect.output, Period = 'Time', Estimate = 'ATT',
                 SE = 'S.E.', CI.lower = "CI.lower", CI.upper = 'CI.upper',
                 xlim = c(-12, 10))
p.fect
```
The didimputation package also supports event study estimation, producing numerically equivalent estimates for dynamic treatment effects in the post-treatment period. However, `didimputation` uses a distinct TWFE regression applied only to untreated observations to obtain pre-treatment estimates and test for parallel trends. In this case, we specify `pretrends = c(-13:-1)` to compute regression coefficients for pre-treatment periods ranging from -13 to -1.
```r
model.impute <- did_imputation(data = df.impute, yname = "nat_rate_ord",
                               gname = "FirstTreat", tname = "year",
                               idname = "bfs", cluster_var = "bfs",
                               pretrends = c(-13:-1), horizon = TRUE)
model.impute$term <- as.numeric(model.impute$term) + 1 # set 1 as the first post-treatment period

# plot
to_plot <- as.data.frame(model.impute)
esplot(data = to_plot, Period = "term", Estimate = 'estimate',
       SE = 'std.error', xlim = c(-12, 10))
out.impute
```
To compare the results with those obtained using the PanelMatch method, fect provides the option to compute the ATT for the subset of units with three pre-treatment periods and four post-treatment periods. This can be done by setting `balance.period = c(-2, 4)`, which includes periods -2, -1, and 0 for pre-treatment and 1, 2, 3, and 4 for post-treatment.
```r
out.fect.balance <- fect(nat_rate_ord ~ indirect, data = df,
                         index = c("bfs", "year"), method = 'fe', se = TRUE,
                         balance.period = c(-2, 4))
# att
print(out.fect.balance$est.balance.avg)

# event study plot
fect.balance.output <- as.data.frame(out.fect.balance$est.balance.att)
fect.balance.output$Time <- c(-2:4)
p.fect.balance <- esplot(fect.balance.output, Period = 'Time',
                         Estimate = 'ATT', SE = 'S.E.',
                         CI.lower = "CI.lower", CI.upper = 'CI.upper')
p.fect.balance
```
We will come back to this empirical example when we implement the sensitivity analysis in [Chapter @sec-panel-sens].
We use data from @GS2020 to demonstrate methods that can accommodate treatment reversals, where treatment can switch on and off. These methods include TWFE, PanelMatch, and the imputation estimator.
In the original paper, the authors use district by election-cycle panel data from U.S. House general elections between 1980 and 2012, arguing that the presence of Asian (Black/Latino) candidates in general elections increases the proportion of campaign contributions from Asian (Black/Latino) donors. Here, we focus specifically on the effects of Asian candidates, as shown in the top left panel of Figure 5 in the paper.
```r
data(fect)
data <- gs2020
data$cycle <- as.integer(as.numeric(data$cycle / 2))
head(data)
```
In our analysis, `district_final` and `cycle` serve as the unit and time indices, respectively. The treatment variable, `cand_A_all`, indicates the presence of an Asian candidate, while the outcome variable, `general_sharetotal_A_all`, represents the proportion of donations from Asian donors. The covariates included are `cand_H_all`, indicating a Hispanic candidate, and `cand_B_all`, indicating a Black candidate.
As usual, we use the panelView package to examine treatment conditions and missing values in the data. We set `by.timing = TRUE` to organize units by treatment timing. Below, we can see that this dataset has missing values in the outcome variable, and that the treatment may switch on and off.
```r
y <- "general_sharetotal_A_all"
d <- "cand_A_all"
unit <- "district_final"
time <- "cycle"
controls <- c("cand_H_all", "cand_B_all")
index <- c("district_final", "cycle")

panelview(Y = y, D = d, X = controls, index = index, data = data,
          xlab = "Time Period", ylab = "Unit", gridOff = TRUE,
          by.timing = TRUE, cex.legend = 5, cex.axis = 5,
          cex.main = 10, cex.lab = 5)
```
First, we estimate the effect of an Asian candidate on the proportion of Asian donations using the TWFE estimator. The estimated coefficient is 0.128, with a standard error of 0.025, clustered at the district level.
```r
model.twfe <- feols(general_sharetotal_A_all ~ cand_A_all + cand_H_all +
                      cand_B_all | district_final + cycle,
                    data = data, cluster = "district_final")
summary(model.twfe)
```
To estimate dynamic treatment effects using TWFE, we first use the `get.cohort` function to determine the relative period to treatment. Next, we generate the treated unit indicator `treat` using the following steps, implemented in the code below:

- Set `treat = 0` for observations of units that have never been treated.
- Set `treat = 1` only for observations before the last treatment exit. For example, if a unit's treatment status sequence is `0, 0, 0, 1, 1, 0, 0`, we set `treat = 1` only for the first five observations. This approach assumes no carryover effects, meaning treatment affects only the current period.
- Interact `treat` with the relative period index in a two-way fixed effects regression to estimate dynamic treatment effects.

```r
data_cohort <- get.cohort(data, index = index, D = d, start0 = TRUE)

# generate a dummy variable treat
data_cohort$treat <- 0
data_cohort[which(data_cohort$Cohort != 'Control'), 'treat'] <- 1
data_cohort[which(is.na(data_cohort$Time_to_Treatment)), "treat"] <- 0

# remove observations that start with treated status
remove <- intersect(which(is.na(data_cohort$Time_to_Treatment)),
                    which(data_cohort[, d] == 1))
if (length(remove) > 0) {data_cohort <- data_cohort[-remove, ]}

# replace missingness in Time_to_Treatment with an arbitrary number
data_cohort[which(is.na(data_cohort$Time_to_Treatment)), "Time_to_Treatment"] <- 999

twfe.est <- feols(general_sharetotal_A_all ~ i(Time_to_Treatment, treat, ref = -1) +
                    cand_H_all + cand_B_all | district_final + cycle,
                  data = data_cohort, cluster = "district_final")
```
We then visualize the estimated dynamic treatment effects.
```r
twfe.output <- as.data.frame(twfe.est$coeftable[c(1:25), ])
twfe.output$Time <- c(c(-16:-2), c(0:9)) + 1

# plot
p.twfe <- esplot(twfe.output, Period = 'Time', Estimate = 'Estimate',
                 SE = 'Std. Error', xlim = c(-15, 1))
p.twfe
```
PanelMatch can handle treatment reversals but uses only a subset of the data based on the specified numbers of leads and lags. DIDmultiple, proposed by @CDH2020, can also accommodate treatment reversals. However, its current implementation in R, DIDmultiplegtDYN, focuses only on the first switch from the control condition to the treatment condition for each unit. Therefore, we use PanelMatch to illustrate how matching methods work in this more general setting.
In this case, we use a subset of treated units with four pre-treatment periods and one post-treatment period to estimate the average treatment effects.
```r
df.pm <- data_cohort
# we need to convert the unit and time indicators to integers
df.pm[, "district_final"] <- as.integer(as.factor(df.pm[, "district_final"]))
df.pm[, "cycle"] <- as.integer(as.factor(df.pm[, "cycle"]))
df.pm <- df.pm[, c("district_final", "cycle", "cand_A_all",
                   "general_sharetotal_A_all")]

# pre-process and balance panel data
df.pm <- PanelData(panel.data = df.pm, unit.id = "district_final",
                   time.id = "cycle", treatment = "cand_A_all",
                   outcome = "general_sharetotal_A_all")

PM.results <- PanelMatch(lag = 4, refinement.method = "none",
                         panel.data = df.pm, qoi = "att", lead = 0,
                         match.missing = TRUE)

## for pre-treatment dynamic effects
PM.results.placebo <- PanelMatch(lag = 4, refinement.method = "none",
                                 panel.data = df.pm, qoi = "att", lead = 0,
                                 match.missing = TRUE, placebo.test = TRUE)
```
The ATT estimate is 0.124, with a standard error of 0.023.
```r
PE.results.pool <- PanelEstimate(PM.results, panel.data = df.pm, pooled = TRUE)
summary(PE.results.pool)
```
We obtain the dynamic treatment effects using the same approach as demonstrated in the case of @HH2019, using the `esplot` function.
```r
# dynamic treatment effects
PE.results <- PanelEstimate(PM.results, panel.data = df.pm)
PE.results.placebo <- placebo_test(PM.results.placebo, panel.data = df.pm,
                                   plot = FALSE)
est_lead <- as.vector(PE.results$estimate)
est_lag <- as.vector(PE.results.placebo$estimates)
sd_lead <- apply(PE.results$bootstrapped.estimates, 2, sd)
sd_lag <- apply(PE.results.placebo$bootstrapped.estimates, 2, sd)
coef <- c(est_lag, 0, est_lead)
sd <- c(sd_lag, 0, sd_lead)
pm.output <- cbind.data.frame(ATT = coef, se = sd, t = c(-3:1))

# plot
p.pm <- esplot(data = pm.output, Period = 't', Estimate = 'ATT', SE = 'se')
p.pm
```
Now we return to the imputation method proposed by @BJS2024 and @LWX2022. The estimated ATT is 0.127, with a standard error of 0.025. Both are very close to the TWFE estimates.
```r
model.fect <- fect(Y = "general_sharetotal_A_all", D = "cand_A_all",
                   X = c("cand_H_all", "cand_B_all"), data = data,
                   method = "fe", index = index, se = TRUE,
                   parallel = TRUE, seed = 1234, force = "two-way")
print(model.fect$est.avg)
```
We visualize the estimated dynamic treatment effects for the counterfactual estimator in the same way.
```r
fect.output <- as.data.frame(model.fect$est.att)
fect.output$Time <- c(-15:10)
p.fect <- esplot(fect.output, Period = 'Time', Estimate = 'ATT',
                 SE = 'S.E.', CI.lower = "CI.lower", CI.upper = 'CI.upper',
                 xlim = c(-15, 1))
p.fect
```
The plot function shipped with fect can produce the event study plot directly, with a bar plot at the bottom indicating the number of treated units in each period.
```r
plot(model.fect)
```
We can visualize the period-wise ATT relative to the exit of treatment by setting `type = "exit"`.
```r
plot(model.fect, type = 'exit')
```
As discussed in @sec-fect, fect provides various diagnostic tests, including the $F$ test, the placebo test, equivalence tests, and a test for no carryover effects. The figure below presents the results of a placebo test, where we set `placebo.period = c(-2, 0)`, specifying that the placebo periods include the three periods preceding treatment onset.
```r
out.fect.p <- fect(Y = y, X = controls, D = d, data = data, index = index,
                   method = 'fe', se = TRUE, placeboTest = TRUE,
                   placebo.period = c(-2, 0))
plot(out.fect.p, proportion = 0.1, stats = "placebo.p")
```
We then test for carryover effects by setting `carryoverTest = TRUE` and `carryover.period = c(1, 2)`, which removes observations from periods after treatment has ended from the data used to fit the model. The test then assesses whether the estimated ATT in this range is significantly different from zero.
```r
out.fect.c <- fect(Y = y, X = controls, D = d, data = data, index = index,
                   method = 'fe', se = TRUE, carryoverTest = TRUE,
                   carryover.period = c(1, 2))
# plot
plot(out.fect.c, stats = "carryover.p", ylim = c(-0.15, 0.20))
```
To compare with the estimates from PanelMatch, we can construct a balanced sample by setting `balance.period = c(-3, 1)`, ensuring that both methods use the same set of treated observations. The ATT estimate is similar to, but slightly larger than, that obtained using the full sample.
```r
out.fect.balance <- fect(Y = y, X = controls, D = d, data = data,
                         index = index, method = 'fe', se = TRUE,
                         balance.period = c(-3, 1))
# att
print(out.fect.balance$est.balance.avg)

# event study plot
fect.balance.output <- as.data.frame(out.fect.balance$est.balance.att)
fect.balance.output$Time <- c(-3:1)
p.fect.balance <- esplot(fect.balance.output, Period = 'Time',
                         Estimate = 'ATT', SE = 'S.E.',
                         CI.lower = "CI.lower", CI.upper = 'CI.upper')
p.fect.balance
```
@rambachan2023more propose a partial identification approach that relaxes the PT assumption in the post-treatment period by allowing violations that do not exceed the magnitude of those observed in the pre-treatment period. This framework enables sensitivity analysis of estimates from fect or similar methods by comparing pre-treatment deviations from PT to potential post-treatment deviations.
The key intuition is that if an event study demonstrates strong post-treatment effects yet only minor PT deviations before treatment, any post-treatment departure large enough to reverse these findings must be substantially larger than those observed in the pre-treatment period. Consequently, this approach quantifies how sensitive the estimated dynamic treatment effects are to possible PT violations, using pretrend estimates as the benchmark.
Below, we revisit the @HH2019 data and illustrate how to apply this sensitivity analysis with fect. We focus on two restrictions from @rambachan2023more: the relative magnitude (RM) restriction and the smoothness restriction, both of which connect pre-treatment PT violations to potential post-treatment counterfactual deviations.
To implement this method with the imputation estimator, we use the dynamic treatment effects from pre-treatment placebo tests to gauge PT violations and determine whether post-treatment effects remain significant under similar violations. This requires symmetric estimation of dynamic treatment effects in both pre- and post-treatment periods. As @roth2024interpreting notes, some estimators (e.g., CSDID without `base_period = "universal"`) may not produce symmetric estimates.
Below, we designate placebo periods using fect. These placebo periods are excluded during model fitting, and counterfactuals are imputed for both placebo and post-treatment intervals to compute dynamic treatment effects, ensuring consistent estimation across all periods.
By setting `placeboTest = TRUE` and `placebo.period = c(-2, 0)`, we define three pre-treatment periods as placebo periods. Their dynamic treatment effects serve as the benchmark for PT violations in the post-treatment phase. In the code chunk below, we fit fect with these placebo settings, then extract both placebo and post-treatment dynamic treatment effects and the associated variance-covariance matrix.
```r
out.fect.placebo <- fect(nat_rate_ord ~ indirect, data = hh2019,
                         index = c("bfs", "year"), method = 'fe', se = TRUE,
                         placeboTest = TRUE, placebo.period = c(-2, 0))
T.post <- 10 # 3 placebo periods, 10 post-treatment periods
index.use <- which(rownames(out.fect.placebo$est.att) %in% as.character(c(-2:T.post)))
beta.hat.p <- out.fect.placebo$est.att[index.use, 1]
vcov.hat.p <- out.fect.placebo$att.vcov[index.use, index.use]
count <- out.fect.placebo$count[which(rownames(out.fect.placebo$est.att) %in% as.character(c(1:T.post)))]
```
We first explore the Relative Magnitude (RM) restriction in the HonestDiD package. Let $\delta$ represent potential PT violations for placebo and post-treatment periods. Unlike a standard event study that assumes $\delta_t=0$ for $t>0$, RM allows PT deviations as long as they do not exceed $\bar{M}$ times the maximum deviation between consecutive placebo periods.
Because HonestDiD is designed for event study estimators with a single reference period $\delta_0=0$ at $t=0$, this structure is not directly compatible with fect, which does not enforce a single reference period. We thus modify HonestDiD slightly to allow $\delta_0\neq 0$. The modified RM restriction becomes
$$\Delta^{RM}(\bar{M}) = \left\{\delta : \forall t \geq 0,\; \bigl|\delta_{t+1} - \delta_t\bigr| \leq \bar{M}\cdot\max\bigl( |\delta_{-1} - \delta_{-2}|,\,|\delta_0 - \delta_{-1}| \bigr)\right\}.$$
Here, $\max\bigl(|\delta_{-1}-\delta_{-2}|,\,|\delta_0-\delta_{-1}|\bigr)$ matches the largest consecutive discrepancy among the placebo periods. When $\bar{M}=0$, the PT violation observed at $t=0$ persists without change into the post-treatment window. Allowing $\bar{M}>0$ permits $\delta_t$ to vary across post-treatment periods, but the incremental changes must remain within $\bar{M}$ times the largest consecutive deviation among placebo periods.
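For concreteness, consider a hypothetical numerical example (the numbers are invented purely for illustration):

$$\bar{M} = 0.4,\quad \max\bigl(|\delta_{-1}-\delta_{-2}|,\,|\delta_{0}-\delta_{-1}|\bigr) = 0.5 \;\;\Longrightarrow\;\; |\delta_{t+1}-\delta_{t}| \leq 0.4 \times 0.5 = 0.2 \quad \text{for all } t\geq 0,$$

so the violation may drift across post-treatment periods, but only in increments of at most 0.2 outcome units per period.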
We implement these adaptations in a forked version of HonestDiD, called HonestDiDFEct:
```r
devtools::install_github("lzy318/HonestDiDFEct")
```
The code below illustrates how `createSensitivityResults_relativeMagnitudes` from HonestDiDFEct produces robust confidence intervals for the ATT under the RM restriction.
Robust Confidence Set for the ATT
We begin by constructing a robust confidence set for the ATT, accounting for the extra uncertainty introduced by potential PT violations. The vector `betahat` comprises the dynamic treatment effect estimates for both placebo and post-treatment periods, and `sigma` is their corresponding variance-covariance matrix. We specify `numPrePeriods` and `numPostPeriods` for the number of pre- and post-treatment periods, and `l_vec` gives the weights applied to each post-treatment period for the ATT. In particular, we use `count/sum(count)`.
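Concretely, this choice of `l_vec` targets the ATT as a weighted average of the post-treatment dynamic effects, with weights proportional to the number of treated units in each period:

$$\theta \;=\; \sum_{t=1}^{10} \frac{N_t}{\sum_{s=1}^{10} N_s}\,\tau_t,$$

where $N_t$ is the entry of `count` for post-treatment period $t$ and $\tau_t$ is the corresponding dynamic treatment effect.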
We vary $\bar{M}$ from 0 to 1 using `Mbarvec = seq(0, 1, by = 0.1)`. Increasing $\bar{M}$ allows proportionally larger PT violations in the post-treatment window. When $\bar{M}=0$, the resulting confidence set behaves as a "de-biased" interval that corrects post-treatment estimates based on the observed PT violation at $t=0$.
```r
honest.result.p <- HonestDiDFEct::createSensitivityResults_relativeMagnitudes(
  betahat = beta.hat.p,
  sigma = vcov.hat.p,
  numPrePeriods = 3,
  numPostPeriods = T.post,
  l_vec = count/sum(count),
  Mbarvec = seq(0, 1, by = 0.1))

originalResults.p <- HonestDiDFEct::constructOriginalCS(
  betahat = beta.hat.p,
  sigma = vcov.hat.p,
  numPrePeriods = 3,
  numPostPeriods = T.post,
  l_vec = count/sum(count))

HonestDiDFEct::createSensitivityPlot_relativeMagnitudes(honest.result.p,
                                                        originalResults.p)
```
If the robust confidence set excludes zero at $\bar{M}=0.4$ but includes zero at $\bar{M}=0.5$, we infer that post-treatment PT violations would need to be roughly half the size of the maximum observed placebo violation to overturn the estimated effect.
Period-by-Period Robust Confidence Set
By altering `l_vec`, one can generate period-specific robust confidence intervals. In the next code snippet, we set $\bar{M}=0$ and $\bar{M}=0.5$ to compute separate confidence intervals for each post-treatment period. In the final figure, the intervals for $\bar{M}=0$ (green) treat the observed violation at $t=0$ as persisting into all post-treatment periods, whereas the intervals for $\bar{M}=0.5$ (pink) allow additional PT violations of up to half the largest placebo discrepancy.
```r
dte_base <- rep(0, T.post)
dte_output <- cbind.data.frame(lb = c(), ub = c(), method = c(),
                               Delta = c(), Mbar = c(), post = c())
for (t.post in c(1:T.post)) {
  dte_l <- dte_base
  dte_l[t.post] <- 1
  honest.dte <- HonestDiDFEct::createSensitivityResults_relativeMagnitudes(
    betahat = beta.hat.p,
    sigma = vcov.hat.p,
    numPrePeriods = 3,
    numPostPeriods = T.post,
    l_vec = dte_l,
    Mbarvec = c(0, 0.5))
  honest.dte <- as.data.frame(honest.dte)
  honest.dte$post <- t.post
  dte_output <- rbind(dte_output, honest.dte)
}
```
```r
# event study plot
fect.output.p <- as.data.frame(out.fect.placebo$est.att)
fect.output.p$Time <- as.numeric(rownames(fect.output.p))
p.placebo.honest <- esplot(fect.output.p, Period = 'Time', Estimate = 'ATT',
                           SE = 'S.E.', CI.lower = "CI.lower",
                           CI.upper = "CI.upper",
                           main = "ATTs with Robust Confidence Sets (RM)",
                           ylab = "Coefficients and 95% CI",
                           xlim = c(-12, 10), ylim = c(-6, 8),
                           Count = 'count', show.count = TRUE)
dte_output <- as.data.frame(dte_output)

# robust confidence sets allowing Mbar = 0.5
p.placebo.honest <- p.placebo.honest +
  geom_linerange(aes(x = post + 0.2, ymin = lb, ymax = ub),
                 data = dte_output[which(dte_output$Mbar == 0.5), ],
                 color = 'maroon1', linewidth = 1)

# robust confidence sets assuming the violation at t = 0 persists (Mbar = 0)
p.placebo.honest <- p.placebo.honest +
  geom_linerange(aes(x = post + 0.2, ymin = lb, ymax = ub),
                 data = dte_output[which(dte_output$Mbar == 0), ],
                 color = '#228B22', linewidth = 1)

# placebo estimates
p.placebo.honest <- p.placebo.honest +
  geom_linerange(aes(x = Time, ymin = CI.lower, ymax = CI.upper),
                 data = fect.output.p[which(fect.output.p$Time %in% c(-2:0)), ],
                 color = 'blue', linewidth = 1)
p.placebo.honest
```
A second approach to bounding PT violations is the smoothness restriction, which prevents the post-treatment violation from diverging too sharply from a linear extrapolation of the pre-trend. This restriction is particularly relevant if we suspect a gradually varying or near-linear trend in the potential violation.
Formally, one assumes $\delta\in\Delta^{SD}(M)$ where
$$\Delta^{SD}(M) := \bigl\{\delta : \bigl| (\delta_{t+1}-\delta_t) - (\delta_t -\delta_{t-1}) \bigr| \leq M,\ \forall t\bigr\}.$$
The parameter $M\geq 0$ controls how quickly the slope of $\delta$ can vary. Note that $M=0$ restricts $\delta$ to be strictly linear, which does not imply $\delta=0$.
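To see why, note that $M=0$ forces the increments of $\delta$ to be constant:

$$\delta_{t+1}-\delta_{t} = \delta_{t}-\delta_{t-1}\ \ \forall t \quad\Longrightarrow\quad \delta_{t} = \delta_{0} + t\,(\delta_{0}-\delta_{-1}) \ \text{ for } t \geq 0,$$

so the violation continues along the linear trend implied by the last two pre-treatment deviations, which need not be zero.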
Below, we demonstrate how to use `createSensitivityResults` from HonestDiDFEct, setting `Mvec` to explore different smoothness levels:
```r
honest.result.p <- HonestDiDFEct::createSensitivityResults(
  betahat = beta.hat.p,
  sigma = vcov.hat.p,
  numPrePeriods = 3,
  numPostPeriods = T.post,
  method = 'C-LF',
  l_vec = count/sum(count),
  Mvec = seq(0, 0.25, 0.05))

originalResults.p <- HonestDiDFEct::constructOriginalCS(
  betahat = beta.hat.p,
  sigma = vcov.hat.p,
  numPrePeriods = 3,
  numPostPeriods = T.post,
  l_vec = count/sum(count))

HonestDiD::createSensitivityPlot(honest.result.p, originalResults.p)
```
Here, the post-treatment effect is indistinguishable from zero even when $M=0$, suggesting that the estimated average treatment effect may not be robust to a strictly linear violation of PT observed in the placebo periods, let alone to variations in the slope of this linear trend.
Period-by-Period Robust Confidence Set
Finally, we replicate the previous approach to construct period-specific robust confidence sets under smoothness. In the code below, the black intervals represent original estimates under no PT violation, the orange intervals assume a strictly linear PT violation $M=0$, and the light blue intervals allow a slope deviation of 0.1:
```r
dte_base <- rep(0, T.post)
dte_output <- cbind.data.frame(lb = c(), ub = c(), method = c(),
                               Delta = c(), Mbar = c(), post = c())
for (t.post in c(1:T.post)) {
  dte_l <- dte_base
  dte_l[t.post] <- 1
  honest.dte <- HonestDiDFEct::createSensitivityResults(
    betahat = beta.hat.p,
    sigma = vcov.hat.p,
    method = 'C-LF',
    numPrePeriods = 3,
    numPostPeriods = T.post,
    l_vec = dte_l,
    Mvec = c(0, 0.1))
  honest.dte <- as.data.frame(honest.dte)
  honest.dte$post <- t.post
  dte_output <- rbind(dte_output, honest.dte)
}
```
```r
# prepare for plotting
fect.output.p <- as.data.frame(out.fect.placebo$est.att)
fect.output.p$Time <- as.numeric(rownames(fect.output.p))
p.placebo.honest <- esplot(fect.output.p, Period = 'Time', Estimate = 'ATT',
                           SE = 'S.E.', CI.lower = "CI.lower",
                           CI.upper = "CI.upper",
                           main = "ATTs with Robust Confidence Sets (Smoothness)",
                           ylab = "Coefficients and 95% CI",
                           xlim = c(-12, 10), ylim = c(-12, 15),
                           Count = 'count', show.count = TRUE)
dte_output <- as.data.frame(dte_output)

# estimates allowing a linear PT violation plus slope deviations up to M = 0.1
p.placebo.honest <- p.placebo.honest +
  geom_linerange(aes(x = post + 0.2, ymin = lb, ymax = ub),
                 data = dte_output[which(dte_output$M == 0.1), ],
                 color = 'skyblue', linewidth = 1)

# estimates assuming a strictly linear PT violation (M = 0)
p.placebo.honest <- p.placebo.honest +
  geom_linerange(aes(x = post + 0.2, ymin = lb, ymax = ub),
                 data = dte_output[which(dte_output$M == 0), ],
                 color = 'orange', linewidth = 1)

# original estimates assuming no PT violation
p.placebo.honest <- p.placebo.honest +
  geom_linerange(aes(x = Time, ymin = CI.lower, ymax = CI.upper),
                 data = fect.output.p[which(fect.output.p$Time > 0), ],
                 color = 'black', linewidth = 1)

# placebo estimates
p.placebo.honest <- p.placebo.honest +
  geom_linerange(aes(x = Time, ymin = CI.lower, ymax = CI.upper),
                 data = fect.output.p[which(fect.output.p$Time %in% c(-2:0)), ],
                 color = 'blue', linewidth = 1) +
  xlim(-12.5, 10.5)
p.placebo.honest
```
Please cite the authors of the original papers for their innovations. If you find this tutorial helpful, you can cite @CLLX2025.