| sdid | R Documentation |
Fits a linear staggered difference-in-differences model, following the Abraham and Sun (2018) approach. It supports optional weighting and a user-specified variance-covariance function.
sdid(
formula,
df,
weights = NULL,
cohort_var = NULL,
cohort_ref = NULL,
cohort_time_refs = NULL,
time_var = NULL,
time_ref = NULL,
intervention_var,
.vcov = stats::vcov,
...
)
formula |
An object of class "formula" (or one that can be coerced to that class): a symbolic description of the model to be fitted. The details of model specification are given under 'Details'. |
df |
A data frame containing the variables in the model. |
weights |
An optional vector of weights to be passed to |
cohort_var |
Name of the variable in |
cohort_ref |
Value of |
cohort_time_refs |
A list, whose elements are named to match levels of
|
time_var |
Name of the variable in |
time_ref |
Value of |
intervention_var |
Name of the cohort-level variable in |
.vcov |
Function to be used to estimate the variance-covariance matrix. Defaults to stats::vcov. |
... |
Additional arguments to be passed to |
Fitting a staggered difference-in-differences model requires deliberate attention to two specific independent variables:
The intervention cohort column assigns a cohort name to all individuals or groups that received the intervention during the same time period. For example, if the longitudinal data are at the year level, ranging from 2010 to 2020, and contain 15 counties, 3 of which implemented the intervention of interest in 2015, those 3 counties would be assigned to the same cohort. Similarly, if 2 more counties implemented the intervention in 2016, those 2 counties would be assigned to the next cohort.
The time period column assigns each observation to a time period at the most granular level of the longitudinal data. In the example described above, these values would correspond to the years 2010, ..., 2020.
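As a minimal sketch of the layout described above, the following base R code builds a toy panel with a cohort column and a time period column. All object and column names (panel, county, yr, cohort, intervention_yr) are hypothetical, and the "never" label for counties that never implemented the intervention is an assumption for illustration only; this page does not state how sdid() expects such units to be coded.

```r
# Toy panel: 15 counties observed yearly from 2010 to 2020 (names are hypothetical)
counties <- sprintf("county_%02d", 1:15)
panel <- expand.grid(county = counties, yr = 2010:2020,
                     stringsAsFactors = FALSE)

# Year in which each county implemented the intervention; NA = never implemented
impl_yr <- c(rep(2015, 3), rep(2016, 2), rep(NA, 10))
names(impl_yr) <- counties
panel$intervention_yr <- unname(impl_yr[panel$county])

# Counties sharing an intervention year share a cohort label.
# The "never" label is an assumption made for this sketch.
panel$cohort <- ifelse(is.na(panel$intervention_yr), "never",
                       paste0("cohort_", panel$intervention_yr))

# One row per county-year; cohort is constant within county
head(panel[order(panel$county, panel$yr), ])
```

Here the cohort column identifies groups by shared intervention period, while yr remains the most granular time unit in the data.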
To specify a model, a formula is passed following the format response ~ cohort_var + time_var + covariates. This, however, is not the formula used to fit the model; sdid() expands this formula to include main effects and every possible interaction between cohort_var and time_var, excluding referents for identification:
Referents for main effects are either the first levels of cohort_var and time_var or the referents specified in cohort_ref and time_ref.
Referents for cohort-time interactions are either the factor level of time_var that immediately precedes the value of intervention_var within each cohort or the referents specified in cohort_time_refs.
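The default referent rule for cohort-time interactions can be illustrated with a small base R sketch. It does not reproduce the internals of sdid(); it only enumerates cohort-time cells and drops, for each cohort, the time level immediately preceding the intervention. The cohort names, year range, and term labels are hypothetical.

```r
# Hypothetical cohorts and the period in which each implemented the intervention
intervention_yr <- c(cohort_2015 = 2015, cohort_2016 = 2016)
yr_levels <- 2013:2017  # ordered levels of the time variable

# Default referent per cohort: the level immediately preceding the intervention
ref_yr <- yr_levels[match(intervention_yr, yr_levels) - 1]
names(ref_yr) <- names(intervention_yr)

# Enumerate every cohort-time cell, then drop each cohort's referent cell
cells <- expand.grid(cohort = names(intervention_yr), yr = yr_levels,
                     stringsAsFactors = FALSE)
cells <- cells[cells$yr != ref_yr[cells$cohort], ]
cells <- cells[order(cells$cohort, cells$yr), ]

# Labels for the interaction terms that remain (naming scheme is illustrative)
interaction_terms <- paste0(cells$cohort, ":yr", cells$yr)
print(interaction_terms)
```

In this sketch, 2014 is the referent for the 2015 cohort and 2015 is the referent for the 2016 cohort, so each cohort's remaining interaction terms are interpreted relative to its own last pre-intervention period.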
sdid() also accommodates aggregated data through the weights argument.
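To see why weights make aggregated data usable, consider this sketch using stats::lm() directly (the fitting function this page names for sdid()). It assumes the weights are observation counts and that the predictor is constant within each aggregated group; the data and names are made up for illustration.

```r
set.seed(1)

# Individual-level toy data: x takes a handful of values, y is linear in x plus noise
ind <- data.frame(x = sample(1:5, 200, replace = TRUE))
ind$y <- 2 + 0.5 * ind$x + rnorm(200)
ind$n <- 1  # each row is one observation

# Aggregate to one row per value of x: mean outcome and observation count
agg <- merge(aggregate(y ~ x, data = ind, FUN = mean),
             aggregate(n ~ x, data = ind, FUN = sum), by = "x")

# Unweighted fit on individual rows vs. count-weighted fit on group means
fit_ind <- lm(y ~ x, data = ind)
fit_agg <- lm(y ~ x, data = agg, weights = n)

# Point estimates agree; standard errors generally do not
all.equal(coef(fit_ind), coef(fit_agg))
```

The coefficients match because weighting group means by their counts reproduces the individual-level normal equations. The residual variance, and therefore the default standard errors, differ between the two fits, which is one reason a user-supplied variance-covariance function may be worth considering with aggregated data.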
Returns an object of class sdid, which is a list containing the following components:
mdl
: The lm object returned from the call to stats::lm() in sdid()
formula
: A list object containing both the original formula specified in the call to sdid() and the generated formula, with all cohort-time interactions, passed to stats::lm() to fit the model
vcov
: The variance-covariance matrix used to estimate standard errors
tsi
: The time-since-intervention dataset used to enumerate time periods relative to the intervention period for each cohort
obs_cnt
: Counts of observations within each cohort-time interaction
cohort
: A list object containing details about cohorts. var contains the name of the column in df that identifies cohorts; ref contains the value of the cohort column that functions as the referent for main effects; and time_refs contains the referent time values within each cohort for each set of cohort-time interactions.
time
: A list object containing var, which is the name of the column in df identified by the sdid() argument time_var, and ref, the referent value of time_var for main effects.
intervention_var
: Name of the column in df that contains the time period during which each cohort implemented the intervention of interest
covariates
: A character vector containing the terms in formula other than those corresponding to cohorts and time periods
Abraham S, Sun L. Estimating Dynamic Treatment Effects in Event Studies with Heterogeneous Treatment Effects. MIT; 2018.
# Fit a staggered difference-in-differences model
sdid_hosp <- sdid(hospitalized ~ cohort + yr + age + sex + comorb,
                  df = hosp,
                  intervention_var = "intervention_yr")
summary(sdid_hosp)