didDML | R Documentation |
This function estimates the average treatment effect on the treated (ATET) in the post-treatment period for a binary treatment using a doubly robust Difference-in-Differences (DiD) approach for repeated cross-sections that is combined with double machine learning. It controls for (possibly time-varying) confounders in a data-driven manner and supports various machine learning methods for estimating nuisance parameters through k-fold cross-fitting.
didDML(
y,
d,
t,
x,
MLmethod = "lasso",
est = "dr",
trim = 0.05,
cluster = NULL,
k = 3
)
y |
Outcome variable. Should not contain missing values. |
d |
Treatment group indicator (binary). Should not contain missing values. |
t |
Time period indicator (binary). Should be 1 for post-treatment period and 0 for pre-treatment period. Should not contain missing values. |
x |
Covariates to be controlled for. Should not contain missing values. |
MLmethod |
Machine learning method for estimating nuisance parameters using the |
est |
Estimation method. Must be one of |
trim |
Trimming threshold, expressed as a proportion (0.05 corresponds to 5%), for discarding observations with too small propensity scores within any subgroup defined by the treatment group and time. Default is 0.05. |
cluster |
Optional clustering variable for calculating cluster-robust standard errors. |
k |
Number of folds in k-fold cross-fitting. Default is 3. |
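To make the role of trim more concrete, the following is a minimal sketch of propensity-score trimming in base R. It is one plausible reading of the argument description, not the function's internal code: the propensity scores are simulated, and the names raw, pscores, drop and kept are hypothetical.

```r
# Illustrative sketch only: simulated propensity scores, not didDML internals.
set.seed(1)
n <- 1000
trim <- 0.05                           # same value as the didDML default

# Hypothetical probabilities of the four (d, t) subgroups, one row per observation
raw <- matrix(runif(4 * n, 0.01, 1), ncol = 4)
pscores <- raw / rowSums(raw)          # normalize so each row sums to 1
colnames(pscores) <- c("d1t1", "d1t0", "d0t1", "d0t0")

# Flag observations whose score falls below the threshold in any subgroup
drop <- apply(pscores < trim, 1, any)
kept <- pscores[!drop, , drop = FALSE]

cat("Trimmed observations:", sum(drop), "of", n, "\n")
```

A larger trim discards more observations and makes weights based on inverse propensity scores less extreme, at the cost of changing the population the estimate refers to.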
This function estimates the Average Treatment Effect on the Treated (ATET) in the post-treatment period based on Difference-in-Differences in repeated cross-sections when controlling for confounders in a data-adaptive manner using double machine learning. The function supports different machine learning methods to estimate nuisance parameters (conditional mean outcomes and propensity scores) as well as cross-fitting to mitigate overfitting. Besides double machine learning, the function also provides inverse probability weighting and regression adjustment methods (which are, however, not doubly robust).
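The cross-fitting idea mentioned above can be sketched in a few lines of base R. This is a simplified illustration, not the package's implementation: glm() stands in for the machine learning method, only one nuisance parameter (a propensity score) is estimated, and the data and variable names (fold, pscore) are made up for the example.

```r
# Illustrative sketch of k-fold cross-fitting; glm() is a stand-in learner,
# not the machine learning method used by didDML.
set.seed(1)
n <- 600
k <- 3
x <- runif(n)                              # simulated covariate
d <- rbinom(n, 1, plogis(x - 0.5))         # simulated binary treatment

# Randomly assign each observation to one of k equally sized folds
fold <- sample(rep(seq_len(k), length.out = n))

pscore <- numeric(n)
for (j in seq_len(k)) {
  train <- fold != j
  # Fit the nuisance model on the other k - 1 folds only
  fit <- glm(d ~ x, family = binomial,
             data = data.frame(d = d[train], x = x[train]))
  # Predict for the held-out fold, so no observation is scored by a model it helped fit
  pscore[!train] <- predict(fit, newdata = data.frame(x = x[!train]),
                            type = "response")
}

summary(pscore)
```

Because every prediction comes from a model that never saw the corresponding observation, overfitting in the nuisance step is less likely to contaminate the final effect estimate.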
A list with the following components:
ATET
: Estimate of the Average Treatment Effect on the Treated (ATET) in the post-treatment period.
se
: Standard error of the ATET estimate.
pval
: P-value of the ATET estimate.
trimmed
: Number of discarded (trimmed) observations.
pscores
: Propensity scores of untrimmed observations (4 columns): under treatment in period 1, under treatment in period 0, under control in period 1, under control in period 0.
outcomepred
: Conditional outcome predictions of untrimmed observations (3 columns): in treatment group in period 0, in control group in period 1, in control group in period 0.
treat
: Treatment status of untrimmed observations.
time
: Time period of untrimmed observations.
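As a usage sketch, the ATET and se components can be combined into a confidence interval. The numbers below are hypothetical placeholders rather than real output, and the normal approximation is an assumption of this example; the page does not state how pval is computed.

```r
# Hypothetical values standing in for the output of didDML(); not real estimates.
results <- list(ATET = 1.02, se = 0.15)

# 95% confidence interval under a normal approximation (an assumption here)
z <- qnorm(0.975)
ci <- c(lower = results$ATET - z * results$se,
        upper = results$ATET + z * results$se)

# Two-sided p-value under the same approximation
pval <- 2 * pnorm(-abs(results$ATET / results$se))

print(round(ci, 3))
```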
Chernozhukov, V., Chetverikov, D., Demirer, M., Duflo, E., Hansen, C., Newey, W., Robins, J. (2018): "Double/debiased machine learning for treatment and structural parameters", The Econometrics Journal, 21, C1-C68.
Zimmert, M. (2020): "Efficient difference-in-differences estimation with high-dimensional common trend confounding", arXiv preprint 1809.01643.
## Not run:
# Example with simulated data
set.seed(1)                          # make the simulation reproducible
n = 4000                             # sample size
t = 1 * (rnorm(n) > 0)               # time period (1 = post-treatment)
u = runif(n, 0, 1)                   # time-constant unobservable
x = 0.25 * t + runif(n, 0, 1)        # time-varying covariate
d = 1 * (x + u + 2 * rnorm(n) > 0)   # treatment
y = d * t + t + x + u + 2 * rnorm(n) # outcome
# true effect is equal to 1
results = didDML(y = y, d = d, t = t, x = x)
cat("ATET: ", round(results$ATET, 3), ", Standard error: ", round(results$se, 3), "\n")
## End(Not run)