didcontDML | R Documentation |
This function estimates the average treatment effect on the treated of a continuously distributed treatment in repeated cross-sections based on a Difference-in-Differences (DiD) approach using double machine learning to control for time-varying confounders in a data-driven manner. It supports estimation under various machine learning methods and uses k-fold cross-fitting.
didcontDML(
y,
d,
t,
dtreat,
dcontrol,
t0 = 0,
t1 = 1,
controls,
MLmethod = "lasso",
psmethod = 1,
trim = 0.1,
lognorm = FALSE,
bw = NULL,
bwfactor = 0.7,
cluster = NULL,
k = 3
)
y |
Outcome variable. Should not contain missing values. |
d |
Treatment variable in the treatment period of interest. Should be continuous and not contain missing values. |
t |
Time variable indicating outcome periods. Should not contain missing values. |
dtreat |
Value of the treatment under treatment (in the treatment period of interest). This value would be 1 for binary treatments. |
dcontrol |
Value of the treatment under control (in the treatment period of interest). This value would be 0 for binary treatments. |
t0 |
Value indicating the pre-treatment outcome period. Default is 0. |
t1 |
Value indicating the post-treatment outcome period in which the effect is evaluated. Default is 1. |
controls |
Covariates and/or previous treatment history to be controlled for. Should not contain missing values. |
MLmethod |
Machine learning method for estimating nuisance parameters using the |
psmethod |
Method for computing generalized propensity scores. Set to 1 for estimating conditional treatment densities using the treatment as dependent variable, or 2 for using the treatment kernel weights as dependent variable. Default is 1. |
trim |
Trimming threshold (in percentage) for discarding observations with too much influence within any subgroup defined by the treatment group and time. Default is 0.1. |
lognorm |
Logical indicating if log-normal transformation should be applied when estimating conditional treatment densities using the treatment as dependent variable. Default is FALSE. |
bw |
Bandwidth for kernel density estimation. Default is NULL, implying that the bandwidth is calculated based on the rule-of-thumb. |
bwfactor |
Factor by which the bandwidth is multiplied. Default is 0.7 (undersmoothing). |
cluster |
Optional clustering variable for calculating standard errors. |
k |
Number of folds in k-fold cross-fitting. Default is 3. |
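To make the interplay of bw, bwfactor, dtreat and dcontrol more concrete, the sketch below computes kernel weights around the two treatment values. It is only an illustration under stated assumptions: it uses a Gaussian kernel and Silverman's rule of thumb via stats::bw.nrd0, which may differ from the kernel and rule-of-thumb that didcontDML applies internally, and the helper kernel_weights is hypothetical (not part of the package).

```r
# Hypothetical helper, not part of the package: Gaussian kernel weights
# for observations with treatment values d around a target value d0.
kernel_weights <- function(d, d0, bw = NULL, bwfactor = 0.7) {
  # Assumption: Silverman's rule of thumb stands in for the package default.
  if (is.null(bw)) bw <- stats::bw.nrd0(d)
  h <- bw * bwfactor  # a factor below 1 shrinks the bandwidth (undersmoothing)
  stats::dnorm((d - d0) / h) / h
}

set.seed(1)
d <- rnorm(500, mean = 0.5)             # simulated continuous treatment
w_treat <- kernel_weights(d, d0 = 1)    # weights around dtreat = 1
w_control <- kernel_weights(d, d0 = 0)  # weights around dcontrol = 0
```

Observations whose treatment lies close to dtreat (or dcontrol) receive the largest weights; a smaller bwfactor concentrates the weights more tightly around those values.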
The Average Treatment Effect on the Treated (ATET) is estimated by Difference-in-Differences in repeated cross-sections while controlling for confounders using double machine learning. Nuisance parameters can be estimated with different machine learning methods, and k-fold cross-fitting is used so that the nuisance models are fitted on different folds than those on which the effect is evaluated. The function handles both binary and continuous outcomes and provides options for trimming and for adjusting the bandwidth used in kernel density estimation.
A list with the following components:
ATET
: Estimate of the Average Treatment Effect on the Treated.
se
: Standard error of the ATET estimate.
trimmed
: Number of discarded (trimmed) observations.
pval
: P-value.
pscores
: Propensity scores of untrimmed observations (4 columns): under treatment in period t1, under treatment in period t0, under control in period t1, under control in period t0.
outcomes
: Conditional outcomes of untrimmed observations (3 columns): in treatment group in period t0, in control group in period t1, in control group in period t0.
treat
: Treatment status of untrimmed observations.
time
: Time period of untrimmed observations.
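The ATET and se components can be combined into a confidence interval in the usual way. The sketch below assumes a normal approximation and a two-sided test; whether pval is computed in exactly this manner is an assumption, and both the helper summarize_atet and the input values are hypothetical placeholders rather than output of a real fit.

```r
# Hypothetical helper: summarise a result list with elements ATET and se.
summarize_atet <- function(res, level = 0.95) {
  z <- stats::qnorm(1 - (1 - level) / 2)  # normal critical value
  list(
    ci = c(lower = res$ATET - z * res$se, upper = res$ATET + z * res$se),
    # Assumption: two-sided p-value based on a normal approximation.
    pval = 2 * stats::pnorm(-abs(res$ATET / res$se))
  )
}

# Placeholder values for illustration only, not estimates from real data.
toy <- list(ATET = 2, se = 0.5)
out <- summarize_atet(toy)
```

With an actual fit, the list returned by didcontDML can be passed in place of toy.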
Chernozhukov, V., Chetverikov, D., Demirer, M., Duflo, E., Hansen, C., Newey, W., Robins, J. (2018): "Double/debiased machine learning for treatment and structural parameters", The Econometrics Journal, 21, C1-C68.
Haddad, M., Huber, M., Medina-Reyes, J., Zhang, L. (2024): "Difference-in-Differences with Time-varying Continuous Treatments using Double/Debiased Machine Learning", working paper, University of Fribourg.
## Not run:
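# Example with simulated data
library(causalweight)
set.seed(1)                # make the simulation reproducible
n=2000
t=rep(c(0, 1), each=n/2)   # first half observed in period 0, second half in period 1
x=0.5*rnorm(n)             # observed covariate
u=runif(n,0,2)             # unobserved confounder with a time-constant effect on y
d=x+u+rnorm(n)             # continuous treatment
y=(2*d+x)*t+u+rnorm(n)     # outcome: d and x only matter in period 1
# true effect of moving d from 0 to 1 is 2
results=didcontDML(y=y, d=d, t=t, dtreat=1, dcontrol=0, controls=x, MLmethod="lasso")
cat("ATET: ", round(results$ATET, 3), ", Standard error: ", round(results$se, 3), "\n")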
## End(Not run)