doubleML: double ML
In TaddyLab/gamlr: Gamma Lasso Regression

Description

Double (i.e., debiased) machine learning for treatment effect estimation.

Usage

doubleML(x, d, y, nfold=2, foldid=NULL, family="gaussian", cl=NULL, ...)

Arguments

`x`	Covariates; see `gamlr`.
`d`	The matrix of treatment variables. Each column is used as a response by `gamlr` during the residualization procedure.
`y`	Response; see `gamlr`.
`nfold`	The number of cross validation folds.
`foldid`	An optional length-n vector of fold memberships for each observation. If specified, this dictates `nfold`.
`family`	Response model type for the treatment prediction; one of "gaussian", "poisson", or "binomial". This can be either a single family shared by all columns of `d` or a vector of families of length `ncol(d)`.
`cl`	An optional cluster object from the `parallel` library. If this is not `NULL`, the CV folds are executed in parallel. This copies the data `nfold` times, so make sure you have enough memory.
`...`	Additional arguments passed to all of the `gamlr` regressions.
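
The following is a minimal sketch of how `foldid` and a per-column `family` vector might be supplied together. The simulated data, the column names `dose` and `treated`, and the object names are all hypothetical; the call itself is guarded so that the block only runs `doubleML` when an installed `gamlr` exports it.

```r
# Hypothetical simulated data: one continuous and one binary treatment.
set.seed(3)
n <- 300
x <- matrix(rnorm(n * 4), n, 4)
d <- cbind(dose = x[, 1] + rnorm(n),                # continuous treatment
           treated = rbinom(n, 1, plogis(x[, 2])))  # binary (0/1) treatment
y <- d[, "dose"] - d[, "treated"] + x[, 3] + rnorm(n)

# A length-n vector of fold labels; three distinct labels imply three folds,
# so nfold does not need to be given.
foldid <- sample(rep(1:3, length.out = n))

# One family per column of d, in column order.
fam <- c("gaussian", "binomial")

# Only call doubleML if the installed gamlr provides it.
if (requireNamespace("gamlr", quietly = TRUE) &&
    "doubleML" %in% getNamespaceExports("gamlr")) {
  fit <- gamlr::doubleML(x = x, d = d, y = y, foldid = foldid, family = fam)
  print(coef(summary(fit)))
}
```

Building `foldid` by hand is mainly useful when fold membership must respect some grouping in the data or must be reproducible across runs.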

Details

Performs the double ML procedure of Chernozhukov et al. (2017) to produce an unbiased estimate of the average linear treatment effects of `d` on `y`. This procedure uses `gamlr` to regress `y` and each column of `d` onto `x`. In the cross-fitting routine described in Taddy (2019), these regressions are trained on a portion of the data and the out-of-sample residuals are calculated on the left-out fold. Model selection for these residualization steps is based on the AICc selection rule. The response residuals are then regressed onto the treatment residuals using `lm`, and the resulting estimates and standard errors are unbiased for the treatment effects under the assumptions of Chernozhukov et al.
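
The cross-fitting steps described above can be illustrated with a short base R sketch. This is not the package source: ordinary `lm` fits stand in for the `gamlr` regressions (so there is no AICc selection step), the data are simulated, and omitting the intercept in the final regression is a choice made for this sketch. All object names are hypothetical.

```r
# Illustrative cross-fitted residualization; lm() stands in for gamlr.
set.seed(1)
n <- 500
x <- matrix(rnorm(n * 5), n, 5)          # simulated covariates
d <- x[, 1] + rnorm(n)                   # treatment depends on x
y <- 2 * d + x[, 1] - x[, 2] + rnorm(n)  # simulated treatment effect is 2

nfold <- 2
foldid <- sample(rep(seq_len(nfold), length.out = n))  # random fold labels

dtil <- numeric(n)  # out-of-sample treatment residuals
ytil <- numeric(n)  # out-of-sample response residuals
xdf <- as.data.frame(x)

for (k in seq_len(nfold)) {
  train <- foldid != k
  test <- !train
  # Fit both nuisance regressions on the training folds only.
  dfit <- lm(d ~ ., data = cbind(d = d, xdf)[train, ])
  yfit <- lm(y ~ ., data = cbind(y = y, xdf)[train, ])
  # Residuals are computed on the left-out fold.
  dtil[test] <- d[test] - predict(dfit, newdata = xdf[test, ])
  ytil[test] <- y[test] - predict(yfit, newdata = xdf[test, ])
}

# Regress response residuals on treatment residuals.
fit <- lm(ytil ~ dtil - 1)
print(coef(summary(fit)))
```

Because every residual is computed from a model that never saw that observation, overfitting in the nuisance regressions does not leak into the final treatment-effect regression.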

Value

A fitted `lm` object estimating the treatment effect of `d` on `y`. The `lm` function has been called with `x=TRUE, y=TRUE`, so this object contains the residualized `d` as `x` and the residualized `y` as `y`.
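
As a small illustration of what `x=TRUE, y=TRUE` stores on an `lm` object, the sketch below fits a regression on made-up residual vectors. Here `dtil` and `ytil` are hypothetical stand-ins for the residualized treatment and response, and the no-intercept formula is an assumption of the sketch rather than a statement about the package.

```r
# Hypothetical stand-ins for residualized treatment and response.
set.seed(2)
dtil <- rnorm(100)
ytil <- 1.5 * dtil + rnorm(100)

# x = TRUE keeps the model matrix, y = TRUE keeps the response.
fit <- lm(ytil ~ dtil - 1, x = TRUE, y = TRUE)

head(fit$x)         # model matrix: the treatment residuals
head(fit$y)         # response: the response residuals
coef(summary(fit))  # estimate, standard error, t value, p-value
```

Keeping these components on the fitted object makes it possible to reuse the residuals afterwards, for example for plotting or alternative standard errors, without repeating the residualization.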

Author(s)

Matt Taddy mataddy@gmail.com

References

Victor Chernozhukov, Denis Chetverikov, Mert Demirer, Esther Duflo, Christian Hansen, Whitney Newey, and James Robins, 2017. Double/debiased machine learning for treatment and structural parameters, The Econometrics Journal

Matt Taddy, 2019. Business Data Science, McGraw-Hill

Examples


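library(gamlr)
data(hockey) # provides the player and goal objects used below
# treatment: the column of player for Sidney Crosby
who <- which(colnames(player)=="SIDNEY_CROSBY")
s <- sample.int(nrow(player),10000) # subsample for a fast example
# all other player columns act as the covariates x
doubleML(x=player[s,-who], d=player[s,who], y=goal$homegoal[s], standardize=FALSE)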

TaddyLab/gamlr documentation built on April 17, 2023, 7:23 p.m.