# DSIV: Double-Selection Plus Instrumental Variable Estimator

In naivereg: Nonparametric Additive Instrumental Variable Estimator and Related IV Methods

## Description

A three-step approach for estimating an endogenous treatment effect using high-dimensional instruments and double selection. It applies when two conditions hold: first, the treatment variable is known to be endogenous; second, the treatment effect model has a large number of control variables, as is common with large micro-level survey data.

## Usage

```r
DSIV(
  y,
  x,
  z,
  D,
  family = c("gaussian", "binomial", "poisson", "multinomial", "cox", "mgaussian"),
  criterion = c("BIC", "EBIC"),
  alpha = 1,
  nlambda = 100,
  ...
)
```

## Arguments

| Argument | Description |
| --- | --- |
| `y` | Response variable, an N x 1 vector. |
| `x` | Control variables, an N x p1 matrix. |
| `z` | Instrumental variables, an N x p2 matrix. |
| `D` | Endogenous treatment variable. |
| `family` | Model family, which determines the expected form of the response `y`. For `family="gaussian"` or `family="poisson"`, `y` should be quantitative (non-negative counts for `"poisson"`). For `family="binomial"`, `y` should be either a factor with two levels or a two-column matrix of counts or proportions (the second column is treated as the target class; for a factor, the last level in alphabetical order is the target class). For `family="multinomial"`, `y` can be a factor with nc >= 2 levels or a matrix with nc columns of counts or proportions. For either `"binomial"` or `"multinomial"`, if `y` is given as a vector, it is coerced into a factor. For `family="cox"`, `y` should be a two-column matrix with columns named 'time' and 'status'; the latter is a binary variable, with '1' indicating death and '0' indicating right censoring. The function `Surv()` in package survival produces such a matrix. For `family="mgaussian"`, `y` is a matrix of quantitative responses. |
| `criterion` | The criterion used to select the regularization parameter. One of `"BIC"` or `"EBIC"`; the default is `"BIC"`. |
| `alpha` | The elastic net mixing parameter, with `0 <= alpha <= 1`. `alpha=1` is the lasso penalty and `alpha=0` is the ridge penalty. |
| `nlambda` | The number of lambda values; the default is 100. |
| `...` | Other arguments; see `help(glmnet)`. |
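To make the `criterion` argument more concrete, the sketch below writes out one common Gaussian-case formulation of BIC and EBIC in base R. The function names `bic_gaussian` and `ebic_gaussian` and the tuning constant `gamma` are illustrative assumptions, not part of naivereg, and the package's internal formula may differ.

```r
# Illustrative helpers (not part of naivereg): Gaussian-case criteria
# for a single candidate model.
#   rss   residual sum of squares of the fitted model
#   n     sample size
#   df    number of nonzero coefficients
#   p     number of candidate predictors
#   gamma EBIC tuning constant in [0, 1]; gamma = 0 reduces EBIC to BIC
bic_gaussian <- function(rss, n, df) {
  n * log(rss / n) + df * log(n)
}

ebic_gaussian <- function(rss, n, df, p, gamma = 0.5) {
  # Extra penalty grows with the number of size-df models among p predictors.
  bic_gaussian(rss, n, df) + 2 * gamma * lchoose(p, df)
}

bic_val  <- bic_gaussian(rss = 50, n = 100, df = 3)
ebic_val <- ebic_gaussian(rss = 50, n = 100, df = 3, p = 20)
```

Under this formulation, EBIC penalizes model size more heavily than BIC when the number of candidate predictors is large, which is why it may be preferred with many controls and instruments.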

## Details

The DS-IV algorithm consists of the following three steps:

1. Regress the outcome variable y on the control variables x using a regularization method, estimate the coefficients beta, and select the set of important control variables, denoted c1.
2. Regress the treatment variable D on the instrumental variables z and the control variables x, estimate the optimal instrument for D, and obtain a second set of important control variables, denoted cx.
3. Obtain the DS-IV estimator of the endogenous treatment effect from the estimated optimal instrument and the union (c3) of the selected control variables.
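The three steps can be sketched on simulated data as follows. This is a minimal illustration under stated assumptions, not the package's internal code: it uses a glmnet lasso path with a plain BIC rule for selection, takes the lasso fitted values of D as the estimated optimal instrument, and runs ordinary least squares in the last step. The helper `bic_lasso` and all simulated quantities are hypothetical.

```r
library(glmnet)

set.seed(1)
n <- 200; p1 <- 10; p2 <- 8
x <- matrix(rnorm(n * p1), n, p1)            # control variables
z <- matrix(rnorm(n * p2), n, p2)            # instrumental variables
u <- rnorm(n)                                # unobserved confounder
D <- z[, 1] + z[, 2] + 0.5 * x[, 1] + u + rnorm(n)   # endogenous treatment
y <- 1 * D + x[, 1] - x[, 2] + u + rnorm(n)          # simulated effect of D is 1

# Hypothetical helper: fit a lasso path, pick lambda by BIC, and return
# the selected column indices and the fitted values at that lambda.
bic_lasso <- function(X, resp) {
  fit  <- glmnet(X, resp, family = "gaussian", alpha = 1, nlambda = 100)
  pred <- predict(fit, newx = X)             # n x (number of lambdas)
  rss  <- colSums((resp - pred)^2)
  nobs <- length(resp)
  bic  <- nobs * log(rss / nobs) + fit$df * log(nobs)
  k    <- which.min(bic)
  b    <- as.matrix(coef(fit))[-1, k]        # drop the intercept
  list(sel = unname(which(b != 0)), fitted = as.numeric(pred[, k]))
}

# Step 1: outcome on controls; keep the selected controls (c1).
c1 <- bic_lasso(x, y)$sel

# Step 2: treatment on controls and instruments; keep the selected
# controls (cx) and the fitted treatment as the estimated instrument.
s2   <- bic_lasso(cbind(x, z), D)
cx   <- s2$sel[s2$sel <= p1]
Dhat <- s2$fitted

# Step 3: regress y on the fitted treatment and the union of controls (c3).
c3 <- sort(union(c1, cx))
xs <- x[, c3, drop = FALSE]
colnames(xs) <- paste0("x", c3)
fit3  <- lm(y ~ ., data = data.frame(y = y, Dhat = Dhat, xs))
betaD <- unname(coef(fit3)["Dhat"])
```

Taking the union c3 guards against dropping a control that matters for the treatment but is only weakly related to the outcome, or the reverse, which is the motivation for double selection.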

## Value

An object of type `DSIV`, which is a list with the following components:

| Component | Description |
| --- | --- |
| `yhat` | The estimated value of y. |
| `betaD` | The coefficient of the endogenous variable D. |
| `betaX` | The coefficients of the control variables x. |
| `c1` | Indices of the control variables x selected in the first step. |
| `cx` | Indices of the control variables selected in the second step. |
| `cz` | Indices of the instrumental variables selected in the second step. |
| `c2` | Indices of all variables selected in the second step. An index less than or equal to p1 refers to a control variable; an index greater than p1 and less than or equal to (p1 + p2) refers to an instrumental variable. |
| `c3` | Union of `c1` and `cx`, the selected control variables. |
| `family` | Same as the `family` argument. |
| `criterion` | Same as the `criterion` argument. |
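The index convention described for `c2` can be illustrated with a small base R sketch. The vector `c2` here is made up for illustration, and mapping an instrument index back to a column of `z` by subtracting p1 is an assumption about how the combined indices are laid out.

```r
p1 <- 50                          # number of control variables
p2 <- 20                          # number of instrumental variables
c2 <- c(3, 17, 50, 51, 64, 70)    # hypothetical step-2 selection

# Indices up to p1 point at columns of x.
ctrl_idx <- c2[c2 <= p1]

# Indices in (p1, p1 + p2] point at instruments in the combined matrix.
inst_idx <- c2[c2 > p1 & c2 <= p1 + p2]

# Assumed mapping back to column positions within z.
inst_col <- inst_idx - p1
```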

## Author(s)

Qingliang Fan, KongYu He, Wei Zhong

## References

Wei Zhong, Yang Gao, Wei Zhou and Qingliang Fan (2020), “Endogenous Treatment Effect Estimation Using High-Dimensional Instruments and Double Selection”, working paper

## Examples

```r
library(naivereg)
data("DSIVdata")
y <- DSIVdata[, 1]
x <- DSIVdata[, 2:51]
z <- DSIVdata[, 52:71]
D <- DSIVdata[, 72]
res <- DSIV(y, x, z, D, family = "gaussian", criterion = "EBIC")
res$c1  # Indices of the control variables x selected in the first step
res$cx  # Indices of the control variables selected in the second step
res$cz  # Indices of the instrumental variables selected in the second step
res$c3  # Union of c1 and cx, the selected control variables
```

naivereg documentation built on March 18, 2020, 5:09 p.m.