DSIV: Double-Selection Plus Instrumental Variable Estimator

View source: R/DSIV.R

Description

A three-step approach for estimating an endogenous treatment effect using high-dimensional instruments and double selection. It applies when two conditions hold: the treatment variable is known to be endogenous, and the treatment effect model includes a large number of control variables, as is common with large micro-level survey data.

Usage

DSIV(
  y,
  x,
  z,
  D,
  family = c("gaussian", "binomial", "poisson", "multinomial", "cox", "mgaussian"),
  criterion = c("BIC", "EBIC"),
  alpha = 1,
  nlambda = 100,
  ...
)

Arguments

y

Response variable, an N x 1 vector.

x

Control variables, an N x p1 matrix.

z

Instrumental variables, an N x p2 matrix.

D

Endogenous treatment variable.

family

The response type, which determines the form y must take. For family="gaussian" or family="poisson", y is quantitative (non-negative counts for "poisson"). For family="binomial", y should be either a factor with two levels or a two-column matrix of counts or proportions (the second column is treated as the target class; for a factor, the last level in alphabetical order is the target class). For family="multinomial", y can be a factor with nc>=2 levels or a matrix with nc columns of counts or proportions. For either "binomial" or "multinomial", a y supplied as a vector is coerced to a factor. For family="cox", y should be a two-column matrix with columns named 'time' and 'status'; 'status' is binary, with '1' indicating death and '0' indicating right censoring. The function Surv() in package survival produces such a matrix. For family="mgaussian", y is a matrix of quantitative responses.

criterion

The criterion used to select the regularization parameter, either "BIC" or "EBIC". The default is "BIC".

alpha

The elastic net mixing parameter, with 0 <= alpha <= 1. alpha=1 is the lasso penalty and alpha=0 is the ridge penalty.

nlambda

The number of lambda values, default is 100.

...

Other arguments; see help(glmnet).
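
The alpha and criterion arguments both correspond to short formulas, sketched below in base R. The penalty follows the elastic net definition given in the glmnet documentation. The criteria are the usual Gaussian BIC and the extended BIC of Chen and Chen (2008); whether DSIV computes them in exactly this form is an assumption, and the helper names and the gamma value are hypothetical.

```r
# Elastic net penalty as defined in the glmnet documentation:
# lambda * ((1 - alpha) / 2 * ||beta||_2^2 + alpha * ||beta||_1)
enet_penalty <- function(beta, lambda, alpha = 1) {
  lambda * ((1 - alpha) / 2 * sum(beta^2) + alpha * sum(abs(beta)))
}

# Gaussian BIC and EBIC for a model with k nonzero coefficients chosen
# from p candidates. gamma = 0.5 is an illustrative value, not a
# documented naivereg default.
ic_gaussian <- function(rss, n, k, p, gamma = 0.5) {
  bic <- n * log(rss / n) + k * log(n)
  ebic <- bic + 2 * gamma * lchoose(p, k)
  c(BIC = bic, EBIC = ebic)
}

beta <- c(1, -2, 0)
lasso_pen <- enet_penalty(beta, lambda = 1, alpha = 1)    # L1 norm only
ridge_pen <- enet_penalty(beta, lambda = 1, alpha = 0)    # half the squared L2 norm
mixed_pen <- enet_penalty(beta, lambda = 1, alpha = 0.5)  # equal mix of both

ic <- ic_gaussian(rss = 120, n = 100, k = 3, p = 50)
```

Because lchoose(p, k) grows with the number of candidate variables, EBIC penalizes model size more heavily than BIC when p is large, which tends to give sparser selections.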

Details

The DS-IV algorithm consists of three steps. In the first step, regress the outcome variable y on the control variables x using a regularization method, estimate the coefficients beta, and select the set of important control variables, denoted by c1. In the second step, regress the treatment variable D on the instrumental variables z and the control variables x, estimate the optimal instrument for D, and obtain a second set of important control variables, denoted by cx. In the third step, obtain the DS-IV estimator of the endogenous treatment effect based on the estimated optimal instrument and the union (c3) of the selected control variables.
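
The three steps can be sketched on simulated data with glmnet. This is a rough illustration under stated assumptions, not the package's implementation: the helper select_bic, the Gaussian BIC formula, the post-lasso refit used to form the instrument, and the final lm() fit are all choices made for this example, and the simulated coefficients are arbitrary.

```r
library(glmnet)

# Hypothetical helper: lasso path plus a BIC choice of lambda.
# Returns the indices of the columns with nonzero coefficients.
select_bic <- function(X, y) {
  fit <- glmnet(X, y, family = "gaussian", alpha = 1, nlambda = 100)
  n <- length(y)
  rss <- colSums((y - predict(fit, newx = X))^2)  # one RSS per lambda
  bic <- n * log(rss / n) + fit$df * log(n)
  which(as.numeric(fit$beta[, which.min(bic)]) != 0)
}

# Simulated data: u is an unobserved confounder that makes D endogenous.
set.seed(1)
n <- 200; p1 <- 10; p2 <- 5
x <- matrix(rnorm(n * p1), n, p1)   # controls
z <- matrix(rnorm(n * p2), n, p2)   # instruments
u <- rnorm(n)
D <- z[, 1] + 0.5 * z[, 2] + x[, 1] + u + rnorm(n)
y <- 2 * D + x[, 1] - x[, 2] + u + rnorm(n)   # true treatment effect is 2

# Step 1: regress y on x and keep the selected controls.
c1 <- select_bic(x, y)

# Step 2: regress D on (x, z), split the selection into controls and
# instruments, and refit by least squares to estimate the instrument.
W <- cbind(x, z)
c2 <- select_bic(W, D)
cx <- c2[c2 <= p1]
cz <- c2[c2 > p1] - p1
Ws <- W[, c2, drop = FALSE]
Dhat <- fitted(lm(D ~ Ws))

# Step 3: regress y on the estimated instrument and the union of controls.
c3 <- sort(union(c1, cx))
xc <- x[, c3, drop = FALSE]
betaD <- coef(lm(y ~ Dhat + xc))[["Dhat"]]
```

Taking the union c3 guards against dropping a control that matters for the treatment equation but was missed in the outcome equation, which is the motivation for double selection.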

Value

An object of type DSIV which is a list with the following components:

yhat

The estimated value of y.

betaD

The estimated coefficient of the endogenous treatment variable D.

betaX

The estimated coefficients of the control variables x.

c1

Indices of the control variables x selected in the first step.

cx

Indices of the control variables selected in the second step.

cz

Indices of the instrumental variables selected in the second step.

c2

Indices of all variables selected in the second step. An index less than or equal to p1 refers to a control variable; an index greater than p1 and less than or equal to (p1 + p2) refers to an instrumental variable.

c3

Union of c1 and cx, that is, every control variable selected in either step.

family

Same as above.

criterion

Same as above.
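
The index convention described for c2 can be illustrated with made-up values. In this sketch p1, p2, and the index vector are hypothetical. Whether cz is reported relative to the columns of z or with the p1 offset is not stated on this page, so the subtraction below is an assumption.

```r
p1 <- 50                   # number of control variables (hypothetical)
p2 <- 20                   # number of instrumental variables (hypothetical)
c2 <- c(3, 17, 52, 60)     # made-up second-step selection

is_control <- c2 <= p1
control_idx <- c2[is_control]            # positions within x
instrument_idx <- c2[!is_control] - p1   # positions within z, offset removed
```

With these values, the first two entries of c2 point at columns of x and the last two at columns of z.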

Author(s)

Qingliang Fan, KongYu He, Wei Zhong

References

Wei Zhong, Yang Gao, Wei Zhou and Qingliang Fan (2020), “Endogenous Treatment Effect Estimation Using High-Dimensional Instruments and Double Selection”, working paper

Examples

library(naivereg)
data("DSIVdata")
y <- DSIVdata[, 1]       # response
x <- DSIVdata[, 2:51]    # control variables
z <- DSIVdata[, 52:71]   # instrumental variables
D <- DSIVdata[, 72]      # endogenous treatment variable
res <- DSIV(y, x, z, D, family = "gaussian", criterion = "EBIC")
res$c1  # indices of control variables selected in the first step
res$cx  # indices of control variables selected in the second step
res$cz  # indices of instrumental variables selected in the second step
res$c3  # union of c1 and cx

naivereg documentation built on March 18, 2020, 5:09 p.m.
