ivmodel: Fitting Instrumental Variables (IV) Models

Description

ivmodel fits an instrumental variables (IV) model with one endogenous variable and a continuous outcome. It carries out several IV regressions, diagnostics, and tests associated with this IV model. It is robust to most data formats, including factor and character data, and can handle very large IV models efficiently.

Usage

ivmodel(Y, D, Z, X, intercept = TRUE, 
        beta0 = 0, alpha = 0.05, k = c(0, 1), 
        heteroSE = FALSE, clusterID = NULL,
        deltarange = NULL, na.action = na.omit)
 

Arguments

Y

A numeric vector of outcomes.

D

A vector containing the values of the (single) endogenous variable.

Z

A matrix or data frame of instruments.

X

A matrix or data frame of (exogenous) covariates.

intercept

Should the intercept be included? Default is TRUE.

beta0

Null value β_0 for testing the null hypothesis H_0: β = β_0 in ivmodel. Default is 0.

alpha

The significance level for hypothesis testing. Default is 0.05.

k

A numeric vector of k values for k-class estimation. Default is 0 (OLS) and 1 (TSLS).

heteroSE

Should heteroscedastic-robust standard errors be used? Default is FALSE.

clusterID

If cluster-robust standard errors are desired, provide a vector whose length equals the sample size. For example, if n = 6 and clusterID = c(1,1,1,2,2,2), there are two clusters: the first is formed by the first three observations and the second by the last three observations. clusterID can be numeric, character, or factor.

deltarange

Range of δ for sensitivity analysis with the Anderson-Rubin (1949) test. Default is NULL.

na.action

How missing values (NA) are handled. The available options are na.fail, na.omit, na.exclude, and na.pass. Default is na.omit.
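The clusterID and na.action descriptions above can be illustrated with a small base R sketch. The data frame toy and the names complete, toy_complete, and cluster_complete are hypothetical and exist only for this illustration. The sketch shows how na.omit behaves on a data frame in base R; it makes no claim about how ivmodel aligns clusterID with the data internally.

```r
# Hypothetical toy data: n = 6 observations, one missing outcome.
toy <- data.frame(
  Y = c(1.2, 0.7, NA, 2.1, 1.9, 2.4),
  D = c(0, 1, 1, 0, 1, 1),
  Z = c(0, 1, 0, 0, 1, 1)
)

# Same layout as in the clusterID description: two clusters of three.
clusterID <- c(1, 1, 1, 2, 2, 2)

# clusterID needs one entry per observation.
stopifnot(length(clusterID) == nrow(toy))

# na.omit drops every row that contains an NA.
complete <- stats::complete.cases(toy)
toy_complete <- na.omit(toy)

# Subsetting the labels with the same mask keeps them aligned with the rows.
cluster_complete <- clusterID[complete]

nrow(toy_complete)
table(cluster_complete)
```

If rows are dropped by hand before calling ivmodel, subsetting the cluster labels with the same mask, as above, keeps the two in step.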

Details

Let Y, D, X, and Z represent the outcome, endogenous variable, p dimensional exogenous covariates, and L dimensional instruments, respectively; note that the intercept can be considered as a vector of ones and a part of the exogenous covariates X. ivmodel assumes the following IV model

Y = X α + D β + ε, E(ε | X, Z) = 0

and produces statistics for β. In particular, ivmodel computes the OLS, TSLS, k-class, limited information maximum likelihood (LIML), and Fuller-k (Fuller 1977) estimates of β using KClass, LIML, and Fuller. ivmodel also computes confidence intervals and hypothesis tests of the type H_0: β = β_0 versus H_1: β ≠ β_0 for these estimators, as well as two weak-IV confidence intervals: the Anderson-Rubin confidence interval (Anderson and Rubin 1949) and the conditional likelihood ratio confidence interval (Moreira 2003). Finally, ivmodel conducts a sensitivity analysis if Z is one-dimensional (i.e. there is only one instrument) using the method in Jiang et al. (2015).
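To make the k-class family concrete, here is a minimal base R sketch of the textbook k-class point estimate on simulated data. It is not the package's KClass implementation; the helper names resid_on and kclass and the simulated data are assumptions made for illustration. After partialling X out of Y, D, and Z, the estimate is D'(I - k M_Z)Y / D'(I - k M_Z)D, where M_Z is the residual-maker matrix of the instruments, so k = 0 gives OLS and k = 1 gives TSLS.

```r
set.seed(1)
n <- 500
X <- cbind(1, rnorm(n))             # intercept plus one exogenous covariate
Z <- matrix(rnorm(n * 2), n, 2)     # two instruments
U <- rnorm(n)                       # unobserved confounder
D <- drop(Z %*% c(1, 0.5)) + X[, 2] + U + rnorm(n)
Y <- 2 * D + X[, 2] + U + rnorm(n)  # true beta is 2 in this simulation

# Residuals from projecting A onto the columns of B (hypothetical helper).
resid_on <- function(A, B) qr.resid(qr(B), as.matrix(A))

# Textbook k-class point estimate of beta (hypothetical helper).
kclass <- function(Y, D, Z, X, k) {
  # Partial X out of the outcome, the endogenous variable, and the instruments.
  Yt <- resid_on(Y, X)
  Dt <- resid_on(D, X)
  Zt <- resid_on(Z, X)
  # M_Z D: the part of D orthogonal to the residualized instruments.
  MD <- resid_on(Dt, Zt)
  # beta_k = D'(I - k M_Z) Y / D'(I - k M_Z) D
  num <- sum(Dt * Yt) - k * sum(MD * Yt)
  den <- sum(Dt * Dt) - k * sum(MD * Dt)
  num / den
}

beta_ols  <- kclass(Y, D, Z, X, k = 0)  # k = 0 reproduces OLS
beta_tsls <- kclass(Y, D, Z, X, k = 1)  # k = 1 reproduces TSLS
c(OLS = beta_ols, TSLS = beta_tsls)
```

LIML and Fuller-k use data-dependent values of k close to 1, which is why their estimates sit near TSLS in the example output of this page.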

Some procedures (e.g. conditional likelihood ratio test, sensitivity analysis with Anderson-Rubin) assume an additional linear model

D = Z γ + X κ + ξ, E(ξ | X, Z) = 0
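The first-stage model and the Anderson-Rubin idea can be sketched in base R as follows. This is an illustration on simulated data under the two linear models above, not the package's AR.test code; the helper name ar_test and the data-generating values are assumptions. The sketch tests H_0: β = β_0 by regressing Y - β_0 D on X and Z and asking, via an F-test, whether the instruments have any remaining explanatory power.

```r
set.seed(2)
n <- 400
X <- rnorm(n)                          # one exogenous covariate
Z <- rnorm(n)                          # one instrument
U <- rnorm(n)                          # unobserved confounder
D <- 0.8 * Z + 0.5 * X + U + rnorm(n)  # first-stage model
Y <- 1 * D + 0.5 * X + U + rnorm(n)    # true beta is 1 in this simulation

# First-stage F statistic: do the instruments predict D given X?
first_stage <- anova(lm(D ~ X), lm(D ~ X + Z))
first_stage$F[2]

# Anderson-Rubin style test of H_0: beta = beta0 (hypothetical helper).
ar_test <- function(Y, D, Z, X, beta0 = 0) {
  Y0 <- Y - beta0 * D                  # outcome with the null effect removed
  restricted <- lm(Y0 ~ X)             # model without the instruments
  full <- lm(Y0 ~ X + Z)               # model with the instruments
  cmp <- anova(restricted, full)       # F-test for the coefficients on Z
  c(F = cmp$F[2], p.value = cmp$`Pr(>F)`[2])
}

ar_at_0 <- ar_test(Y, D, Z, X, beta0 = 0)  # null value far from the truth
ar_at_1 <- ar_test(Y, D, Z, X, beta0 = 1)  # null value equal to the truth
rbind(beta0_is_0 = ar_at_0, beta0_is_1 = ar_at_1)
```

Collecting the values of beta0 that the test does not reject at level alpha yields an Anderson-Rubin style confidence set.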

Value

ivmodel returns an object of class "ivmodel".

An object of class "ivmodel" is a list containing the following components:

alpha

Significance level for the hypothesis tests.

beta0

Null value of the hypothesis tests.

kClass

A list returned by the KClass function.

LIML

A list returned by the LIML function.

Fuller

A list returned by the Fuller function.

AR

A list from AR.test.

CLR

A list from CLR.

In addition, if there is only one instrument, ivmodel will generate an "ARsens" list within the "ivmodel" object.
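Because the returned object is a list, the components named above can be pulled out with the usual $ operator. The following is a minimal sketch on simulated data; the data-generating values and the object name fit are assumptions, and the requireNamespace guard only makes the snippet skip quietly when the package is not installed.

```r
if (requireNamespace("ivmodel", quietly = TRUE)) {
  set.seed(3)
  n <- 300
  X <- matrix(rnorm(n), n, 1, dimnames = list(NULL, "x1"))  # one covariate
  Z <- rnorm(n)                                             # one instrument
  U <- rnorm(n)                                             # confounder
  D <- 0.8 * Z + 0.5 * X[, 1] + U + rnorm(n)
  Y <- D + 0.5 * X[, 1] + U + rnorm(n)

  fit <- ivmodel::ivmodel(Y = Y, D = D, Z = Z, X = X)

  print(class(fit))    # class of the returned object
  print(names(fit))    # components available in this installed version
  print(fit$alpha)     # significance level used for the tests
  print(fit$beta0)     # null value used for the tests
  str(fit$AR, max.level = 1)  # structure of the Anderson-Rubin results
}
```

Calling names(fit) first is a cheap way to check which components the installed version actually provides before relying on any of them.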

Author(s)

Yang Jiang, Hyunseung Kang, and Dylan Small

References

Anderson, T. W. and Rubin, H. (1949). Estimation of the parameters of a single equation in a complete system of stochastic equations. Annals of Mathematical Statistics 20, 46-63.

Freeman, G., Cowling, B. J. and Schooling, C. M. (2013). Power and sample size calculations for Mendelian randomization studies using one genetic instrument. International Journal of Epidemiology 42(4), 1157-1163.

Fuller, W. (1977). Some properties of a modification of the limited information estimator. Econometrica, 45, 939-953.

Wang, X., Jiang, Y., Small, D. and Zhang, N. (2017). Sensitivity analysis and power for instrumental variable studies. Under review at Biometrics.

Moreira, M. J. (2003). A conditional likelihood ratio test for structural models. Econometrica 71, 1027-1048.

Sargan, J. D. (1958). The estimation of economic relationships using instrumental variables. Econometrica 26, 393-415.

See Also

See also KClass, LIML, Fuller, AR.test, and CLR for individual methods associated with ivmodel. For sensitivity analysis with the AR test, see ARsens.test. ivmodel has summary.ivmodel, confint.ivmodel, fitted.ivmodel, residuals.ivmodel and coef.ivmodel methods associated with it.

Examples

library(ivmodel)
data(card.data)

# One instrument
Y <- card.data[, "lwage"]    # outcome
D <- card.data[, "educ"]     # endogenous variable
Z <- card.data[, "nearc4"]   # single instrument
Xname <- c("exper", "expersq", "black", "south", "smsa", "reg661",
           "reg662", "reg663", "reg664", "reg665", "reg666", "reg667",
           "reg668", "smsa66")
X <- card.data[, Xname]      # exogenous covariates
card.model1IV <- ivmodel(Y = Y, D = D, Z = Z, X = X)
card.model1IV

# Multiple instruments
Z <- card.data[, c("nearc4", "nearc2")]
card.model2IV <- ivmodel(Y = Y, D = D, Z = Z, X = X)
card.model2IV

Example output

Call:
ivmodel(Y = Y, D = D, Z = Z, X = X)
sample size: 3010
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

First Stage Regression Result:

F=13.25579, df1=1, df2=2994, p-value is 0.00027634
R-squared=0.004407934,   Adjusted R-squared=0.004075405
Residual standard error: 1.940537 on 2995 degrees of freedom
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

Coefficients of k-Class Estimators:

              k Estimate Std. Error t value Pr(>|t|)    
OLS    0.000000 0.074693   0.003498  21.351   <2e-16 ***
Fuller 0.999666 0.127501   0.052708   2.419   0.0156 *  
LIML   1.000000 0.131504   0.054964   2.393   0.0168 *  
TSLS   1.000000 0.131504   0.054964   2.393   0.0168 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

Alternative tests for the treatment effect under H_0: beta=0.

Anderson-Rubin test:
F=5.415279, df1=1, df2=2994, p-value=0.020028
95 percent confidence interval:
 [ 0.02480483596507 , 0.284823593339102 ]

Conditional Likelihood Ratio test:
Test Stat=5.415279, p-value=0.020028
95 percent confidence interval:
 [0.0248043722947519, 0.284824550721994]
Warning messages:
1: In qT * sin(x)^2 :
  Recycling array of length 1 in array-vector arithmetic is deprecated.
  Use c() or as.vector() instead.

2: In qT * sin(x)^2/m :
  Recycling array of length 1 in vector-array arithmetic is deprecated.
  Use c() or as.vector() instead.

3: In (qT + m)/(1 + qT * sin(x)^2/m) :
  Recycling array of length 1 in array-vector arithmetic is deprecated.
  Use c() or as.vector() instead.


Call:
ivmodel(Y = Y, D = D, Z = Z, X = X)
sample size: 3010
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

First Stage Regression Result:

F=7.893096, df1=2, df2=2993, p-value is 0.00038114
R-squared=0.005246698,   Adjusted R-squared=0.004581978
Residual standard error: 1.940044 on 2995 degrees of freedom
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

Sargan Test Result:

Sargan Test Statistics=1.248153, df=1, p-value is 0.26391
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

Coefficients of k-Class Estimators:

              k Estimate Std. Error t value Pr(>|t|)    
OLS    0.000000 0.074693   0.003498  21.351  < 2e-16 ***
TSLS   1.000000 0.157059   0.052578   2.987  0.00284 ** 
Fuller 1.000075 0.158259   0.053079   2.982  0.00289 ** 
LIML   1.000409 0.164028   0.055495   2.956  0.00314 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

Alternative tests for the treatment effect under H_0: beta=0.

Anderson-Rubin test:
F=5.243935, df1=2, df2=2993, p-value=0.0053281
95 percent confidence interval:
 [ 0.05360026100892 , 0.361980791254619 ]

Conditional Likelihood Ratio test:
Test Stat=9.262454, p-value=0.003463
95 percent confidence interval:
 [0.0621199910210952, 0.336180869926706]

ivmodel documentation built on Nov. 17, 2017, 4:09 a.m.