hetErrorsIV: Fitting Linear Models with Endogenous Regressors using...


View source: R/f_heterrorsIV.R


This function estimates the model parameters and associated standard errors for a linear regression model with one endogenous regressor. Identification is achieved through heteroskedastic covariance restrictions within the triangular system, as proposed in Lewbel (2012).





formula: A symbolic description of the model to be fitted. See the "Details" section for the exact notation.

data: A data.frame containing the data of all parts specified in the formula parameter.

verbose: Show details about the running of the function.



The method proposed in Lewbel (2012) identifies structural parameters in regression models with endogenous regressors by means of variables that are uncorrelated with the product of heteroskedastic errors. The instruments are constructed as simple functions of the model's data. The method can be applied when no external instruments are available, or to supplement external instruments and improve the efficiency of the IV estimator. Consider the model:

y_t = β0 + β1 P_t + β2 X_t + ε_t

where t = 1,...,T indexes either time or cross-sectional units. The endogeneity problem arises from the correlation of P_t and ε_t. The endogenous regressor is modeled as P_t = γ Z_t + ν_t, where Z_t is a subset of the variables in X_t.

The errors, ε and ν, may be correlated with each other. Structural parameters are identified by an ordinary two-stage least squares regression of Y on X and P, using X and [Z-E(Z)]ν as instruments. A vital assumption for identification is that cov(Z, ν²) ≠ 0. The strength of the instrument is proportional to the covariance of (Z-Z̅)ν with ν, which corresponds to the degree of heteroskedasticity of ν with respect to Z (Lewbel 2012).
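The construction described above can be sketched by hand in base R. This is a minimal illustration on simulated data, not the internals of hetErrorsIV: the data-generating process, the variable names (X, U, nu, eps, beta_P) and the manual two-stage procedure are all assumptions made for the example. Here Z is taken to be the single exogenous regressor X.

```r
# Simulated data (hypothetical): U is a common shock that makes P endogenous,
# and the first-stage error nu is heteroskedastic in X.
set.seed(1)
n      <- 5000
beta_P <- -1                      # true coefficient on P (assumed for the demo)
X      <- runif(n, 0, 2)
U      <- rnorm(n)
nu     <- U + X * rnorm(n)        # Var(nu | X) = 1 + X^2, so cov(X, nu^2) != 0
eps    <- U + rnorm(n)            # correlated with nu through U
P      <- 1 + X + nu
y      <- 1 + beta_P * P + X + eps

# Naive OLS is biased because P is correlated with eps.
ols_est <- unname(coef(lm(y ~ X + P))["P"])

# Step 1: residuals of the endogenous regressor on the exogenous regressor.
nu_hat <- resid(lm(P ~ X))

# Step 2: internal instrument [Z - mean(Z)] * nu_hat, with Z = X.
iv <- (X - mean(X)) * nu_hat

# Step 3: two-stage least squares by hand, using X and iv as instruments.
P_hat  <- fitted(lm(P ~ X + iv))
iv_est <- unname(coef(lm(y ~ X + P_hat))["P_hat"])

cat("true:", beta_P, " OLS:", ols_est, " heteroskedasticity-based IV:", iv_est, "\n")
```

The point estimate from the manual second stage is the 2SLS estimate, but the standard errors reported by that second lm call are not valid 2SLS standard errors; a dedicated IV routine should be used for inference.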

The assumption that the covariance between Z and the squared error is different from zero can be empirically tested (this is checked in the background when calling the function). If it is zero or close to zero, the instrument is weak, producing imprecise estimates, with large standard errors.
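One way to examine this assumption informally is a Breusch-Pagan-style auxiliary regression of the squared first-stage residuals on Z. The sketch below uses simulated data and base R only; it is an illustrative check under assumed variable names, and is not necessarily the test that hetErrorsIV runs in the background.

```r
# Simulated first stage (hypothetical): the error nu is heteroskedastic in X.
set.seed(42)
n  <- 2000
X  <- runif(n, 0, 2)
nu <- rnorm(n) + X * rnorm(n)
P  <- 1 + X + nu

# First-stage residuals.
nu_hat <- resid(lm(P ~ X))

# Sample analogue of cov(Z, nu^2), with Z = X.
sample_cov <- cov(X, nu_hat^2)

# Studentized Breusch-Pagan statistic: n * R^2 from regressing the squared
# residuals on Z, compared against a chi-squared distribution with 1 df.
aux     <- lm(I(nu_hat^2) ~ X)
lm_stat <- n * summary(aux)$r.squared
p_value <- pchisq(lm_stat, df = 1, lower.tail = FALSE)

cat("cov(Z, nu^2):", sample_cov, " LM statistic:", lm_stat, " p-value:", p_value, "\n")
```

A small p-value indicates that the squared residuals vary with Z, which is what the identification strategy requires; a large p-value suggests the constructed instrument is likely to be weak.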

Formula parameter

The formula argument follows a four-part notation, of which the fourth part is optional:

A two-sided formula describing the model (e.g. y ~ X1 + X2 + P), a single endogenous regressor (e.g. P), and the exogenous variables from which the internal instrumental variables should be built (e.g. IIV(X1) + IIV(X2)), each part separated by a single vertical bar (|).

The instrumental variables to be built are specified through one or more calls to the function IIV. IIV uses the following arguments:


...: The exogenous regressors to build the internal instruments from. If more than one is given, separate instruments are built for each.

Note that arguments to IIV must be supplied as unquoted symbols, not as character strings.

Optionally, additional external instrumental variables can be included in the instrumental variable regression. These external instruments must already be present in the data and are provided as the fourth right-hand side part of the formula, again separated by a vertical bar.

See the example section for illustrations on how to specify the formula parameter.


Returns an object of classes rendo.ivreg and ivreg. It extends the object returned by the function ivreg of package AER and slightly modifies it by adapting the call and formula components. The summary function prints additional diagnostic information, as described in the documentation for summary.ivreg.

All generic accessor functions for ivreg such as anova, hatvalues, or vcov are available.
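As a brief illustration, the fitted object can be inspected with the usual accessors. This sketch assumes the REndo package and its example dataset dataHetIV are installed, and it skips the fit otherwise; the model call is the one used in the examples of this page.

```r
# Assumes REndo (which provides hetErrorsIV and dataHetIV) is installed.
het <- NULL
if (requireNamespace("REndo", quietly = TRUE)) {
  library(REndo)
  data("dataHetIV")

  # Internal instruments built from X1 and X2; P is the endogenous regressor.
  het <- hetErrorsIV(y ~ X1 + X2 + P | P | IIV(X1, X2), data = dataHetIV)

  print(coef(het))     # point estimates
  print(vcov(het))     # estimated covariance matrix of the coefficients
  print(summary(het))  # coefficient table plus the ivreg diagnostics
}
```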


Lewbel, A. (2012). Using Heteroskedasticity to Identify and Estimate Mismeasured and Endogenous Regressor Models, Journal of Business & Economic Statistics, 30(1), 67-80.

Angrist, J. and Pischke, J.S. (2009). Mostly Harmless Econometrics: An Empiricist's Companion, Princeton University Press.




# load the package and the example data used below
library(REndo)
data("dataHetIV")

# P is the endogenous regressor in all examples.
# X1 generates a weak instrument, but this is ignored for the examples.

# 2 IVs, one from X1, one from X2
het <- hetErrorsIV(y ~ X1 + X2 + P | P | IIV(X1) + IIV(X2), data = dataHetIV)
# same as above
het <- hetErrorsIV(y ~ X1 + X2 + P | P | IIV(X1, X2), data = dataHetIV)

# use X2 as an external IV
het <- hetErrorsIV(y ~ X1 + P | P | IIV(X1) | X2, data = dataHetIV)


REndo documentation built on Sept. 5, 2021, 5:37 p.m.