library("knitr")
knitr::opts_knit$set(self.contained = FALSE)
knitr::opts_chunk$set(tidy = TRUE, collapse=TRUE, comment = "#>",
                      tidy.opts=list(blank=FALSE, width.cutoff=55))

Summary

The package ManyIV implements estimators and confidence intervals in a linear instrumental variables model considered in @kolesar18md and @kcfgi14. In this vignette, we demonstrate the implementation of these estimators and confidence intervals using a subset of the dataset used in @ak91, which is included in the package as the data frame ak80. This data frame corresponds to a sample of males born in the US in 1930--39 from the 5% sample of the 1980 Census. See help("ak80", package = "ManyIV") for details.

Estimation and Inference

The package implements the following estimators via the command IVreg:

  1. Two-stage least-squares (TSLS) estimator.
  2. Limited information maximum likelihood (LIML) estimator due to @ar49.
  3. A modification of the bias-corrected two-stage least squares (MBTSLS) estimator (@kcfgi14) that slightly modifies the original @nagar59 estimator so that it is consistent under many exogenous regressors as well as many instruments, provided the reduced-form errors are homoskedastic.
  4. Efficient minimum distance (EMD) estimator (@kolesar18md) that is more efficient than LIML under many-instrument asymptotics unless the reduced-form errors are Gaussian.

IVreg computes the following types of standard errors:

  1. Conventional homoskedastic standard errors, as computed by Stata's ivregress and ivreg2. These standard errors are not robust to many instruments (option inference="standard").
  2. Conventional heteroskedastic standard errors, as computed by Stata's ivregress and ivreg2. These standard errors are not robust to many instruments (option inference="standard").
  3. Standard errors that are valid under heterogeneous treatment effects as well as heteroskedasticity (labeled HTE robust). These standard errors are not robust to many instruments (option inference="standard"). They are only computed for TSLS and MBTSLS, since LIML is not robust to heterogeneous treatment effects (see @kolesar13late).
  4. Standard errors based on the information matrix of the limited information likelihood of @ar49 (for LIML only). These are not robust to many instruments or heteroskedasticity (option inference="lil").
  5. Standard errors based on the Hessian of the random-effects likelihood of @ci04. These standard errors are for LIML only (since the random-effects ML estimator coincides with LIML), and are robust to many instruments provided the reduced-form errors are Gaussian and homoskedastic (option inference="re").
  6. Standard errors based on the Hessian of the invariant likelihood (see @kolesar18md). These standard errors are for LIML only (since the invariant ML estimator coincides with LIML), and are robust to many instruments provided the reduced-form errors are Gaussian and homoskedastic. Computing them involves some numerical optimization (option inference="il").
  7. Many-instrument robust standard errors based on the minimum distance objective function (see @kolesar18md) (option inference="md"). Since the TSLS estimator is not consistent under many-instrument asymptotics, its standard errors are omitted. Unlike the re and il standard errors, the md standard errors for MBTSLS, LIML and EMD do not require the reduced-form errors to be Gaussian, although the homoskedasticity assumption is still needed. In addition, the command computes standard errors for MBTSLS based on the unrestricted minimum distance objective function (umd), which allows for treatment effect heterogeneity (provided the reduced-form errors remain homoskedastic), and for failures of the exclusion restriction as considered in @kcfgi14.

Several of these options may be specified at once:

library("ManyIV")
## Specification as in Table V, columns (1) and (2) in Angrist and Krueger
IVreg(lwage~education+as.factor(yob)|as.factor(qob)*as.factor(yob),
            data=ak80, inference=c("standard", "re", "il", "lil"))

With large data, the md standard errors may take a while to run, as they require estimation of third and fourth moments of the reduced-form errors. In particular, letting $M$ denote the annihilator matrix associated with the matrix $(W,Z)$ of exogenous regressors and instruments, the formulas for these moments require the computation of $\tilde{m}_{3}=\sum_{i,j}M_{i,j}^3$ and $\tilde{m}_{4}=\sum_{i,j}M_{i,j}^4$. If the option approx=TRUE is selected (the default), the function IVreg speeds up the calculations by using the approximations $\tilde{m}_{3}\approx n-3(k+\ell)$ and $\tilde{m}_{4}\approx n-4(k+\ell)$, where $n$ is the sample size, $k$ is the number of instruments, and $\ell$ is the number of exogenous regressors. This approximation is accurate up to terms of order $O(((k+\ell)/n)^2)$, and should have a negligible effect on the estimates unless the ratio $(k+\ell)/n$ is quite large. With this approximation, the calculations are quite fast even for large sample sizes:

r1 <- IVreg(lwage~education+as.factor(yob)|as.factor(qob)*as.factor(yob),
            data=ak80, inference="md", approx=TRUE)
print(r1, digits=4)

We can see that the LIML and EMD estimates are identical up to 4 significant digits.
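The approximation to $\tilde{m}_{3}$ and $\tilde{m}_{4}$ described above can be checked directly on a small simulated design, where forming the full $n\times n$ annihilator matrix is still cheap. The following is a minimal sketch in base R; the simulated matrices W and Z and all sizes are illustrative assumptions, and the code does not use or reproduce the package's internal implementation.

```r
# Simulated design; all names and sizes are illustrative assumptions.
set.seed(42)
n <- 400   # sample size
k <- 6     # number of instruments
l <- 3     # number of exogenous regressors (ell in the text)
W <- cbind(1, matrix(rnorm(n * (l - 1)), n, l - 1))  # includes an intercept
Z <- matrix(rnorm(n * k), n, k)

# Annihilator matrix M = I - A (A'A)^{-1} A' for the combined matrix A = (W, Z)
A <- cbind(W, Z)
M <- diag(n) - A %*% solve(crossprod(A), t(A))

# Exact moments: sums of elementwise third and fourth powers of M
m3_exact <- sum(M^3)
m4_exact <- sum(M^4)

# Approximations described in the text
m3_approx <- n - 3 * (k + l)
m4_approx <- n - 4 * (k + l)

rbind(exact  = c(m3 = m3_exact,  m4 = m4_exact),
      approx = c(m3 = m3_approx, m4 = m4_approx))
```

The exact computation needs $O(n^2)$ memory for $M$, which is what makes the approximation attractive for a sample as large as ak80.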

Specification testing

The package also implements two tests for overidentifying restrictions. The first test is the classic @sargan58 test. The second test is a modification of the @cd93 test developed in @kolesar18md to make the test robust to many instruments and many exogenous regressors (provided the reduced-form errors are homoskedastic). The command IVoverid takes the results of the IV regression as an argument.

IVoverid(r1)

Implementation details

\newcommand{\Hm}[1]{{H}_{#1}}

Let $$y_{i}=x_{i}\beta+w_{i}'\delta+\epsilon_{i},$$ where $y_{i}\in\mathbb{R}$ is the outcome variable, $x_{i}\in\mathbb{R}$ is a single endogenous regressor, $w_{i}\in\mathbb{R}^{\ell}$ is a vector of exogenous regressors (covariates), and $\epsilon_{i}$ is a structural error. The parameter of interest is $\beta$. In addition, $z_{i}\in\mathbb{R}^{k}$ is a vector of instruments.

We observe an i.i.d. sample $\{y_{i},x_{i},w_{i},z_{i}\}_{i=1}^{n}$. Let $Y$, $Z$, and $W$ denote matrices with rows $(y_{i},x_{i})$, $z_{i}'$ and $w_{i}'$. For any full-rank $n\times m$ matrix $A$, let $\Hm{A}=A(A'A)^{-1}A'$ denote the associated $n\times n$ projection matrix (also known as the hat matrix). Let $I_{m}$ denote the $m\times m$ identity matrix, and let $Z_{\perp}=(I_{n}-\Hm{W})Z$ denote the residual from the sample projection of $Z$ onto $W$.

Define matrices $S$ and $T$ as in @kolesar18md: \begin{align} T&=Y'\Hm{Z_{\perp}}Y/n,& S&=Y'(I_{n}-\Hm{Z,W})Y/(n-k-\ell). \end{align} Also define $m_{\min}$ and $m_{\max}$ to be the minimum and maximum eigenvalues of the matrix $S^{-1}T$. The estimators TSLS, OLS, MBTSLS, and LIML are all $k$-class estimators. A $k$-class estimator with parameter $\kappa$ is given by \begin{equation} \hat{\beta}(\kappa)=\frac{T_{12}-m(\kappa) S_{12}}{T_{22}-m(\kappa)S_{22}}, \end{equation} where $m(\kappa)=(\kappa-1)(1-k/n-\ell/n)$. For the estimators above, \begin{align} m_{OLS}&=-(1-k/n-\ell/n),& m_{TSLS}&=0,& m_{MBTSLS}&=k/n,& m_{LIML}&=m_{\min}. \end{align} The EMD estimator is not a $k$-class estimator.
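To make the $k$-class formula concrete, the sketch below builds $T$ and $S$ from simulated data and evaluates $\hat{\beta}(\kappa)$ at the four values of $m$ listed above. It uses base R only; the data-generating process, the helper names hat_mat and kclass, and the name Tm (chosen because T is an alias for TRUE in R) are assumptions made for illustration and are not part of the ManyIV API.

```r
# Simulated data; the design and all names are illustrative assumptions.
set.seed(1)
n <- 500; k <- 5; l <- 2
W <- cbind(1, rnorm(n))               # exogenous regressors (with intercept)
Z <- matrix(rnorm(n * k), n, k)       # instruments
v <- rnorm(n)                         # first-stage error
x <- drop(Z %*% rep(0.3, k)) + v      # endogenous regressor
y <- x + drop(W %*% c(1, 0.5)) + 0.5 * v + rnorm(n)  # true beta = 1

# Projection (hat) matrix H_A = A (A'A)^{-1} A'
hat_mat <- function(A) A %*% solve(crossprod(A), t(A))

Y  <- cbind(y, x)                     # rows (y_i, x_i)
Zp <- Z - hat_mat(W) %*% Z            # Z_perp: Z with W partialled out
Tm <- crossprod(Y, hat_mat(Zp) %*% Y) / n
S  <- crossprod(Y, Y - hat_mat(cbind(Z, W)) %*% Y) / (n - k - l)

# k-class estimator written as a function of m = m(kappa)
kclass <- function(m) (Tm[1, 2] - m * S[1, 2]) / (Tm[2, 2] - m * S[2, 2])

# Smallest eigenvalue of S^{-1} T (real in theory; Re() guards rounding)
m_min <- min(Re(eigen(solve(S, Tm))$values))

beta_hat <- c(OLS    = kclass(-(1 - k / n - l / n)),
              TSLS   = kclass(0),
              MBTSLS = kclass(k / n),
              LIML   = kclass(m_min))
beta_hat
```

Because $\Hm{Z,W}=\Hm{W}+\Hm{Z_{\perp}}$, the OLS value of $m$ reproduces the usual regression of $y$ on $x$ and $W$, and $m=0$ reproduces two-stage least squares, which gives a convenient sanity check on the construction of $T$ and $S$.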

The lil, re, il, and md standard errors are based on the formulas described in @kolesar18md. In the remainder of this vignette, we briefly describe the formulas for conventional standard errors.

Other standard errors

Stata 13's ivregress and ivreg2 use standard errors for $k$-class estimators given by \begin{equation} \widehat{var}_{\text{Stata}}(\hat{\beta}(\kappa)) = \frac{1}{n}\frac{\hat{\sigma}(\kappa)^{2}}{T_{22}-m(\kappa) S_{22}}, \end{equation} where $\hat{\sigma}(\kappa)^{2}=\hat{\epsilon}(\kappa)'\hat{\epsilon}(\kappa)/n$, with $\hat{\epsilon}(\kappa)=y-x\hat{\beta}(\kappa)-W\hat{\delta}(\kappa)$, and $\hat{\delta}(\kappa)=(W' W)^{-1} W'(y-x\hat{\beta}(\kappa))$. This includes LIML, for which $\kappa$ is random (Stata disregards this randomness). For OLS, we use the Stata 13 variance estimator $\hat{\sigma}^{2}=\hat{\epsilon}_{OLS}'\hat{\epsilon}_{OLS}/(n-\ell-1)$.

To define the robust standard error estimators, let $\hat{R}_{i}=Z_{\perp,i}(Z_{\perp}'Z_{\perp})^{-1}Z_{\perp}'x$. Then, for a $k$-class estimator (including LIML), \begin{equation} \widehat{var}_{\text{Stata, robust}}(\hat{\beta}(\kappa)) =\frac{\sum_{i=1}^{n}\hat{\epsilon}_{i}(\kappa)^2\hat{R}_{i}^{2}}{n^{2}(T_{22}-m(\kappa)S_{22})^{2}}. \end{equation} Note that $\widehat{var}_{\text{Stata, robust}}(\hat{\beta}(\kappa))$ and $\widehat{var}_{\text{Stata}}(\hat{\beta}(\kappa))$ do not necessarily converge to the same quantity even under homoskedasticity. For OLS, we use $(n/(n-\ell-1))^{1/2}x_{\perp,i}$ in place of $\hat{R}_{i}$, where $x_{\perp}=(I_{n}-\Hm{W})x$.
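The two conventional variance formulas can be traced step by step on simulated data. The sketch below does this for TSLS ($m=0$) in base R; the simulated design and the helper name hat_mat are assumptions made for illustration, $\hat{R}_{i}$ is taken to be the $i$th element of $\Hm{Z_{\perp}}x$, and the code is not the package's own implementation.

```r
# Simulated data; the design and all names are illustrative assumptions.
set.seed(3)
n <- 500; k <- 5; l <- 2
W <- cbind(1, rnorm(n))
Z <- matrix(rnorm(n * k), n, k)
v <- rnorm(n)
x <- drop(Z %*% rep(0.3, k)) + v
y <- x + drop(W %*% c(1, 0.5)) + 0.5 * v + rnorm(n)

hat_mat <- function(A) A %*% solve(crossprod(A), t(A))
Y  <- cbind(y, x)
Zp <- Z - hat_mat(W) %*% Z                       # Z_perp
Tm <- crossprod(Y, hat_mat(Zp) %*% Y) / n        # T in the text
S  <- crossprod(Y, Y - hat_mat(cbind(Z, W)) %*% Y) / (n - k - l)

m     <- 0                                       # TSLS
denom <- Tm[2, 2] - m * S[2, 2]
beta  <- (Tm[1, 2] - m * S[1, 2]) / denom

# Residuals: delta is the coefficient on W given beta
delta <- solve(crossprod(W), crossprod(W, y - x * beta))
eps   <- drop(y - x * beta - W %*% delta)

# Homoskedastic variance: sigma^2 / (n * (T22 - m * S22))
sigma2  <- sum(eps^2) / n
se_homo <- sqrt(sigma2 / (n * denom))

# Robust variance: sum(eps_i^2 * R_i^2) / (n^2 * (T22 - m * S22)^2)
R         <- drop(hat_mat(Zp) %*% x)
se_robust <- sqrt(sum(eps^2 * R^2) / (n^2 * denom^2))

c(beta = beta, se_homo = se_homo, se_robust = se_robust)
```

For other $k$-class estimators, only the value of m changes; the residuals and both variance formulas are then evaluated at the corresponding $\hat{\beta}(\kappa)$.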

One could alternatively use $T_{22}$ in the denominator, or estimate $var(\epsilon_{i})$ using $\hat{\sigma}^{2}(\beta)=(1,-\beta)S(1,-\beta)'$. Such variance estimators were used in @kcfgi14. The alternative denominator makes a big difference, but how we estimate $\sigma^{2}$ matters less.

Other outputs

The first-stage $F$-statistic reported by IVreg is given by \begin{equation} F= \frac{n}{k}\frac{T_{22}}{S_{22}}. \end{equation}

The Sargan test statistic is given by $n m_{\min}/(1-k/n-\ell/n+m_{\min})$, and its $p$-value is based on a $\chi^{2}_{k-1}$ approximation. The Sargan test statistic is based on LIML, unlike in Stata 13's estat overid, where it depends on what estimator was used to compute $\beta$. The adjusted Cragg-Donald test is described in @kolesar18md [Section 6].
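Both statistics are simple functions of $T$, $S$ and $m_{\min}$, so they can be reproduced by hand. The sketch below computes the first-stage $F$-statistic and the Sargan statistic with its $\chi^{2}_{k-1}$ $p$-value on simulated data, with $k$ instruments and $\ell$ exogenous regressors. It is written in base R under an assumed data-generating process, with hat_mat a hypothetical helper; it illustrates the formulas rather than the internals of IVreg or IVoverid.

```r
# Simulated data; the design and all names are illustrative assumptions.
set.seed(7)
n <- 500; k <- 5; l <- 2
W <- cbind(1, rnorm(n))
Z <- matrix(rnorm(n * k), n, k)
v <- rnorm(n)
x <- drop(Z %*% rep(0.3, k)) + v
y <- x + drop(W %*% c(1, 0.5)) + 0.5 * v + rnorm(n)

hat_mat <- function(A) A %*% solve(crossprod(A), t(A))
Y  <- cbind(y, x)
Zp <- Z - hat_mat(W) %*% Z                       # Z_perp
Tm <- crossprod(Y, hat_mat(Zp) %*% Y) / n        # T in the text
S  <- crossprod(Y, Y - hat_mat(cbind(Z, W)) %*% Y) / (n - k - l)

# First-stage F-statistic: (n / k) * T22 / S22
F_stat <- (n / k) * Tm[2, 2] / S[2, 2]

# Sargan statistic based on the smallest eigenvalue of S^{-1} T
m_min   <- min(Re(eigen(solve(S, Tm))$values))
sargan  <- n * m_min / (1 - k / n - l / n + m_min)
p_value <- pchisq(sargan, df = k - 1, lower.tail = FALSE)

c(F = F_stat, Sargan = sargan, p.value = p_value)
```

Since the instruments in this simulation are valid by construction, large values of the Sargan statistic should be rare across repeated draws.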

References



kolesarm/ManyIV documentation built on March 13, 2021, 6:31 a.m.