npiv: Nonparametric Instrumental Variable Estimation and Inference

npiv	R Documentation

Nonparametric Instrumental Variable Estimation and Inference

Description

npiv performs nonparametric estimation of a structural function h0 and its derivatives using a B-spline sieve. It also constructs uniform confidence bands for h0 and its derivatives.

Sieve dimensions are determined in a data-dependent way if not provided by the user, via the methods described in Chen, Christensen, and Kankanala (2024). This data-driven choice of sieve dimension ensures estimators of h0 and its derivatives converge at the optimal sup-norm rate. The resulting uniform confidence bands for h0 and its derivatives also converge at the minimax rate up to log factors; see Chen, Christensen, and Kankanala (2024).

If sieve dimensions are provided by the user, npiv implements the bootstrap-based procedure of Chen and Christensen (2018) to construct uniform confidence bands based on undersmoothing for h0 and its derivatives.

The methods in npiv apply to estimation and inference on a nonparametric regression function as a special case.

Usage

npiv(...)

## S3 method for class 'formula'
npiv(formula,
     data=NULL,
     newdata=NULL,
     subset=NULL,
     na.action="na.omit",
     call,
     ...)

## Default S3 method:
npiv(Y,
     X,
     W,
     X.eval=NULL,
     X.grid=NULL,
     alpha=0.05,
     basis=c("tensor","additive","glp"),
     boot.num=99,
     check.is.fullrank=FALSE,
     deriv.index=1,
     deriv.order=1,
     grid.num=50,
     J.x.degree=3,
     J.x.segments=NULL,
     K.w.degree=4,
     K.w.segments=NULL,
     K.w.smooth=2,
     knots=c("uniform","quantiles"),
     progress=TRUE,
     ucb.h=TRUE,
     ucb.deriv=TRUE,
     W.max=NULL,
     W.min=NULL,
     X.min=NULL,
     X.max=NULL,
     ...)

Arguments

`formula`	a symbolic description of the model to be fit.
`data`	an optional data frame containing the variables in the model.
`newdata`	an optional data frame in which to look for variables with which to predict (i.e., predictors in `X` passed in `X.eval` which must contain identically named variables).
`subset`	an optional vector specifying a subset of observations to be used in the fitting process (see additional details about how this argument interacts with data-dependent bases in the ‘Details’ section of the `model.frame` documentation).
`na.action`	a function which indicates what should happen when the data contain NAs. The default is set by the `na.action` setting of `options`, and is `na.fail` if that is unset. The ‘factory-fresh’ default is `na.omit`. Another possible value is `NULL`, no action. Value `na.exclude` can be useful.
`call`	the original function call (this is passed internally by `npiv`). It is not recommended that the user set this.
`Y`	dependent variable vector.
`X`	matrix of endogenous regressors.
`W`	matrix of instrumental variables. Set `W=X` for nonparametric regression.
`X.eval`	optional matrix of evaluation data for the endogenous regressors.
`X.grid`	optional vector of grid points for `X` when determining model complexity. Default (`X.grid=NULL`) uses 50 equally spaced points (can be changed in `grid.num`) over the support of each `X` variable.
`alpha`	nominal size of the uniform confidence bands. Default is `0.05` for 95% uniform confidence bands.
`basis`	basis type (used if `X` or `W` is multivariate), a character string. Options are: `tensor`: tensor product basis (default); `additive`: additive basis for additively separable models; `glp`: generalized B-spline polynomial basis.
`boot.num`	number of bootstrap replications.
`check.is.fullrank`	check that `X` and `W` have full rank. Default is `FALSE`.
`deriv.index`	integer indicating the column of `X` for which to compute the derivative.
`deriv.order`	integer indicating the order of derivative to be computed.
`grid.num`	number of grid points for each `X` variable if `X.grid` is not provided.
`J.x.degree`	B-spline degree (integer or vector of integers of length `ncol(X)`) for approximating the structural function. Default is `degree=3` (cubic B-spline).
`J.x.segments`	B-spline number of segments (integer or vector of integers of length `ncol(X)`) for approximating the structural function. Default is `NULL`. If either `J.x.segments=NULL` or `K.w.segments=NULL`, these are both chosen automatically using `npiv_choose_J`.
`K.w.degree`	B-spline degree (integer or vector of integers of length `ncol(W)`) for estimating the nonparametric first stage. Default is `degree=4` (quartic B-spline).
`K.w.segments`	B-spline number of segments (integer or vector of integers of length `ncol(W)`) for estimating the nonparametric first stage. Default is `NULL`. If either `J.x.segments=NULL` or `K.w.segments=NULL`, these are both chosen automatically using `npiv_choose_J`.
`K.w.smooth`	non-negative integer. The basis for the nonparametric first stage uses `2^{K.w.smooth}` times as many B-spline segments for each instrument as the basis approximating the structural function. Default is `2`. Setting `K.w.smooth=0` uses the same number of segments for `X` and `W`.
`knots`	knots type, a character string. Options are: `uniform`: interior knots are placed at equally spaced intervals over the support of the variable (default); `quantiles`: interior knots are placed at equally spaced quantiles, so that an equal number of observations lies in each segment.
`progress`	whether to display progress bar or not. Default is `TRUE`.
`ucb.h`	whether to compute a uniform confidence band for the structural function. Default is `TRUE`.
`ucb.deriv`	whether to compute a uniform confidence band for the derivative of the structural function. Default is `TRUE`.
`W.min`	lower bound on the support of each `W` variable. Default is `min(W)`.
`W.max`	upper bound on the support of each `W` variable. Default is `max(W)`.
`X.min`	lower bound on the support of each `X` variable. Default is `min(X)`.
`X.max`	upper bound on the support of each `X` variable. Default is `max(X)`.
`...`	optional arguments
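The relationship between `degree`, `segments`, and `knots` can be illustrated with base R's `splines::bs`. This is only a sketch under stated assumptions: `interior_knots` is a hypothetical helper written for this illustration (it is not part of npiv), and the rule "dimension = degree + segments" is inferred from the comment in the Examples section (2 cubic segments give dimension 5, 5 quartic segments give dimension 9) rather than taken from npiv's internals.

```r
library(splines)

# Hypothetical helper (not part of npiv): interior knot locations for a
# given number of segments, placed uniformly or at sample quantiles.
interior_knots <- function(x, segments, type = c("uniform", "quantiles")) {
  type <- match.arg(type)
  if (segments < 2) return(numeric(0))
  probs <- seq_len(segments - 1) / segments
  if (type == "uniform") {
    min(x) + probs * (max(x) - min(x))
  } else {
    unname(quantile(x, probs))
  }
}

set.seed(42)
x <- rexp(500)  # skewed data, so the two knot types differ visibly

k_unif <- interior_knots(x, segments = 2, type = "uniform")    # midpoint of the range
k_quan <- interior_knots(x, segments = 2, type = "quantiles")  # sample median

# Cubic B-spline basis with 2 segments (1 interior knot)
B <- bs(x, degree = 3, knots = k_unif, intercept = TRUE,
        Boundary.knots = range(x))
ncol(B)  # degree + segments columns
```

With skewed data the uniform knot sits far to the right of the quantile knot, which is why `knots="quantiles"` can be a sensible choice when observations are concentrated in part of the support.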

Details

npiv estimates and constructs uniform confidence bands for a nonparametric structural function h_0 and its derivatives in the model

    Y = h_0(X) + U, \quad E[U \mid W] = 0 \quad \text{(almost surely)}.

Estimation is performed using nonparametric two-stage least-squares with a B-spline sieve. The key tuning parameter is the dimension J of the sieve used to approximate h_0. The dimension is tuned by changing the number and placement of interior knots in the B-spline basis (equivalently, the number of segments of the basis). Sieve dimensions can be user-provided or data-determined using the procedure of Chen, Christensen, and Kankanala (2024).
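The two-stage least-squares step described above can be sketched in base R. This is a minimal illustration with fixed sieve dimensions and simulated data, not npiv's implementation: the data-generating process, the function `h0`, and the names `Psi`, `B`, `Psi_hat`, and `beta` are all assumptions made for this example, and no confidence bands are computed.

```r
library(splines)

set.seed(1)
n <- 2000

# Simulated design (made up for illustration): z drives the instrument,
# v creates endogeneity because it enters both x and the error u.
z <- rnorm(n)
v <- rnorm(n)
w <- pnorm(z)                    # instrument, uniform on (0, 1)
x <- pnorm(0.8 * z + 0.6 * v)    # endogenous regressor, uniform on (0, 1)
u <- 0.5 * v + 0.1 * rnorm(n)    # E[u | w] = 0, but u is correlated with x
h0 <- function(x) sin(pi * x)    # structural function to recover
y <- h0(x) + u

# Sieve bases: cubic with 2 segments for x, quartic with 5 segments for w
Psi <- bs(x, degree = 3, knots = 0.5, intercept = TRUE,
          Boundary.knots = c(0, 1))
B   <- bs(w, degree = 4, knots = (1:4) / 5, intercept = TRUE,
          Boundary.knots = c(0, 1))

# First stage: project the x-basis onto the span of the w-basis
Psi_hat <- B %*% solve(crossprod(B), crossprod(B, Psi))

# Second stage: beta = (Psi' P_B Psi)^{-1} Psi' P_B y
beta  <- solve(crossprod(Psi_hat, Psi), crossprod(Psi_hat, y))
h_hat <- drop(Psi %*% beta)      # estimate of h0 at the sample points
```

Plotting `h_hat` against `x` alongside `h0(x)` gives a quick visual check; an ordinary least-squares fit of `y` on `Psi` would be biased in this design because `u` is correlated with `x`.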

Typical usage mirrors ivreg (see the Arguments section for the list of options and the Examples section at the bottom of this document):

    foo <- npiv(y~x|w)
    foo <- npiv(y~x1+x2|w1+w2)
    foo <- npiv(Y=y,X=x,W=w)

npiv can be used in two ways:

1. Data-driven sieve dimension is invoked if either K.w.segments or J.x.segments is unspecified or NULL (the default). Sieve dimensions are chosen automatically using npiv_choose_J. Uniform confidence bands for h_0 and its derivatives are constructed using the data-driven method of Chen, Christensen, and Kankanala (2024).

2. The user may specify the sieve dimensions of both bases by specifying values for K.w.segments and J.x.segments. Uniform confidence bands for h_0 and its derivatives are constructed using the method of Chen and Christensen (2018).

npiv can also be used for estimation and inference on a nonparametric regression function by setting W=X.
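To see why W=X corresponds to nonparametric regression, the following base R sketch checks that the two-stage least-squares formula collapses to ordinary least squares when the instrument basis equals the regressor basis. It assumes the same B-spline basis is used in both stages; whether npiv makes that choice internally when W=X is not stated here, and the simulated data and variable names are made up for the illustration.

```r
library(splines)

set.seed(123)
n <- 500
x <- runif(n)
y <- sin(2 * pi * x) + rnorm(n, sd = 0.25)   # exogenous regression model

# Cubic B-spline basis with 4 segments for x
Psi <- bs(x, degree = 3, knots = c(0.25, 0.5, 0.75), intercept = TRUE,
          Boundary.knots = c(0, 1))
B <- Psi   # W = X with the same basis in both stages (assumption of this sketch)

# Two-stage least squares
Psi_hat   <- B %*% solve(crossprod(B), crossprod(B, Psi))
beta_tsls <- solve(crossprod(Psi_hat, Psi), crossprod(Psi_hat, y))

# Ordinary least squares on the same basis
beta_ols <- solve(crossprod(Psi), crossprod(Psi, y))

max(abs(beta_tsls - beta_ols))   # zero up to floating-point error
```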

Value

npiv returns an npiv object. The generic function fitted extracts the estimated values for the sample (or evaluation data, if provided), while the generic function residuals extracts the sample residuals. The generic function summary provides a simple model summary. The generic function plot plots the estimated function and derivative, together with uniform confidence bands.

The function npiv returns a list with the following components:

`h`	estimated structural function evaluated at the sample data (or evaluation data, if provided).
`residuals`	residuals for the sample data.
`deriv`	estimated derivative of the structural function evaluated at the sample data (or evaluation data, if provided).
`asy.se`	pre-asymptotic standard errors for the estimator of the structural function evaluated at the sample data (or evaluation data, if provided).
`deriv.asy.se`	pre-asymptotic standard errors for the estimator of the derivative of the structural function evaluated at the sample data (or evaluation data, if provided).
`deriv.index`	index for the estimated derivative.
`deriv.order`	order of the estimated derivative.
`K.w.degree`	value of `K.w.degree` used.
`K.w.segments`	value of `K.w.segments` used (will be data-determined if not provided).
`J.x.degree`	value of `J.x.degree` used.
`J.x.segments`	value of `J.x.segments` used (will be data-determined if not provided).
`beta`	vector of estimated spline coefficients.
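The `h` and `asy.se` components can be combined into pointwise confidence intervals for any level, generalizing the `1.96` multiplier used in the Examples section. The sketch below is illustrative only: `pointwise_band` is a hypothetical helper (not part of npiv), and `fit` is a stand-in list with made-up numbers in place of a real npiv object. Pointwise intervals are narrower than the uniform confidence bands and do not provide simultaneous coverage over the whole function.

```r
# Hypothetical helper (not part of npiv): pointwise normal-approximation
# intervals from an estimate and its standard errors.
pointwise_band <- function(est, se, alpha = 0.05) {
  z <- qnorm(1 - alpha / 2)
  list(lower = est - z * se, upper = est + z * se)
}

# Stand-in for an npiv fit (made-up numbers, for illustration only);
# a real fit supplies the documented components `h` and `asy.se`.
fit <- list(h = c(0.30, 0.25, 0.20), asy.se = c(0.02, 0.01, 0.03))

band <- pointwise_band(fit$h, fit$asy.se, alpha = 0.05)
```

The same helper applies to the derivative by passing the `deriv` and `deriv.asy.se` components instead.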

Author(s)

Jeffrey S. Racine <racinej@mcmaster.ca>, Timothy Christensen <timothy.christensen@yale.edu>

References

Chen, X. and T. Christensen (2018). “Optimal Sup-norm Rates and Uniform Inference on Nonlinear Functionals of Nonparametric IV Regression.” Quantitative Economics, 9(1), 39-85. doi:10.3982/QE722

Chen, X., T. Christensen and S. Kankanala (2024). “Adaptive Estimation and Uniform Confidence Bands for Nonparametric Structural Functions and Elasticities.” Review of Economic Studies, forthcoming. doi:10.1093/restud/rdae025

Examples

## load data
data("Engel95", package = "npiv")

## sort on logexp (the regressor) for plotting purposes
Engel95 <- Engel95[order(Engel95$logexp),] 
attach(Engel95)

## Estimate the Engel curve for food using logwages as an instrument
fm1 <- npiv(food ~ logexp | logwages)

## Plot the estimated Engel curve and data-driven uniform confidence bands
plot(logexp,food,
     ylab="Food Budget Share",
     xlab="log(Total Household Expenditure)",
     xlim=c(4.75, 6.25),
     ylim=c(0, 0.4),
     main="",
     type="p",
     cex=.5,
     col="lightgrey")
lines(logexp,fm1$h,col="blue",lwd=2,lty=1)
lines(logexp,fm1$h.upper,col="blue",lwd=2,lty=2)
lines(logexp,fm1$h.lower,col="blue",lwd=2,lty=2)

## Estimate the Engel curve using pre-specified sieve dimension 
## (dimension 5 for logexp, dimension 9 for logwages)
fm2 <- npiv(food ~ logexp | logwages,
            J.x.segments = 2,
            K.w.segments = 5)

## Plot uniform confidence bands based on undersmoothing
lines(logexp,fm2$h.upper,col="red",lwd=2,lty=2)
lines(logexp,fm2$h.lower,col="red",lwd=2,lty=2)

## Plot pointwise confidence bands based on pre-asymptotic standard errors
lines(logexp,fm2$h+1.96*fm2$asy.se,col="red",lwd=2,lty=3)
lines(logexp,fm2$h-1.96*fm2$asy.se,col="red",lwd=2,lty=3)

legend("topright",
       legend=c("Data-driven Estimate",
                "Data-driven UCBs",
                "Undersmoothed UCBs",
                "Pointwise CBs"),
       col=c("blue","blue","red","red"),
       lty=c(1,2,2,3),
       lwd=c(2,2,2,2))

## Plot the data-driven estimate of the derivative of the Engel curve
plot(logexp,fm1$deriv,col="blue",lwd=2,lty=1,type="l",
     ylab="Derivative of Food Budget Share",
     xlab="log(Total Household Expenditure)",
     xlim=c(4.75, 6.25),
     ylim=c(-1,1))

## Plot data-driven uniform confidence bands for the derivative
lines(logexp,fm1$h.upper.deriv,col="blue",lwd=2,lty=2)
lines(logexp,fm1$h.lower.deriv,col="blue",lwd=2,lty=2)

## Plot uniform confidence bands based on undersmoothing
lines(logexp,fm2$h.upper.deriv,col="red",lwd=2,lty=2)
lines(logexp,fm2$h.lower.deriv,col="red",lwd=2,lty=2)

## Plot pointwise confidence bands based on pre-asymptotic standard errors
lines(logexp,fm2$deriv+1.96*fm2$deriv.asy.se,col="red",lwd=2,lty=3)
lines(logexp,fm2$deriv-1.96*fm2$deriv.asy.se,col="red",lwd=2,lty=3)

legend("topright",
       legend=c("Data-driven Estimate",
                "Data-driven UCBs",
                "Undersmoothed UCBs",
                "Pointwise CBs"),
       col=c("blue","blue","red","red"),
       lty=c(1,2,2,3),
       lwd=c(2,2,2,2))

JeffreyRacine/npiv documentation built on Jan. 17, 2025, 8:29 p.m.