| npregiv | R Documentation |
npregiv computes a nonparametric estimate of an instrumental
regression function \varphi defined by the conditional moment
restriction E [Y - \varphi (Z,X) | W ] = 0, which stems from a
structural econometric model involving
endogenous variables Y and Z, exogenous variables
X, and instruments W. The function \varphi is the
solution of an ill-posed inverse problem.
When method="Tikhonov", npregiv uses the approach of
Darolles, Fan, Florens and Renault (2011) modified for local
polynomial kernel regression of any order (Darolles et al. use local
constant kernel weighting, which corresponds to setting p=0; see
below for details). When method="Landweber-Fridman",
npregiv uses the approach of Horowitz (2011), again using local
polynomial kernel regression (Horowitz uses B-spline weighting).
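As a rough orientation, the moment restriction and the two regularization schemes can be written in standard operator notation. This is a textbook-style sketch, not a description of npregiv's internals: the symbols T (the conditional expectation operator given W), T^{*} (its adjoint), r = E[Y | W], and the step size c are introduced here for illustration only, and c is assumed to play the role of the constant argument.

```latex
\begin{aligned}
E[Y \mid W] &= E[\varphi(Z,X) \mid W]
  \quad\Longleftrightarrow\quad r = T\varphi, \\
\varphi_{\alpha} &= (\alpha I + T^{*}T)^{-1}\, T^{*} r
  && \text{(Tikhonov)}, \\
\varphi_{k+1} &= \varphi_{k} + c\, T^{*}\,(r - T\varphi_{k})
  && \text{(Landweber-Fridman)}.
\end{aligned}
```

Because T is a smoothing operator, inverting it directly is unstable; \alpha (Tikhonov) and the number of iterations k (Landweber-Fridman) act as the regularization parameters.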
npregiv(y,
z,
w,
x = NULL,
zeval = NULL,
xeval = NULL,
alpha = NULL,
alpha.iter = NULL,
alpha.max = 1e-01,
alpha.min = 1e-10,
alpha.tol = .Machine$double.eps^0.25,
bw = NULL,
constant = 0.5,
iterate.diff.tol = 1.0e-08,
iterate.max = 1000,
iterate.Tikhonov = TRUE,
iterate.Tikhonov.num = 1,
method = c("Landweber-Fridman","Tikhonov"),
nmulti = NULL,
optim.abstol = .Machine$double.eps,
optim.maxattempts = 10,
optim.maxit = 500,
optim.method = c("Nelder-Mead", "BFGS", "CG"),
optim.reltol = sqrt(.Machine$double.eps),
p = 1,
penalize.iteration = TRUE,
random.seed = 42,
return.weights.phi = FALSE,
return.weights.phi.deriv.1 = FALSE,
return.weights.phi.deriv.2 = FALSE,
smooth.residuals = TRUE,
start.from = c("Eyz","EEywz"),
starting.values = NULL,
stop.on.increase = TRUE,
...)
These arguments identify the response, endogenous variables, instruments, exogenous covariates, and evaluation data.
w |
a |
x |
an |
xeval |
an |
y |
a one (1) dimensional numeric or integer vector of dependent data, each
element |
z |
a |
zeval |
a |
These arguments control the Landweber-Fridman iteration path.
constant |
the constant to use when using |
iterate.diff.tol |
the search tolerance for the difference in the stopping rule from
iteration to iteration when using |
iterate.max |
an integer indicating the maximum number of iterations permitted
before termination occurs when using |
iterate.Tikhonov |
a logical value indicating whether to use iterated Tikhonov (one
iteration) or not when using |
iterate.Tikhonov.num |
an integer indicating the number of iterations to conduct when using
|
method |
the regularization method employed (defaults to
|
nmulti |
integer number of times to restart the process of finding extrema of the cross-validation function from different (random) initial points. |
These arguments control numerical optimization for the inverse problem.
optim.abstol |
the absolute convergence tolerance used by |
optim.maxattempts |
maximum number of attempts taken trying to achieve successful
convergence in |
optim.maxit |
maximum number of iterations used by |
optim.method |
method used by the default method is an implementation of that of Nelder and Mead (1965), that uses only function values and is robust but relatively slow. It will work reasonably well for non-differentiable functions. method method |
optim.reltol |
relative convergence tolerance used by |
p |
the order of the local polynomial regression (defaults to
|
These arguments control returned kernel weights, starting values, residual smoothing, and iteration stopping behavior.
penalize.iteration |
a logical value indicating whether to
penalize the norm by the number of iterations or not (default
|
random.seed |
an integer used to seed R's random number generator. This ensures replicability of the numerical search. Defaults to 42. |
return.weights.phi |
a logical value (defaults to |
return.weights.phi.deriv.1 |
a logical value (defaults to |
return.weights.phi.deriv.2 |
a logical value (defaults to |
smooth.residuals |
a logical value indicating whether to
optimize bandwidths for the regression of
|
start.from |
a character string indicating whether to start from
|
starting.values |
a value indicating whether to commence
Landweber-Fridman assuming
|
stop.on.increase |
a logical value (defaults to |
These arguments control Tikhonov regularization and its bandwidth.
alpha |
a numeric scalar that, if supplied, is used rather than numerically
solving for |
alpha.iter |
a numeric scalar that, if supplied, is used for iterated Tikhonov
rather than numerically solving for |
alpha.max |
maximum of search range for |
alpha.min |
minimum of search range for |
alpha.tol |
the search tolerance for |
bw |
an object which, if provided, contains bandwidths and parameters
(obtained from a previous invocation of |
Further arguments are passed to lower-level kernel-sum and estimation routines.
... |
additional arguments supplied to |
Documentation guide: see np.kernels for kernels, np.options for global options, and plot for plotting options.
Tikhonov regularization requires computation of weight matrices of
dimension n\times n which can be computationally costly
in terms of memory requirements and may be unsuitable for large
datasets. Landweber-Fridman will be preferred in such settings as it
does not require construction and storage of these weight matrices
while it also avoids the need for numerical optimization methods to
determine \alpha.
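To make the memory cost concrete, the following base R sketch computes the storage for one dense n x n matrix of doubles and then solves a toy Tikhonov problem with a fixed \alpha. It is an illustration under stated assumptions, not npregiv's implementation: weight.matrix.bytes, Tmat, and the Gaussian smoother are hypothetical stand-ins, and \alpha is fixed by hand rather than chosen numerically.

```r
## Bytes needed for one dense n x n matrix of doubles (8 bytes each).
## weight.matrix.bytes is a hypothetical helper for illustration.
weight.matrix.bytes <- function(n) 8 * n^2
weight.matrix.bytes(1e4) / 1e6   # megabytes for n = 10,000

## Toy smoothing operator: row-normalised Gaussian kernel weights.
n    <- 200
grid <- seq(-1, 1, length.out = n)
K    <- exp(-0.5 * (outer(grid, grid, "-") / 0.2)^2)
Tmat <- K / rowSums(K)

## Right-hand side generated from a known function.
r <- as.vector(Tmat %*% grid^2)

## Tikhonov solution for a hand-picked alpha (assumption: fixed,
## not selected by any data-driven rule).
alpha     <- 1e-3
phi.alpha <- as.vector(solve(alpha * diag(n) + crossprod(Tmat),
                             crossprod(Tmat, r)))
```

Every matrix in the solve step is n x n, which is why the cost grows quadratically in the sample size.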
method="Landweber-Fridman" uses an optimal stopping rule based
upon ||E(y|w)-E(\varphi_k(z,x)|w)||^2. However, if local rather than global
optima are encountered the resulting estimates can be overly noisy. To
best guard against this eventuality set nmulti to a larger
number than the default nmulti=min(2,p) for the first
iteration, where p is the dimension of the current smoothing
problem.
Note that for subsequent Landweber-Fridman iterations, a “warm
start” strategy is employed. The optimal bandwidths from the previous
iteration are used as starting values for the current iteration. The
user-supplied nmulti is respected for all iterations. For
iterations after the first successful one, these optimal bandwidths
serve as the first of the multiple initial points (a warm start),
while any remaining restarts are cold starts. If nmulti is not
explicitly supplied by the user, it defaults to min(2,p) for the first
iteration and to 1 for all subsequent iterations. This strategy
provides a balance between computational efficiency and robustness,
allowing the numerical optimizer to refine the structural bandwidths
as the residuals evolve incrementally while still guarding against
local optima.
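The restart logic described above can be sketched in base R as follows. This is a minimal illustration, not the code npregiv runs: cv.fn is a made-up objective with several local minima standing in for a cross-validation function, and multistart is a hypothetical helper.

```r
## Toy objective with several local minima (hypothetical stand-in
## for a cross-validation function of one parameter).
cv.fn <- function(h) sin(5 * h) + (h - 1)^2

## Run optim() from nmulti starting points and keep the best fit.
## If a warm start is supplied it replaces the first random start;
## any remaining starts are cold (random) starts.
multistart <- function(fn, nmulti, warm = NULL, lower = 0.05, upper = 3) {
  starts <- runif(nmulti, lower, upper)
  if (!is.null(warm)) starts[1] <- warm
  fits <- lapply(starts, function(s) optim(s, fn, method = "BFGS"))
  fits[[which.min(vapply(fits, function(f) f$value, numeric(1)))]]
}

set.seed(42)
first <- multistart(cv.fn, nmulti = 5)                     # first iteration
later <- multistart(cv.fn, nmulti = 1, warm = first$par)   # warm start only
```

With nmulti = 1 on later iterations the optimizer simply refines the previous optimum; raising nmulti adds cold starts that guard against a poor local optimum at extra computational cost.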
When using method="Landweber-Fridman", iteration terminates as soon as
any one of the following occurs: the change in the value of
||(E(y|w)-E(\varphi_k(z,x)|w))/E(y|w)||^2
from iteration to iteration is less than iterate.diff.tol, the number
of iterations reaches iterate.max, or
||(E(y|w)-E(\varphi_k(z,x)|w))/E(y|w)||^2
stops falling in value and starts rising.
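The three termination conditions can be illustrated on a small discretized problem in base R. This is a sketch under simplifying assumptions and does not reproduce npregiv: a fixed matrix Tmat replaces the kernel regressions, the step size is rescaled by the operator norm so the iteration is guaranteed to be stable, the iteration starts from zero, and no iteration penalty is applied.

```r
set.seed(42)
n    <- 50
grid <- seq(-1, 1, length.out = n)

## Toy smoothing operator: row-normalised Gaussian kernel weights.
K    <- exp(-0.5 * (outer(grid, grid, "-") / 0.2)^2)
Tmat <- K / rowSums(K)

## Noisy right-hand side, kept away from zero because the
## criterion divides by it.
r <- as.vector(Tmat %*% (grid^2 + 1)) + rnorm(n, sd = 0.01)

iterate.max      <- 1000
iterate.diff.tol <- 1.0e-08
step <- 0.5 / norm(Tmat, type = "2")^2   # below 2 / ||T||^2

## Relative squared norm used as the stopping criterion.
rel.norm <- function(phi) sum(((r - as.vector(Tmat %*% phi)) / r)^2)

phi       <- rep(0, n)
norm.prev <- rel.norm(phi)
reason    <- "iterate.max reached"
for (k in seq_len(iterate.max)) {
  phi.new  <- phi + step * as.vector(crossprod(Tmat, r - Tmat %*% phi))
  norm.new <- rel.norm(phi.new)
  if (norm.new > norm.prev) {            # criterion started rising
    reason <- "criterion increased"
    break
  }
  phi <- phi.new
  if (abs(norm.prev - norm.new) < iterate.diff.tol) {
    reason <- "change below iterate.diff.tol"
    break
  }
  norm.prev <- norm.new
}
```

After the loop, reason records which of the three conditions ended the iteration and k holds the number of iterations attempted.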
The option bw= would be useful, say, when bootstrapping is
necessary. Note that when passing bw, it must be obtained from
a previous invocation of npregiv. For instance, if
model.iv was obtained from an invocation of npregiv with
method="Landweber-Fridman", then the following needs to be fed
to the subsequent invocation of npregiv:
model.iv <- npregiv(...)
bw <- NULL
bw$bw.E.y.w <- model.iv$bw.E.y.w
bw$bw.E.y.z <- model.iv$bw.E.y.z
bw$bw.resid.w <- model.iv$bw.resid.w
bw$bw.resid.fitted.w.z <- model.iv$bw.resid.fitted.w.z
bw$norm.index <- model.iv$norm.index
foo <- npregiv(...,bw=bw)
If, on the other hand, model.iv was obtained from an invocation
of npregiv with method="Tikhonov", then the following
needs to be fed to the subsequent invocation of npregiv:
model.iv <- npregiv(...)
bw <- NULL
bw$alpha <- model.iv$alpha
bw$alpha.iter <- model.iv$alpha.iter
bw$bw.E.y.w <- model.iv$bw.E.y.w
bw$bw.E.E.y.w.z <- model.iv$bw.E.E.y.w.z
bw$bw.E.phi.w <- model.iv$bw.E.phi.w
bw$bw.E.E.phi.w.z <- model.iv$bw.E.E.phi.w.z
foo <- npregiv(...,bw=bw)
Or, if model.iv was obtained from an invocation of
npregiv with either method="Landweber-Fridman" or
method="Tikhonov", then the following would also work:
model.iv <- npregiv(...)
foo <- npregiv(...,bw=model.iv)
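If only the bandwidth components are wanted (for example, to store them compactly across bootstrap replications), the two manual recipes above can be wrapped in a small helper. The function extract_iv_bw below is hypothetical and is not part of the np package; it only copies the component names listed above, and the mock list stands in for a fitted object so the sketch runs without the package.

```r
## Hypothetical helper: copy the bandwidth-related components listed
## above out of a fitted object (or any list with those names).
extract_iv_bw <- function(model.iv,
                          method = c("Landweber-Fridman", "Tikhonov")) {
  method <- match.arg(method)
  keep <- switch(method,
    "Landweber-Fridman" = c("bw.E.y.w", "bw.E.y.z", "bw.resid.w",
                            "bw.resid.fitted.w.z", "norm.index"),
    "Tikhonov"          = c("alpha", "alpha.iter", "bw.E.y.w",
                            "bw.E.E.y.w.z", "bw.E.phi.w",
                            "bw.E.E.phi.w.z"))
  absent <- setdiff(keep, names(model.iv))
  if (length(absent) > 0)
    stop("components not found: ", paste(absent, collapse = ", "))
  unclass(model.iv)[keep]
}

## Mock object with made-up values, used only to exercise the helper.
mock <- list(phi = 1:3, alpha = 0.01, alpha.iter = 0.02,
             bw.E.y.w = 0.3, bw.E.E.y.w.z = 0.4,
             bw.E.phi.w = 0.5, bw.E.E.phi.w.z = 0.6)
bw <- extract_iv_bw(mock, method = "Tikhonov")
```

The helper stops with an error when a required component is absent, which catches a mismatch between the method used to fit the model and the method requested.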
When exogenous predictors x (xeval) are passed, they are
appended to both the endogenous predictors z and the
instruments w as additional columns. If this is not desired,
one can manually append the exogenous variables to z (or
w) prior to passing z (or w), and then they will
only appear among the z or w as desired.
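The column handling described above amounts to the following, shown here in base R on made-up data. This only illustrates the resulting layout; the commented call at the end is a hypothetical usage and is not executed.

```r
set.seed(1)
n <- 5
z <- matrix(rnorm(n), ncol = 1, dimnames = list(NULL, "z"))
w <- matrix(rnorm(n), ncol = 1, dimnames = list(NULL, "w"))
x <- matrix(rnorm(n), ncol = 1, dimnames = list(NULL, "x"))

## Default behaviour when x is passed: x joins both sets.
z.aug <- cbind(z, x)
w.aug <- cbind(w, x)

## Manual alternative: append x to the instruments only, then pass
## the result as w and leave x = NULL.
w.only <- cbind(w, x)
## model.iv <- npregiv(y = y, z = z, w = w.only)   # hypothetical call
```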
npregiv returns a npregiv object. The generic
functions print, summary, and
plot support objects of this type.
npregiv returns a list with components phi,
phi.mat and either alpha when method="Tikhonov"
or norm.index, norm.stop and convergence when
method="Landweber-Fridman", among others.
In addition, if any of return.weights.* are invoked
(*=1,2), then phi.weights and phi.deriv.*.weights
return weight matrices for computing the instrumental regression and
its partial derivatives. Note that these weights, post multiplied by
the response vector y, will deliver the estimates returned in
phi, phi.deriv.1, and phi.deriv.2 (the latter
only being produced when p is 2 or greater). When invoked with
evaluation data, similar matrices are returned but named
phi.eval.weights and phi.deriv.eval.*.weights. These
weights can be used for constrained estimation, among other uses.
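The idea that a weight matrix post-multiplied by y reproduces the fitted values can be checked on a simple linear smoother. The base R sketch below uses local constant (Nadaraya-Watson) weights with a hand-picked bandwidth as a stand-in; it does not use the matrices returned by npregiv, and all names in it are illustrative.

```r
set.seed(42)
n <- 100
z <- sort(runif(n))
y <- sin(2 * pi * z) + rnorm(n, sd = 0.1)
h <- 0.1                                   # hand-picked bandwidth

## n x n weight matrix: row i holds the weights for evaluation point z[i].
K <- dnorm(outer(z, z, "-") / h)
W <- K / rowSums(K)

## Fitted values as weights post-multiplied by y.
fit.weights <- as.vector(W %*% y)

## The same fit computed point by point, for comparison.
fit.direct <- vapply(z, function(z0) {
  k <- dnorm((z - z0) / h)
  sum(k * y) / sum(k)
}, numeric(1))

all.equal(fit.weights, fit.direct)
```

Because the fit is linear in y, restrictions on the estimate (for instance bounds or monotonicity) translate into linear restrictions involving W, which is what makes such weights convenient for constrained estimation.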
When method="Landweber-Fridman" is invoked, bandwidth objects
are returned in bw.E.y.w (scalar/vector), bw.E.y.z
(scalar/vector), bw.resid.w (matrix), and
bw.resid.fitted.w.z (matrix); the two matrices contain the bandwidths
for each iteration stored as rows. When method="Tikhonov" is
invoked, bandwidth objects are returned in bw.E.y.w,
bw.E.E.y.w.z, bw.E.phi.w, and bw.E.E.phi.w.z.
This function should be considered to be in ‘beta test’ status until further notice.
Jeffrey S. Racine racinej@mcmaster.ca, Samuele Centorrino samuele.centorrino@univ-tlse1.fr
Carrasco, M. and J.P. Florens and E. Renault (2007), “Linear Inverse Problems in Structural Econometrics Estimation Based on Spectral Decomposition and Regularization,” In: James J. Heckman and Edward E. Leamer, Editor(s), Handbook of Econometrics, Elsevier, 2007, Volume 6, Part 2, Chapter 77, Pages 5633-5751
Darolles, S. and Y. Fan and J.P. Florens and E. Renault (2011), “Nonparametric instrumental regression,” Econometrica, 79, 1541-1565.
Feve, F. and J.P. Florens (2010), “The practice of non-parametric estimation by solving inverse problems: the example of transformation models,” Econometrics Journal, 13, S1-S27.
Florens, J.P. and J.S. Racine and S. Centorrino (2018), “Nonparametric instrumental derivatives,” Journal of Nonparametric Statistics, 30 (2), 368-391.
Fridman, V. M. (1956), “A method of successive approximations for Fredholm integral equations of the first kind,” Uspekhi Mat. Nauk, 11, 233-334, in Russian.
Horowitz, J.L. (2011), “Applied nonparametric instrumental variables estimation,” Econometrica, 79, 347-394.
Landweber, L. (1951), “An iterative formula for Fredholm integral equations of the first kind,” American Journal of Mathematics, 73, 615-624.
Li, Q. and J.S. Racine (2007), Nonparametric Econometrics: Theory and Practice, Princeton University Press.
Li, Q. and J.S. Racine (2004), “Cross-validated Local Linear Nonparametric Regression,” Statistica Sinica, 14, 485-512.
np.kernels, np.options, plot
npregivderiv, npreg
## Not run:
## This illustration was made possible by Samuele Centorrino
## <samuele.centorrino@univ-tlse1.fr>
library(np)
set.seed(42)
n <- 500
## The DGP is as follows:
## 1) y = phi(z) + u
## 2) E(u|z) != 0 (endogeneity present)
## 3) Suppose there exists an instrument w such that z = f(w) + v and
## E(u|w) = 0
## 4) We generate v, w, and generate u such that u and z are
## correlated. To achieve this we express u as a function of v (i.e. u =
## gamma v + eps)
v <- rnorm(n,mean=0,sd=0.27)
eps <- rnorm(n,mean=0,sd=0.05)
u <- -0.5*v + eps
w <- rnorm(n,mean=0,sd=1)
## In Darolles et al (2011) there exist two DGPs. The first is
## phi(z)=z^2 and the second is phi(z)=exp(-abs(z)) (which is
## continuous but has a kink, i.e. is non-differentiable, at zero).
fun1 <- function(z) { z^2 }
fun2 <- function(z) { exp(-abs(z)) }
z <- 0.2*w + v
## Generate two y vectors for each function.
y1 <- fun1(z) + u
y2 <- fun2(z) + u
## You set y to be either y1 or y2 (ditto for phi) depending on which
## DGP you are considering:
y <- y1
phi <- fun1
## Sort on z (for plotting)
ivdata <- data.frame(y,z,w)
ivdata <- ivdata[order(ivdata$z),]
rm(y,z,w)
attach(ivdata)
model.iv <- npregiv(y=y,z=z,w=w)
phi.iv <- model.iv$phi
## Now the non-iv local linear estimator of E(y|z)
ll.mean <- fitted(npreg(y~z,regtype="ll"))
## For the plots, restrict focal attention to the bulk of the data
## (i.e. for the plotting area trim out 1/4 of one percent from each
## tail of y and z)
trim <- 0.0025
curve(phi,min(z),max(z),
xlim=quantile(z,c(trim,1-trim)),
ylim=quantile(y,c(trim,1-trim)),
ylab="Y",
xlab="Z",
main="Nonparametric Instrumental Kernel Regression",
lwd=2,lty=1)
points(z,y,type="p",cex=.25,col="grey")
lines(z,phi.iv,col="blue",lwd=2,lty=2)
lines(z,ll.mean,col="red",lwd=2,lty=4)
legend("topright",
c(expression(paste(varphi(z))),
expression(paste("Nonparametric ",hat(varphi)(z))),
"Nonparametric E(y|z)"),
lty=c(1,2,4),
col=c("black","blue","red"),
lwd=c(2,2,2),
bty="n")
## End(Not run)