Nonparametric Instrumental Regression
Description
crsiv computes a nonparametric estimate of an instrumental
regression function phi defined by conditional moment
restrictions stemming from a structural econometric model: E[Y - phi(Z,X) | W] = 0, involving
endogenous variables Y and Z, exogenous variables X,
and instruments W. The function phi is the solution
of an ill-posed inverse problem.
When method="Tikhonov", crsiv uses the approach of
Darolles, Fan, Florens and Renault (2011) modified for regression
splines (Darolles et al. use local constant kernel weighting). When
method="Landweber-Fridman", crsiv uses the approach of
Horowitz (2011) with the regression spline methodology implemented in
the crs package.
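In operator notation, the moment restriction and the two regularization schemes can be sketched as follows. The symbols T, T^*, r, alpha, and c are introduced here for illustration and are not taken from the crs documentation; this is the standard textbook form, and the spline-based estimators replace the conditional expectations with estimates.

```latex
\begin{aligned}
&E[Y - \varphi(Z,X) \mid W] = 0
  \;\Longleftrightarrow\; T\varphi = r,
  \qquad (T\varphi)(w) = E[\varphi(Z,X) \mid W = w],
  \quad r(w) = E[Y \mid W = w], \\
&\text{Tikhonov:}\quad
  \varphi_{\alpha} = (\alpha I + T^{*}T)^{-1} T^{*} r, \\
&\text{Landweber-Fridman:}\quad
  \varphi_{k+1} = \varphi_{k} + c\, T^{*}\,(r - T\varphi_{k}),
  \quad k = 0, 1, 2, \dots
\end{aligned}
```

Here T^* denotes the adjoint of T. Tikhonov regularizes through the parameter alpha, while Landweber-Fridman regularizes through the number of iterations.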
Usage
crsiv(y,
      z,
      w,
      x = NULL,
      zeval = NULL,
      weval = NULL,
      xeval = NULL,
      alpha = NULL,
      alpha.min = 1e-10,
      alpha.max = 1e-01,
      alpha.tol = .Machine$double.eps^0.25,
      deriv = 0,
      iterate.max = 1000,
      iterate.diff.tol = 1.0e-08,
      constant = 0.5,
      penalize.iteration = TRUE,
      smooth.residuals = TRUE,
      start.from = c("Eyz","EEywz"),
      starting.values = NULL,
      stop.on.increase = TRUE,
      method = c("Landweber-Fridman","Tikhonov"),
      opts = list("MAX_BB_EVAL"=10000,
                  "EPSILON"=.Machine$double.eps,
                  "INITIAL_MESH_SIZE"="r1.0e-01",
                  "MIN_MESH_SIZE"=paste("r",sqrt(.Machine$double.eps),sep=""),
                  "MIN_POLL_SIZE"=paste("r",sqrt(.Machine$double.eps),sep=""),
                  "DISPLAY_DEGREE"=0),
      ...)

Arguments
y 
a one (1) dimensional numeric or integer vector of dependent data, each
element i corresponding to each observation (row) i of

z 
a p-variate data frame of endogenous predictors. The data types may be continuous, discrete (unordered and ordered factors), or some combination thereof.
w 
a q-variate data frame of instruments. The data types may be continuous, discrete (unordered and ordered factors), or some combination thereof.
x 
an r-variate data frame of exogenous predictors. The data types may be continuous, discrete (unordered and ordered factors), or some combination thereof.
zeval 
a p-variate data frame of endogenous predictors on which the
regression will be estimated (evaluation data). By default, evaluation
takes place on the data provided by z.
weval 
a q-variate data frame of instruments on which the regression
will be estimated (evaluation data). By default, evaluation
takes place on the data provided by w.
xeval 
an r-variate data frame of exogenous predictors on which the
regression will be estimated (evaluation data). By default,
evaluation takes place on the data provided by x.
alpha 
a numeric scalar that, if supplied, is used rather than numerically
solving for alpha when using method="Tikhonov"
alpha.min 
minimum of search range for alpha, the Tikhonov
regularization parameter, when using method="Tikhonov"
alpha.max 
maximum of search range for alpha, the Tikhonov
regularization parameter, when using method="Tikhonov"
alpha.tol 
the search tolerance for 
iterate.max 
an integer indicating the maximum number of iterations permitted
before termination occurs when using method="Landweber-Fridman"
iterate.diff.tol 
the search tolerance for the difference in the stopping rule from
iteration to iteration when using method="Landweber-Fridman"
constant 
the constant to use when using method="Landweber-Fridman"
method 
the regularization method employed (default
"Landweber-Fridman")
penalize.iteration 
a logical value indicating whether to
penalize the norm by the number of iterations or not (default
TRUE)
smooth.residuals 
a logical value (defaults to TRUE)
start.from 
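a character string indicating whether to start from
E(Y|z) (default, "Eyz") or from the alternative "EEywz"
starting.values 
a value indicating whether to commence
Landweber-Fridman assuming
phi[1]=starting.values (proper
Landweber-Fridman) or instead begin from E(y|z) (defaults to
NULL)
stop.on.increase 
a logical value (defaults to TRUE) indicating whether to halt iteration when the stopping criterion rises from one iteration to the next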
opts 
arguments passed to the NOMAD solver (see snomadr)
deriv 
an integer 
... 
additional arguments supplied to crs
Details
Tikhonov regularization requires computation of weight matrices of dimension n x n, which can be costly in terms of memory and may be infeasible for large datasets. Landweber-Fridman is preferable in such settings because it does not require construction and storage of these weight matrices and avoids the need for numerical optimization to determine alpha. It does, however, require iteration that may be equally or even more demanding in terms of total computation time.
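To give a rough sense of the memory cost, the following base R sketch computes the storage needed for a single dense n x n matrix of doubles. The helper name weight_matrix_bytes is hypothetical, and the calculation assumes 8 bytes per entry and ignores any additional working copies made during estimation.

```r
# Bytes needed to hold one dense n x n matrix of doubles (8 bytes per entry).
# weight_matrix_bytes is a hypothetical helper, not part of the crs package.
weight_matrix_bytes <- function(n) 8 * as.numeric(n)^2

# Report the footprint in GiB for a few sample sizes.
for (n in c(1500, 10000, 50000)) {
  cat(sprintf("n = %6d: about %.2f GiB per n x n matrix\n",
              n, weight_matrix_bytes(n) / 2^30))
}
```

At n = 50000 a single such matrix already needs roughly 18.6 GiB, which is why the iterative method is attractive for large samples.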
When using method="Landweber-Fridman", an optimal stopping rule
based upon ||E(y|w) - E(phi(z,x)|w)||^2 is used to terminate
iteration. However, if local rather than global optima are encountered
the resulting estimates can be overly noisy. To best guard against
this eventuality set nmulti to a larger number than the default
nmulti=5 for crs when using cv="nomad" or instead use
cv="exhaustive" if possible (this may not be
feasible for non-trivial problems).
When using method="Landweber-Fridman", iteration will terminate
when either the change in the value of
||(E(y|w) - E(phi(z,x)|w))/E(y|w)||^2 from iteration to iteration is
less than iterate.diff.tol, or we hit iterate.max, or
||(E(y|w) - E(phi(z,x)|w))/E(y|w)||^2 stops falling in value and
starts rising.
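The three termination conditions can be illustrated with a toy Landweber iteration on a small linear system. This is only a sketch in base R and not the code crsiv runs internally: the function landweber_toy, the convergence labels, and the scaled norm sum(resid^2)/sum(r^2) (used instead of the elementwise ratio above) are assumptions made for the illustration.

```r
# Toy Landweber iteration for a small linear system T %*% phi = r.
# Illustration only: this is NOT the code crsiv runs internally.
landweber_toy <- function(T, r, constant = 0.5, iterate.max = 1000,
                          iterate.diff.tol = 1e-08, stop.on.increase = TRUE) {
  # Step size scaled by the largest singular value so the iteration is stable.
  step <- constant / max(svd(T)$d)^2
  phi <- rep(0, ncol(T))
  norm.old <- Inf
  norm.new <- Inf
  convergence <- "ITERATE_MAX"              # hypothetical label
  iterations <- 0
  for (k in seq_len(iterate.max)) {
    # Landweber update: move phi along T'(r - T phi).
    phi.new <- phi + step * drop(crossprod(T, r - T %*% phi))
    # Scaled squared residual norm (a simplification of the rule in the text).
    norm.new <- sum((r - T %*% phi.new)^2) / sum(r^2)
    if (stop.on.increase && norm.new > norm.old) {
      convergence <- "STOP_ON_INCREASE"     # criterion started rising
      break                                 # keep the previous phi
    }
    phi <- phi.new
    iterations <- k
    if (abs(norm.old - norm.new) < iterate.diff.tol) {
      convergence <- "ITERATE_DIFF_TOL"     # change fell below tolerance
      break
    }
    norm.old <- norm.new
  }
  list(phi = phi, num.iterations = iterations, norm.value = norm.new,
       convergence = convergence)
}

T <- matrix(c(2, 1, 1, 3), nrow = 2)
r <- T %*% c(1, -1)                         # true solution is c(1, -1)
fit <- landweber_toy(T, r)
print(fit$convergence)
print(round(fit$phi, 3))
```

In this noiseless, well-conditioned example the criterion falls monotonically, so the tolerance rule ends the iteration. With noisy estimates of the conditional means the criterion can begin to rise, which is the case stop.on.increase guards against.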
When your problem is a simple one (e.g. univariate Z, W,
and X) you might want to avoid cv="nomad" and instead use
cv="exhaustive" since exhaustive search may be feasible (for
degree.max and segments.max not overly large). This will
guarantee an exact solution for each iteration (i.e. there will be no
errors arising due to numerical search).
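A minimal sketch of both options follows, assuming that cv, nmulti, degree.max, and segments.max are forwarded to crs through the ... argument as described in the Arguments section. The simulated data, the small sample size, and the grid limits of 3 are arbitrary choices for illustration, and the calls can take a while to run.

```r
# Small simulated problem with a univariate z and w (illustrative values only).
set.seed(42)
n <- 250
v <- rnorm(n, sd = 0.27)
w <- rnorm(n)
z <- 0.2 * w + v
y <- z^2 - 0.5 * v + rnorm(n, sd = 0.05)

# Only run the estimation if the crs package is installed.
if (requireNamespace("crs", quietly = TRUE)) {
  library(crs)
  # Exhaustive search over a deliberately small grid of degrees and segments.
  model.exh <- crsiv(y = y, z = z, w = w,
                     cv = "exhaustive", degree.max = 3, segments.max = 3)
  # Alternative: keep the NOMAD search but use more restarts than nmulti = 5.
  model.nomad <- crsiv(y = y, z = z, w = w, cv = "nomad", nmulti = 10)
}
```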
demo(crsiv), demo(crsiv_exog), and demo(crsiv_exog_persp)
provide flexible interactive
demonstrations similar to the example below that allow you to modify
and experiment with parameters such as the sample size, method, and so
forth in an interactive session.
Value
crsiv returns a crs object. The generic
functions fitted and residuals extract
(or generate) estimated values and residuals. Furthermore, the
functions summary, predict, and plot
(options mean=FALSE, deriv=i where
i is an integer, ci=FALSE,
plot.behavior=c("plot","plot-data","data")) support objects
of this type.
See crs for details on the return object components.
In addition to the standard crs components,
crsiv returns components phi and either alpha
when method="Tikhonov" or phi, phi.mat,
num.iterations, norm.stop, norm.value and
convergence when method="Landweber-Fridman".
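The method-specific components listed above can be inspected directly on the returned object. The sketch below assumes the crs package is installed and uses small simulated data chosen only for illustration; the component names are those given in this section.

```r
# Small simulated problem (illustrative values only).
set.seed(42)
n <- 200
v <- rnorm(n, sd = 0.27)
w <- rnorm(n)
z <- 0.2 * w + v
y <- z^2 - 0.5 * v + rnorm(n, sd = 0.05)

# Only run the estimation if the crs package is installed.
if (requireNamespace("crs", quietly = TRUE)) {
  library(crs)
  # Landweber-Fridman: iteration diagnostics are returned alongside phi.
  model.lf <- crsiv(y = y, z = z, w = w, method = "Landweber-Fridman")
  print(model.lf$num.iterations)
  print(model.lf$convergence)
  print(head(model.lf$phi))

  # Tikhonov: the regularization parameter alpha is returned alongside phi.
  model.tik <- crsiv(y = y, z = z, w = w, method = "Tikhonov")
  print(model.tik$alpha)
  print(head(residuals(model.tik)))
}
```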
Note
Using the option deriv= computes (effectively) the analytical
derivative of the estimated phi(Z,X) and not that
using crsivderiv, which instead uses the method of
Florens and Racine (2012). Though both are statistically consistent,
practitioners may desire one over the other, hence we provide both.
Note
This function should be considered to be in ‘beta test’ status until further notice.
Author(s)
Jeffrey S. Racine racinej@mcmaster.ca, Samuele Centorrino samuele.centorrino@univ-tlse1.fr
References
Carrasco, M. and J.P. Florens and E. Renault (2007), “Linear Inverse Problems in Structural Econometrics Estimation Based on Spectral Decomposition and Regularization,” In: James J. Heckman and Edward E. Leamer, Editor(s), Handbook of Econometrics, Elsevier, 2007, Volume 6, Part 2, Chapter 77, Pages 5633-5751.
Darolles, S. and Y. Fan and J.P. Florens and E. Renault (2011), “Nonparametric Instrumental Regression,” Econometrica, 79, 1541-1565.
Feve, F. and J.P. Florens (2010), “The Practice of Nonparametric Estimation by Solving Inverse Problems: The Example of Transformation Models,” Econometrics Journal, 13, S1-S27.
Florens, J.P. and J.S. Racine (2012), “Nonparametric Instrumental Derivatives,” Working Paper.
Fridman, V. M. (1956), “A Method of Successive Approximations for Fredholm Integral Equations of the First Kind,” Uspekhi, Math. Nauk., 11, 233-334, in Russian.
Horowitz, J.L. (2011), “Applied Nonparametric Instrumental Variables Estimation,” Econometrica, 79, 347-394.
Landweber, L. (1951), “An Iterative Formula for Fredholm Integral Equations of the First Kind,” American Journal of Mathematics, 73, 615-24.
Li, Q. and J.S. Racine (2007), Nonparametric Econometrics: Theory and Practice, Princeton University Press.
See Also
npreg, crs
Examples
## Not run:
## This illustration was made possible by Samuele Centorrino
## <samuele.centorrino@univ-tlse1.fr>
library(crs)
set.seed(42)
n <- 1500
## The DGP is as follows:
## 1) y = phi(z) + u
## 2) E(u|z) != 0 (endogeneity present)
## 3) Suppose there exists an instrument w such that z = f(w) + v and
## E(u|w) = 0
## 4) We generate v, w, and generate u such that u and z are
## correlated. To achieve this we express u as a function of v (i.e. u =
## gamma v + eps)
v <- rnorm(n,mean=0,sd=0.27)
eps <- rnorm(n,mean=0,sd=0.05)
u <- -0.5*v + eps
w <- rnorm(n,mean=0,sd=1)
## In Darolles et al (2011) there exist two DGPs. The first is
## phi(z)=z^2 and the second is phi(z)=exp(-abs(z)) (which has a
## kink, i.e. a discontinuous first derivative, at zero).
fun1 <- function(z) { z^2 }
fun2 <- function(z) { exp(-abs(z)) }
z <- 0.2*w + v
## Generate two y vectors for each function.
y1 <- fun1(z) + u
y2 <- fun2(z) + u
## You set y to be either y1 or y2 (ditto for phi) depending on which
## DGP you are considering:
y <- y1
phi <- fun1
## Create an evaluation dataset sorting on z (for plotting)
evaldata <- data.frame(y,z,w)
evaldata <- evaldata[order(evaldata$z),]
## Compute the non-IV regression spline estimator of E(y|z)
## (uses the default opts; no opts object is defined in this example)
model.noniv <- crs(y~z)
mean.noniv <- predict(model.noniv,newdata=evaldata)
## Compute the IV-regression spline estimator of phi(z)
model.iv <- crsiv(y=y,z=z,w=w)
phi.iv <- predict(model.iv,newdata=evaldata)
## For the plots, restrict focal attention to the bulk of the data
## (i.e. for the plotting area trim out 1/4 of one percent from each
## tail of y and z)
trim <- 0.0025
curve(phi,min(z),max(z),
      xlim=quantile(z,c(trim,1-trim)),
      ylim=quantile(y,c(trim,1-trim)),
      ylab="Y",
      xlab="Z",
      main="Nonparametric Instrumental Spline Regression",
      sub=paste("Landweber-Fridman: iterations = ", model.iv$num.iterations,sep=""),
      lwd=1,lty=1)
points(z,y,type="p",cex=.25,col="grey")
## True E(y|z) for the first DGP: z^2 plus E(u|z), approximately -0.325*z
lines(evaldata$z,evaldata$z^2 -0.325*evaldata$z,lwd=1,lty=1)
lines(evaldata$z,phi.iv,col="blue",lwd=2,lty=2)
lines(evaldata$z,mean.noniv,col="red",lwd=2,lty=4)
legend(quantile(z,trim),quantile(y,1-trim),
       c(expression(paste(varphi(z),", E(y|z)",sep="")),
         expression(paste("Nonparametric ",hat(varphi)(z))),
         "Nonparametric E(y|z)"),
       lty=c(1,2,4),
       col=c("black","blue","red"),
       lwd=c(1,2,2))
## End(Not run)
