npindex  R Documentation 
npindex computes a semiparametric single index model for a dependent variable and p-variate explanatory data using the model Y = G(XB) + epsilon, given a set of evaluation points, training points (consisting of explanatory data and dependent data), and a npindexbw bandwidth specification. Note that for this semiparametric estimator, the bandwidth object contains parameters for the single index model and the (scalar) bandwidth for the index function.
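To make the model Y = G(XB) + epsilon concrete, the following base R sketch forms the index XB for a fixed coefficient vector and then smooths Y on that scalar index with a Gaussian-kernel local-constant estimator. It is an illustration only, not how np estimates the model: `nw_index` is a hypothetical helper, and both `beta` and the bandwidth `h` are set by hand here, whereas npindexbw estimates them from the data.

```r
# Hypothetical helper (not part of np): local-constant (Nadaraya-Watson)
# estimate of G at the index values `idx_eval`, given the training index
# `idx`, the response `y`, and a scalar bandwidth `h`.
nw_index <- function(idx_eval, idx, y, h) {
  sapply(idx_eval, function(z) {
    w <- dnorm((z - idx) / h)   # Gaussian kernel weights
    sum(w * y) / sum(w)         # weighted average of the responses
  })
}

set.seed(1)
n <- 200
X <- cbind(x1 = runif(n, -1, 1), x2 = runif(n, -1, 1))

# Assumption: beta is treated as known, with its first element normalized
# to one. In practice npindexbw() estimates it.
beta <- c(1, -1)

idx <- drop(X %*% beta)               # the single index XB
y <- sin(idx) + rnorm(n, sd = 0.1)    # Y = G(XB) + epsilon with G = sin

# Assumption: h = 0.2 is picked by hand for illustration only.
g_hat <- nw_index(idx, idx, y, h = 0.2)
```

Because the p-variate X enters only through the scalar index, the smoothing step is one-dimensional, which is why the bandwidth object carries a single scalar bandwidth alongside the coefficients.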
npindex(bws, ...)

## S3 method for class 'formula'
npindex(bws, data = NULL, newdata = NULL, y.eval = FALSE, ...)

## S3 method for class 'call'
npindex(bws, ...)

## Default S3 method:
npindex(bws, txdat, tydat, ...)

## S3 method for class 'sibandwidth'
npindex(bws,
        txdat = stop("training data 'txdat' missing"),
        tydat = stop("training data 'tydat' missing"),
        exdat,
        eydat,
        gradients = FALSE,
        residuals = FALSE,
        errors = FALSE,
        boot.num = 399,
        ...)
bws 
a bandwidth specification. This can be set as a

gradients 
a logical value indicating that you want gradients and the
asymptotic covariance matrix for beta computed and returned in the
resulting 
residuals 
a logical value indicating that you want residuals computed and
returned in the resulting 
errors 
a logical value indicating that you want (bootstrapped)
standard errors for the conditional mean, gradients (when

boot.num 
an integer specifying the number of bootstrap replications to use
when performing standard error calculations. Defaults to

... 
additional arguments supplied to specify the parameters to the

data 
an optional data frame, list or environment (or object
coercible to a data frame by 
newdata 
An optional data frame in which to look for evaluation data. If omitted, the training data are used. 
y.eval 
If 
txdat 
a p-variate data frame of explanatory data (training data) used to calculate the regression estimators. Defaults to the training data used to compute the bandwidth object. 
tydat 
a one (1) dimensional numeric or integer vector of dependent data, each
element i corresponding to each observation (row) i of

exdat 
a p-variate data frame of points on which the regression will be
estimated (evaluation data). By default,
evaluation takes place on the data provided by 
eydat 
a one (1) dimensional numeric or integer vector of the true values of the dependent variable. Optional, and used only to calculate the true errors. 
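As a sketch of how evaluation data might be supplied through the formula interface, the example below fits the bandwidth on a training data frame and then evaluates the model at three new points via `newdata`. The object names (`dat`, `new_pts`, `fit_new`) and the simulated data are illustrative assumptions; the arguments used are those listed in the usage above, and bandwidth selection may take some time.

```r
library(np)

set.seed(42)
n <- 100

# Simulated training data (illustrative only)
dat <- data.frame(x1 = runif(n, min = -1, max = 1),
                  x2 = runif(n, min = -1, max = 1))
dat$y <- dat$x1 - dat$x2 + rnorm(n)

# Coefficients and scalar bandwidth from the training data
bw <- npindexbw(formula = y ~ x1 + x2, data = dat)

# Evaluation points: a data frame with the same explanatory variables
new_pts <- data.frame(x1 = c(-0.5, 0, 0.5),
                      x2 = c(0.5, 0, -0.5))

# Evaluate the fitted single index model at the new points
fit_new <- npindex(bws = bw, data = dat, newdata = new_pts)

# Conditional mean estimates, one per row of new_pts
fit_new$mean
```

If `newdata` is omitted, the estimates are returned at the training points instead.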
A matrix of gradients along with average derivatives are computed and returned if gradients=TRUE is used.
npindex returns a npsingleindex object. The generic functions fitted, residuals, coef, vcov, se, predict, and gradients extract (or generate) estimated values, residuals, coefficients, variance-covariance matrix, bootstrapped standard errors on estimates, predictions, and gradients, respectively, from the returned object. Furthermore, the functions summary and plot support objects of this type. The returned object has the following components:
eval 
evaluation points 
mean 
estimates of the regression function (conditional mean) at the evaluation points 
beta 
the model coefficients 
betavcov 
the asymptotic covariance matrix for the model coefficients 
merr 
standard errors of the regression function estimates 
grad 
estimates of the gradients at each evaluation point 
gerr 
standard errors of the gradient estimates 
mean.grad 
mean (average) gradient over the evaluation points 
mean.gerr 
bootstrapped standard error of the mean gradient estimates 
R2 
if 
MSE 
if 
MAE 
if 
MAPE 
if 
CORR 
if 
SIGN 
if 
confusion.matrix 
if 
CCR.overall 
if 
CCR.byoutcome 
if 
fit.mcfadden 
if 
If you are using data of mixed types, then it is advisable to use the data.frame function to construct your input data and not cbind, since cbind will typically not work as intended on mixed data types and will coerce the data to the same type.
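The coercion described in this note can be seen in base R alone. The small example below uses made-up variables (`x_num`, `x_fac`) to show that cbind yields a single-type matrix in which a factor is reduced to its integer codes, while data.frame keeps each column's type.

```r
x_num <- c(1.5, 2.5, 3.5)
x_fac <- factor(c("a", "b", "a"))

# cbind() builds a matrix, which can hold only one type: the factor is
# reduced to its underlying integer codes and its levels are lost.
m <- cbind(x_num, x_fac)
is.numeric(m)        # TRUE: both columns are now plain numbers
m[, "x_fac"]         # codes 1, 2, 1 rather than "a", "b", "a"

# data.frame() keeps each column's own type.
d <- data.frame(x_num, x_fac)
is.factor(d$x_fac)   # TRUE: still a factor
```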
vcov requires that gradients=TRUE be set.
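The extractor functions named in the value section might be used along the following lines. This is a sketch on simulated data rather than a reference workflow: the model is fitted with gradients=TRUE so that vcov is available, and with residuals=TRUE so that residuals are stored in the returned object.

```r
library(np)

set.seed(42)
n <- 100
x1 <- runif(n, min = -1, max = 1)
x2 <- runif(n, min = -1, max = 1)
y <- x1 - x2 + rnorm(n)

# Bandwidth selection may take some time
bw <- npindexbw(formula = y ~ x1 + x2)

# gradients=TRUE is needed for vcov(); residuals=TRUE stores residuals
model <- npindex(bws = bw, gradients = TRUE, residuals = TRUE)

coef(model)             # model coefficients (first normalized to one)
vcov(model)             # asymptotic covariance matrix of the coefficients
head(fitted(model))     # conditional mean estimates
head(residuals(model))  # residuals at the training points
model$mean.grad         # average gradient over the evaluation points
```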
Tristen Hayfield tristen.hayfield@gmail.com, Jeffrey S. Racine racinej@mcmaster.ca
Aitchison, J. and C.G.G. Aitken (1976), “Multivariate binary discrimination by the kernel method,” Biometrika, 63, 413-420.
Doksum, K. and A. Samarov (1995), “Nonparametric estimation of global functionals and a measure of the explanatory power of covariates regression,” The Annals of Statistics, 23, 1443-1473.
Ichimura, H. (1993), “Semiparametric least squares (SLS) and weighted SLS estimation of single-index models,” Journal of Econometrics, 58, 71-120.
Klein, R. W. and R. H. Spady (1993), “An efficient semiparametric estimator for binary response models,” Econometrica, 61, 387-421.
Li, Q. and J.S. Racine (2007), Nonparametric Econometrics: Theory and Practice, Princeton University Press.
McFadden, D. and C. Puig and D. Kerschner (1977), “Determinants of the long-run demand for electricity,” Proceedings of the American Statistical Association (Business and Economics Section), 109-117.
Wang, M.C. and J. van Ryzin (1981), “A class of smooth estimators for discrete distributions,” Biometrika, 68, 301-309.
## Not run:

# EXAMPLE 1 (INTERFACE=FORMULA): Generate a simple linear model then
# estimate it using a semiparametric single index specification and
# Ichimura's nonlinear least squares coefficients and bandwidth
# (default). Also compute the matrix of gradients and average derivative
# estimates.

set.seed(12345)

n <- 100

x1 <- runif(n, min=-1, max=1)
x2 <- runif(n, min=-1, max=1)

y <- x1 - x2 + rnorm(n)

# Note - this may take a minute or two depending on the speed of your
# computer. Note also that the first element of the vector beta is
# normalized to one for identification purposes, and that X must contain
# at least one continuous variable.

bw <- npindexbw(formula=y~x1+x2)

summary(bw)

model <- npindex(bws=bw, gradients=TRUE)

summary(model)

# Sleep for 5 seconds so that we can examine the output...

Sys.sleep(5)

# Or you can visualize the input with plot.

plot(bw)

Sys.sleep(5)

# EXAMPLE 1 (INTERFACE=DATA FRAME): Generate a simple linear model then
# estimate it using a semiparametric single index specification and
# Ichimura's nonlinear least squares coefficients and bandwidth
# (default). Also compute the matrix of gradients and average derivative
# estimates.

set.seed(12345)

n <- 100

x1 <- runif(n, min=-1, max=1)
x2 <- runif(n, min=-1, max=1)

y <- x1 - x2 + rnorm(n)

X <- cbind(x1, x2)

# Note - this may take a minute or two depending on the speed of your
# computer. Note also that the first element of the vector beta is
# normalized to one for identification purposes, and that X must contain
# at least one continuous variable.

bw <- npindexbw(xdat=X, ydat=y)

summary(bw)

model <- npindex(bws=bw, gradients=TRUE)

summary(model)

# Sleep for 5 seconds so that we can examine the output...

Sys.sleep(5)

# Or you can visualize the input with plot.

plot(bw)

Sys.sleep(5)

# EXAMPLE 2 (INTERFACE=FORMULA): Generate a simple binary outcome linear
# model then estimate it using a semiparametric single index
# specification and Klein and Spady's likelihood-based coefficients and
# bandwidth (default). Also compute the matrix of gradients and average
# derivative estimates.

n <- 100

x1 <- runif(n, min=-1, max=1)
x2 <- runif(n, min=-1, max=1)

y <- ifelse(x1 + x2 + rnorm(n) > 0, 1, 0)

# Note that the first element of the vector beta is normalized to one
# for identification purposes, and that X must contain at least one
# continuous variable.

bw <- npindexbw(formula=y~x1+x2, method="kleinspady")

summary(bw)

model <- npindex(bws=bw, gradients=TRUE)

# Note that, since the outcome is binary, we can assess model
# performance using methods appropriate for binary outcomes. We look at
# the confusion matrix, various classification ratios, and McFadden et
# al's measure of predictive performance.

summary(model)

# Sleep for 5 seconds so that we can examine the output...

Sys.sleep(5)

# EXAMPLE 2 (INTERFACE=DATA FRAME): Generate a simple binary outcome
# linear model then estimate it using a semiparametric single index
# specification and Klein and Spady's likelihood-based coefficients and
# bandwidth (default). Also compute the matrix of gradients and average
# derivative estimates.

n <- 100

x1 <- runif(n, min=-1, max=1)
x2 <- runif(n, min=-1, max=1)

y <- ifelse(x1 + x2 + rnorm(n) > 0, 1, 0)

X <- cbind(x1, x2)

# Note that the first element of the vector beta is normalized to one
# for identification purposes, and that X must contain at least one
# continuous variable.

bw <- npindexbw(xdat=X, ydat=y, method="kleinspady")

summary(bw)

model <- npindex(bws=bw, gradients=TRUE)

# Note that, since the outcome is binary, we can assess model
# performance using methods appropriate for binary outcomes. We look at
# the confusion matrix, various classification ratios, and McFadden et
# al's measure of predictive performance.

summary(model)

# Sleep for 5 seconds so that we can examine the output...

Sys.sleep(5)

# EXAMPLE 3 (INTERFACE=FORMULA): Replicate the DGP of Klein & Spady
# (1993) (see their description on page 405, pay careful attention to
# footnote 6 on page 405).

set.seed(123)

n <- 1000

# x1 is chi-squared having 3 df truncated at 6 standardized by
# subtracting 2.348 and dividing by 1.511

x <- rchisq(n, df=3)
x1 <- (ifelse(x < 6, x, 6) - 2.348)/1.511

# x2 is normal (0, 1) truncated at +/- 2 divided by 0.8796

x <- rnorm(n)
x2 <- ifelse(abs(x) < 2 , x, 2) / 0.8796

# y is 1 if y* > 0, 0 otherwise.

y <- ifelse(x1 + x2 + rnorm(n) > 0, 1, 0)

# Compute the parameter vector and bandwidth. Note that the first
# element of the vector beta is normalized to one for identification
# purposes, and that X must contain at least one continuous variable.

bw <- npindexbw(formula=y~x1+x2, method="kleinspady")

# Next, create the evaluation data in order to generate a perspective
# plot

# Create an evaluation data matrix

x1.seq <- seq(min(x1), max(x1), length=50)
x2.seq <- seq(min(x2), max(x2), length=50)
X.eval <- expand.grid(x1=x1.seq, x2=x2.seq)

# Now evaluate the single index model on the evaluation data

fit <- fitted(npindex(exdat=X.eval,
                      eydat=rep(1, nrow(X.eval)),
                      bws=bw))

# Finally, coerce the fitted model into a matrix suitable for 3D
# plotting via persp()

fit.mat <- matrix(fit, 50, 50)

# Generate a perspective plot similar to Figure 2 b of Klein and Spady
# (1993)

persp(x1.seq,
      x2.seq,
      fit.mat,
      col="white",
      ticktype="detailed",
      expand=0.5,
      axes=FALSE,
      box=FALSE,
      main="Estimated Semiparametric Probability Perspective",
      theta=310,
      phi=25)

# EXAMPLE 3 (INTERFACE=DATA FRAME): Replicate the DGP of Klein & Spady
# (1993) (see their description on page 405, pay careful attention to
# footnote 6 on page 405).

set.seed(123)

n <- 1000

# x1 is chi-squared having 3 df truncated at 6 standardized by
# subtracting 2.348 and dividing by 1.511

x <- rchisq(n, df=3)
x1 <- (ifelse(x < 6, x, 6) - 2.348)/1.511

# x2 is normal (0, 1) truncated at +/- 2 divided by 0.8796

x <- rnorm(n)
x2 <- ifelse(abs(x) < 2 , x, 2) / 0.8796

# y is 1 if y* > 0, 0 otherwise.

y <- ifelse(x1 + x2 + rnorm(n) > 0, 1, 0)

# Create the X matrix

X <- cbind(x1, x2)

# Compute the parameter vector and bandwidth. Note that the first
# element of the vector beta is normalized to one for identification
# purposes, and that X must contain at least one continuous variable.

bw <- npindexbw(xdat=X, ydat=y, method="kleinspady")

# Next, create the evaluation data in order to generate a perspective
# plot

# Create an evaluation data matrix

x1.seq <- seq(min(x1), max(x1), length=50)
x2.seq <- seq(min(x2), max(x2), length=50)
X.eval <- expand.grid(x1=x1.seq, x2=x2.seq)

# Now evaluate the single index model on the evaluation data

fit <- fitted(npindex(exdat=X.eval,
                      eydat=rep(1, nrow(X.eval)),
                      bws=bw))

# Finally, coerce the fitted model into a matrix suitable for 3D
# plotting via persp()

fit.mat <- matrix(fit, 50, 50)

# Generate a perspective plot similar to Figure 2 b of Klein and Spady
# (1993)

persp(x1.seq,
      x2.seq,
      fit.mat,
      col="white",
      ticktype="detailed",
      expand=0.5,
      axes=FALSE,
      box=FALSE,
      main="Estimated Semiparametric Probability Perspective",
      theta=310,
      phi=25)

## End(Not run)