| npqreg | R Documentation |
npqreg computes a kernel quantile regression estimate of a one
(1) dimensional dependent variable on p-variate explanatory
data, given a set of evaluation points, training points (consisting of
explanatory data and dependent data), and a bandwidth specification
using the methods of Li and Racine (2008) and Li, Lin and Racine
(2013). A bandwidth specification can be a condbandwidth object,
or a bandwidth vector, bandwidth type and kernel type.
npqreg(bws, ...)
## S3 method for class 'formula'
npqreg(bws, data = NULL, newdata = NULL, ...)
## S3 method for class 'condbandwidth'
npqreg(bws,
txdat = stop("training data 'txdat' missing"),
tydat = stop("training data 'tydat' missing"),
exdat,
tau = 0.5,
gradients = FALSE,
tol = 1.490116e-04,
small = 1.490116e-05,
itmax = 10000,
...)
## Default S3 method:
npqreg(bws, txdat, tydat, nomad = FALSE, ...)
## S3 method for class 'qregression'
predict(object, se.fit = FALSE, ...)
## S3 method for class 'qregression'
plot(x, ...)
These arguments identify the bandwidth specification, formula/data interface, and training data.
bws |
a bandwidth specification. This can be set as a |
data |
an optional data frame, list or environment (or object
coercible to a data frame by |
txdat |
a |
tydat |
a one (1) dimensional numeric or integer vector of dependent data, each
element |
object |
an object of class |
x |
an object of class |
This argument controls the recommended automatic local-polynomial NOMAD route, which jointly selects continuous polynomial degree and bandwidths when conditional-distribution bandwidths are computed inside npqreg.
nomad |
logical shortcut passed through to |
These arguments control where the quantile regression is evaluated and which fitted quantities are returned.
exdat |
a |
gradients |
a logical value indicating that you want gradients of the conditional
quantile with respect to the conditioning variables computed and returned
in the resulting |
newdata |
An optional data frame in which to look for evaluation data. If omitted, the training data are used. |
se.fit |
logical value. If |
tau |
a numeric scalar or vector specifying the quantile probability or
probabilities |
These arguments control the one-dimensional numerical quantile extraction step.
itmax |
integer maximum number of iterations allowed in the one-dimensional
quantile refinement. Defaults to |
small |
minimum interval width used by the one-dimensional quantile
refinement. Defaults to |
tol |
tolerance on the one-dimensional quantile location refinement.
Defaults to |
Further arguments are passed to the bandwidth-selection counterpart, prediction/evaluation route, or plot route as appropriate.
... |
additional arguments supplied to |
Documentation guide: see np.kernels for kernels, np.options for global options, and plot, plot.np for plotting options.
Given a conditional distribution bandwidth object, npqreg
estimates the conditional distribution function
F(y|x) and extracts the requested conditional quantile. For
0 < \tau < 1, the conditional quantile at probability
\tau is
q_\tau(x) = \inf\{y : F(y|x) \ge \tau\}.
Equivalently, q_\tau(x) is a quasi-inverse of the conditional
distribution in the sense of Nelsen (2006): for probabilities in the
range of F it agrees with an ordinary inverse of F, while outside that
range it is defined as the smallest y at which
F reaches or exceeds the requested probability. Numerically,
npqreg inverts the selected conditional distribution estimator
represented by bws. This includes the selected bandwidth type,
kernels, local-polynomial regression type, selected polynomial degree,
basis, and Bernstein-basis setting inherited from npcdistbw.
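As a minimal sketch of the generalized inverse defined above, the base R code below inverts a step-function CDF evaluated on a sorted grid of candidate y values. The helper name generalized_inverse and the toy data are illustrative assumptions; this is not the routine npqreg uses internally.

```r
# Hypothetical helper: q_tau = inf{y : F(y) >= tau} over a sorted grid of y values.
generalized_inverse <- function(y_grid, F_vals, tau) {
  stopifnot(length(y_grid) == length(F_vals), !is.unsorted(y_grid),
            tau > 0, tau < 1)
  idx <- which(F_vals >= tau)
  if (length(idx) == 0L) return(NA_real_)  # F never reaches tau on this grid
  y_grid[min(idx)]
}

# Toy CDF: the empirical CDF of five observations, evaluated at those observations.
y <- c(1, 2, 4, 7, 9)
F_hat <- ecdf(y)(y)   # 0.2 0.4 0.6 0.8 1.0

q_med <- generalized_inverse(y, F_hat, tau = 0.5)   # smallest y with F >= 0.5
q_low <- generalized_inverse(y, F_hat, tau = 0.4)   # tau lies in the range of F
```

When tau falls strictly between two steps of F (as with tau = 0.5 here), the result is the first grid point at which F jumps to or past tau; when tau is attained exactly, the result is the first grid point attaining it.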
If the bandwidth object was selected with nomad=TRUE, the returned
conditional-distribution bandwidth object is an LP object: its
regtype/regtype.engine metadata identify the selected
local-polynomial route and its degree/degree.engine metadata
record the selected continuous-coordinate polynomial degree. npqreg,
predict, and plot reuse this stored LP metadata;
plotting additional tau values does not recompute or downgrade the
selected degree.
The one-dimensional inversion is carried out over the observed support
of the dependent variable using the same selected conditional CDF
estimator that is later used for quantile standard errors and gradients.
The arguments tol, small, and itmax control this
one-dimensional refinement.
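The roles of tol, small, and itmax can be pictured with a generic bisection search. The base R sketch below is an illustration under stated assumptions, not npqreg's actual algorithm: bisect_quantile is a hypothetical helper, and its stopping rules only mimic how a location tolerance, a minimum interval width, and an iteration cap typically interact. The default values are copied from the usage shown earlier.

```r
# Hypothetical helper: bisection for the smallest q with cdf(q) >= tau on
# [lower, upper]. Assumes cdf is nondecreasing and cdf(upper) >= tau.
# The argument names tol, small, and itmax mirror npqreg's names only.
bisect_quantile <- function(cdf, tau, lower, upper,
                            tol = 1.490116e-04, small = 1.490116e-05,
                            itmax = 10000) {
  stopifnot(itmax >= 1, lower < upper, cdf(upper) >= tau)
  lo <- lower
  hi <- upper
  for (iter in seq_len(itmax)) {
    mid <- (lo + hi) / 2
    if (cdf(mid) >= tau) hi <- mid else lo <- mid
    width <- hi - lo
    # Stop on a relative location tolerance or an absolute minimum width.
    if (width <= tol * max(1, abs(mid)) || width <= small) break
  }
  list(quantile = hi, iterations = iter)
}

# Toy conditional CDF at a fixed x: Y | x ~ N(1, 2^2), searched over [-10, 10].
res <- bisect_quantile(function(y) pnorm(y, mean = 1, sd = 2),
                       tau = 0.75, lower = -10, upper = 10)
exact <- qnorm(0.75, mean = 1, sd = 2)
abs(res$quantile - exact)   # no larger than the final bracket width
```

In this sketch the bracket always contains the target quantile, so the error of the returned value is bounded by the final interval width; itmax only matters if neither width criterion is met first.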
Let f(y|x) = \partial F(y|x)/\partial y denote the conditional
density. The asymptotic standard error of the conditional quantile is
computed by the first-order delta method,
se\{\hat q_\tau(x)\} = \frac{se\{\hat F(\hat q_\tau(x)|x)\}}{\hat f(\hat q_\tau(x)|x)},
using the selected conditional distribution standard-error machinery and the selected conditional density evaluated at the fitted quantile. This corresponds to the quantile variance expression in Li, Lin and Racine (2013).
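To make the delta-method ratio concrete, the following base R sketch applies it in a simplified unconditional setting, where the CDF estimate is the empirical CDF and its standard error at the quantile is sqrt(tau (1 - tau) / n). The normal model, the sample size, and the variable names are assumptions chosen for illustration; npqreg instead uses the kernel-based standard error and conditional density selected through bws.

```r
# Assumed toy setting: Y ~ N(0, 1), n observations, no covariates.
n   <- 400
tau <- 0.75

q_tau <- qnorm(tau)                  # true quantile
se_F  <- sqrt(tau * (1 - tau) / n)   # standard error of the empirical CDF at q_tau
f_q   <- dnorm(q_tau)                # density at the quantile

# First-order delta method: se(q_hat) = se(F_hat(q)) / f(q).
se_q <- se_F / f_q
se_q
```

The division by the density shows why quantile standard errors grow in regions where the conditional density is small: a given uncertainty in the CDF translates into a wider range of y values.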
If gradients=TRUE, npqreg also computes gradients of
the conditional quantile with respect to the conditioning variables
for which gradients are defined. Differentiating
F(q_\tau(x)|x) = \tau gives
\nabla_x q_\tau(x) = -\frac{F_x(q_\tau(x)|x)}{f(q_\tau(x)|x)},
where F_x(y|x) is the derivative of the same selected conditional
distribution estimator with respect to x. For regtype="lc",
this uses the local-constant conditional-gradient machinery; for
regtype="ll" it uses the canonical local-polynomial degree-one
route; and for regtype="lp" it uses the selected or supplied
degree vector. The corresponding first-order gradient standard errors
are computed componentwise as
se\{\nabla_x \hat q_\tau(x)\} = \frac{se\{\hat F_x(\hat q_\tau(x)|x)\}}{\hat f(\hat q_\tau(x)|x)} .
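The implicit-differentiation formula for the quantile gradient can be checked on a model with a known answer. In the base R sketch below, the location-shift model Y | x ~ N(2x, 1) and all function names are illustrative assumptions; the conditional quantile is q_\tau(x) = 2x + qnorm(\tau), so the gradient should equal 2 at every x.

```r
# Assumed toy model: Y | x ~ N(2 * x, 1).
F_cond <- function(y, x) pnorm(y - 2 * x)   # conditional CDF F(y | x)
f_cond <- function(y, x) dnorm(y - 2 * x)   # conditional density f(y | x)

tau <- 0.25
x0  <- 1.5
q0  <- 2 * x0 + qnorm(tau)                  # conditional quantile at x0

# Central finite difference for F_x(y | x), the derivative of F in x.
h   <- 1e-5
F_x <- (F_cond(q0, x0 + h) - F_cond(q0, x0 - h)) / (2 * h)

# Gradient of the conditional quantile: -F_x / f.
grad_q <- -F_x / f_cond(q0, x0)
grad_q   # close to 2, up to finite-difference error
```

The minus sign matters: increasing x shifts the distribution to the right, which lowers F(y|x) at a fixed y (so F_x is negative) and raises the quantile.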
When npqreg is called without an explicit bws object, it
first computes conditional distribution bandwidths using
npcdistbw and stores them in the returned object's
bws component. If a scalar tau was used initially and
additional quantiles are later desired as fitted objects, reuse those
selected bandwidths directly, for example
npqreg(bws = fit$bws, tau = c(0.25, 0.5, 0.75)). If the goal
is only to inspect additional quantiles graphically, use
plot(fit, tau = c(0.25, 0.5, 0.75)); this reuses the stored
bandwidths and recomputes only the one-dimensional quantile extraction
step for the requested tau values. Vector-tau plots are
overlaid and include a legend; use legend=FALSE,
legend=NULL, or a legend=list(...) control to suppress
or customize it.
The predict method follows the usual S3
newdata convention. For formula fits, supply a data frame of
evaluation covariates via predict(fit, newdata=...). For
non-formula fits, newdata is translated to the native
evaluator argument exdat when exdat is not supplied.
The native exdat argument remains available for advanced
workflows and takes precedence if both newdata and
exdat are supplied. If tau is omitted in
predict, the fitted object's stored tau value is used.
npqreg returns a qregression object. The generic
functions fitted (or quantile),
se, predict, and gradients
extract (or generate) estimated values, asymptotic standard errors of
the estimates, predictions, and gradients, respectively, from the returned
object. predict uses the object's stored tau
value by default; supply tau= to override it. The functions
summary and plot also support objects of this
type. The returned object has the following components:
eval |
evaluation points |
quantile |
estimates of the quantile regression function
(conditional quantile) at the evaluation points. If |
quanterr |
asymptotic standard errors of the quantile
regression estimates, obtained from the conditional distribution
standard error and the estimated conditional density at the fitted
quantile. If |
quantgrad |
gradients of the conditional quantile with respect
to the conditioning variables at each evaluation point, when
|
quantgerr |
asymptotic standard errors for gradients, when
|
tau |
the quantile probability or probabilities computed |
The conditional quantile target is the generalized inverse
q_\tau(x)=\inf\{y:F(y\mid x)\ge \tau\} of the conditional
distribution. The standard errors and gradients described above are
first-order delta-method quantities evaluated using the same selected
conditional CDF, conditional density, bandwidths, kernels, and
local-polynomial degree inherited from the supplied
npcdistbw object.
For book-length derivations, see Li and Racine (2007), Chapter 6 Conditional CDF and Quantile Estimation, especially Sections 6.3-6.5, and Racine (2019), Chapter 4 Conditional Probability Density and Cumulative Distribution Functions. The quasi-inverse terminology follows Nelsen (2006).
If you are using data of mixed types, then it is advisable to use the
data.frame function to construct your input data and not
cbind, since cbind will typically not work as
intended on mixed data types and will coerce the data to the same
type.
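The difference between cbind and data.frame on mixed data can be seen with a small base R sketch. The toy values and variable names below are assumptions made for illustration and are not taken from the Italy data.

```r
# Toy mixed-type data: one ordered factor and one numeric variable.
year <- ordered(c(1951, 1952, 1951, 1953))
gdp  <- c(8.5, 9.1, 8.7, 10.2)

# cbind() builds a numeric matrix; the factor is reduced to its integer codes.
m <- cbind(year, gdp)
is.matrix(m)        # TRUE
m[, "year"]         # integer codes 1 2 1 3, not the original years

# data.frame() keeps each column's type.
d <- data.frame(year, gdp)
is.ordered(d$year)  # TRUE
is.numeric(d$gdp)   # TRUE
```

Because the ordered factor survives only in the data frame, passing the cbind result as explanatory data would cause the variable to be treated as continuous rather than ordered.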
Tristen Hayfield tristen.hayfield@gmail.com, Jeffrey S. Racine racinej@mcmaster.ca
Aitchison, J. and C.G.G. Aitken (1976), “Multivariate binary discrimination by the kernel method,” Biometrika, 63, 413-420.
Hall, P. and J.S. Racine and Q. Li (2004), “Cross-validation and the estimation of conditional probability densities,” Journal of the American Statistical Association, 99, 1015-1026.
Koenker, R. W. and G.W. Bassett (1978), “Regression quantiles,” Econometrica, 46, 33-50.
Koenker, R. (2005), Quantile Regression, Econometric Society Monograph Series, Cambridge University Press.
Li, Q. and J.S. Racine (2007), Nonparametric Econometrics: Theory and Practice, Princeton University Press.
Li, Q. and J.S. Racine (2008), “Nonparametric estimation of conditional CDF and quantile functions with mixed categorical and continuous data,” Journal of Business and Economic Statistics, 26, 423-434.
Li, Q. and J. Lin and J.S. Racine (2013), “Optimal bandwidth selection for nonparametric conditional distribution and quantile functions,” Journal of Business and Economic Statistics, 31, 57-65.
Nelsen, R.B. (2006), An Introduction to Copulas, Second Edition, Springer.
Wang, M.C. and J. van Ryzin (1981), “A class of smooth estimators for discrete distributions,” Biometrika, 68, 301-309.
np.kernels, np.options, plot, plot.np, quantreg
## Not run:
# EXAMPLE 1 (INTERFACE=FORMULA): For this example, we compute a
# bivariate nonparametric quantile regression estimate for Giovanni
# Baiocchi's Italian income panel (see Italy for details)
data("Italy")
with(Italy, {
# Compute conditional distribution bandwidths and extract three
# conditional quantiles using the same selected bandwidths.
model.q <- npqreg(gdp~ordered(year), tau=c(0.25, 0.50, 0.75))
# Plot the overlaid quantiles.
plot(model.q)
# If a scalar tau was used first, additional quantiles can reuse the
# selected bandwidths without recomputing cross-validation. Use npqreg()
# when the additional fitted values are needed as an object, or plot()
# when graphical inspection is all that is desired.
model.med <- npqreg(gdp~ordered(year), tau=0.50)
model.q <- npqreg(bws=model.med$bws, tau=c(0.25, 0.50, 0.75))
plot(model.med, tau=c(0.25, 0.50, 0.75))
})
# EXAMPLE 1 (INTERFACE=DATA FRAME): For this example, we compute a
# bivariate nonparametric quantile regression estimate for Giovanni
# Baiocchi's Italian income panel (see Italy for details)
data("Italy")
with(Italy, {
data <- data.frame(ordered(year), gdp)
# First, compute the cross-validated conditional distribution bandwidths (default method).
# Note - this may take a few minutes depending on the speed of your
# computer...
bw <- npcdistbw(xdat=ordered(year), ydat=gdp)
# Note - numerical search for computing the quantiles will take a
# minute or so...
model.q <- npqreg(bws=bw, tau=c(0.25, 0.50, 0.75))
plot(model.q)
})
## End(Not run)