nplsqreg: Location-Scale Kernel Quantile Regression with Mixed Data...
In np: Nonparametric Kernel Smoothing Methods for Mixed Data Types

nplsqreg

R Documentation

Location-Scale Kernel Quantile Regression with Mixed Data Types

Description

nplsqreg computes a location-scale kernel estimate of the conditional quantile function for a one-dimensional dependent variable and mixed continuous, unordered factor, and ordered factor explanatory data. Unlike npqreg, which obtains conditional quantiles by inverting an estimated conditional distribution, nplsqreg estimates the requested conditional quantile surface directly using a locally weighted quantile-kernel construction.

Usage

nplsqreg(bws, ...)

## S3 method for class 'formula'
nplsqreg(bws, data = NULL, newdata = NULL, tau = 0.5,
       gradients = FALSE, residuals = FALSE, subset, na.action,
       gradient.order = 1L, ...)

## S3 method for class 'lsqregressionbandwidth'
nplsqreg(bws, txdat = NULL, tydat = NULL,
       tau = bws$tau, ...)

## Default S3 method:
nplsqreg(bws,
       txdat = stop("training data 'txdat' missing"),
       tydat = stop("training data 'tydat' missing"),
       tau = 0.5,
       exdat,
       gradients = FALSE,
       residuals = FALSE,
       gradient.order = 1L,
       ...)

Arguments

Data, Bandwidth Inputs And Formula Interface

These arguments identify the bandwidth specification, formula/data interface, and training data.

`bws`	a formula, an `lsqregressionbandwidth` object returned by `nplsqregbw`, an `rbandwidth` object, a numeric bandwidth vector, or omitted for automatic bandwidth selection. Exact nplsqreg reuse is through the fitted object's `$bws` component; `$reg.bws` is internal regression state.
`data`	an optional data frame, list or environment containing the variables in the model. If not found in `data`, the variables are taken from `environment(bws)`.
`subset`	an optional vector specifying a subset of observations to be used by the formula method.
`na.action`	a function specifying the action to take when missing values are found by the formula method.
`txdat`	a `p`-variate data frame of explanatory data used as training data. Defaults to the training data stored in `bws`.
`tydat`	a one-dimensional numeric vector of dependent data. Defaults to the training response stored in `bws`.

Evaluation Data And Returned Quantities

These arguments control where the quantile regression is evaluated and which fitted quantities are returned.

`newdata`	an optional data frame in which to look for evaluation covariates for formula fits. If omitted, the training data are used.
`exdat`	a `p`-variate data frame of evaluation points. By default, evaluation takes place on `txdat`. The native `exdat` argument takes precedence over `newdata` when both are supplied.
`gradients`	a logical value indicating whether gradients and categorical effects of the conditional quantile with respect to the conditioning variables should be computed and returned. Defaults to `FALSE`.
`gradient.order`	for a final `regtype="lp"` fit with `gradients=TRUE`, a positive integer scalar or one positive integer per continuous predictor. Each entry requests that coordinate's derivative order; it does not request mixed partial derivatives. The default is the first derivative. Ordered and unordered predictors retain their existing first-difference effect semantics.
`residuals`	a logical value indicating whether residuals should be returned for training-data fits. Defaults to `FALSE`.

Quantile Index And Additional Controls

These arguments control the requested quantile probability and the bandwidth selection, prediction, or plotting route.

`tau`	a numeric scalar or vector specifying the quantile probability or probabilities `\tau`. Values must lie strictly in `(0,1)`.
`...`	additional arguments supplied to `nplsqregbw` when bandwidths are computed internally, to `npreg` for the final transformed-response fit, or to plotting and prediction methods as appropriate. Common bandwidth-selection controls include `regtype`, `bwtype`, `nmulti`, `degree`, `nomad`, `search.engine`, `tau.search`, `delta`, `scale`, `regtype.pilot`, `nomad.pilot`, and `pilot.args`.
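
The constraint on `tau` can be checked before a fit is attempted. The helper below is a minimal sketch in base R; `is_valid_tau` is a hypothetical name used for illustration and is not part of np.

```r
## Hypothetical helper, not part of np: checks a tau argument against the
## documented rule that every value lies strictly inside (0, 1).
is_valid_tau <- function(tau) {
  is.numeric(tau) && length(tau) >= 1L && !anyNA(tau) &&
    all(tau > 0 & tau < 1)
}

ok  <- is_valid_tau(c(0.25, 0.50, 0.75))  # interior values
bad <- is_valid_tau(c(0, 0.5))            # 0 lies on the boundary
```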

Details

The estimator follows the locally weighted quantile-kernel approach of Racine and Li (2017). Given a conditional scale pilot \hat\sigma(X_i), define

Y_i^\delta = Y_i + \hat\sigma(X_i)\Phi^{-1}(\delta),

where 0 < \delta < 1 and \Phi^{-1} is the standard normal quantile function. For a requested quantile probability \tau, nplsqregbw selects the bandwidths and \delta by leave-one-out check-loss cross-validation. With the selected bandwidths and \delta, nplsqreg then fits a kernel regression of Y_i^\delta on X_i using the ordinary mixed-data machinery in npreg. The fitted mean of the transformed response is the estimated conditional quantile.
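
The transformation and the check-loss criterion described above can be sketched in base R. This is an illustration under simplifying assumptions, not the nplsqregbw algorithm: the true scale stands in for the pilot, the bandwidth `h` is held fixed rather than searched, a hand-written Nadaraya-Watson smoother replaces npreg, and \delta is chosen from a coarse grid. The names `check_loss` and `nw_loo` are hypothetical.

```r
## Check (pinball) loss: rho_tau(u) = u * (tau - 1{u < 0})
check_loss <- function(u, tau) u * (tau - (u < 0))

## Leave-one-out Nadaraya-Watson fit with a Gaussian kernel
## (illustrative stand-in for the mixed-data machinery in npreg)
nw_loo <- function(x, y, h) {
  K <- outer(x, x, function(a, b) dnorm((a - b) / h))
  diag(K) <- 0                        # exclude each point from its own fit
  as.vector(K %*% y) / rowSums(K)
}

set.seed(42)
n <- 300
x <- runif(n)
sigma <- 0.5 + x                      # assumption: true scale used as the pilot
y <- sin(2 * pi * x) + sigma * rnorm(n)

tau <- 0.75
h <- 0.1                              # assumption: fixed bandwidth
delta_grid <- seq(0.05, 0.95, by = 0.05)

## Leave-one-out check-loss criterion over the delta grid
cv <- sapply(delta_grid, function(d) {
  y_delta <- y + sigma * qnorm(d)     # Y_i^delta = Y_i + sigma(X_i) * Phi^{-1}(delta)
  mean(check_loss(y - nw_loo(x, y_delta, h), tau))
})
delta_hat <- delta_grid[which.min(cv)]
```

The smoothed mean of `y_delta` at `delta_hat` then plays the role of the estimated conditional quantile at `tau`.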

The scale pilot is interpreted as a conditional standard deviation. The default pilot estimates the conditional mean, smooths squared residuals, floors the fitted variance before taking square roots, and then uses the resulting positive scale vector in the quantile-kernel transformation. The local-linear residual-variance pilot follows the idea of Fan and Yao (1998); regtype.pilot can be used to select the pilot regression type independently of the final quantile fit.
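
The steps of the default pilot can be sketched as follows. This is a rough illustration only: `loess` with `degree = 1` stands in for the kernel regressions that np would use, and the floor value is an arbitrary assumption rather than the package's actual rule.

```r
set.seed(1)
n <- 200
x <- runif(n)
y <- sin(2 * pi * x) + (0.5 + x) * rnorm(n)

## Step 1: estimate the conditional mean (local-linear stand-in)
mean_fit <- loess(y ~ x, degree = 1)

## Step 2: smooth the squared residuals to estimate the conditional variance
r2 <- residuals(mean_fit)^2
var_fit <- loess(r2 ~ x, degree = 1)

## Step 3: floor the fitted variance before taking square roots
var_floor <- 1e-8 * var(y)            # assumption: illustrative floor
sigma_hat <- sqrt(pmax(fitted(var_fit), var_floor))
```

Flooring matters because a smoothed variance can dip to zero or below in sparse regions, which would make the scale undefined.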

If tau has length greater than one, tau.search="full" performs a separate bandwidth/\delta search for each quantile while sharing the same pilot scale. The explicit tau.search="refined" route fits the central quantile first and warm-starts the remaining quantiles, recording the search order and warm-start provenance in the returned object. The conservative default is tau.search="full".

Gradients and categorical effects are those returned by the final npreg fit on the transformed response. Thus ordered-factor effects are finite-difference contrasts and unordered-factor effects follow the corresponding mixed-data regression semantics. For a local-polynomial fit, gradient.order is applied coordinate by coordinate to continuous predictors. Availability follows npreg: when at least one requested continuous derivative is available, components whose order exceeds their fitted polynomial degree are returned as NA with a warning; a request with no available continuous derivative component is rejected. No mixed partials, Hessians, derivative tensors, or multiple derivative orders are stored. With vector tau, the existing evaluation-by-coordinate-by-tau array layout is retained. The gradients method extracts exactly the stored order and rejects any different requested order; it does not recompute derivatives after fitting.
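
The availability rule for `gradient.order` can be mimicked with a small base-R helper. This is a sketch of the rule as described here, not np code: `gradient_availability` is a hypothetical name, `degree` is assumed to hold the fitted polynomial degree of each continuous predictor, and the messages are invented for illustration.

```r
## Hypothetical helper, not part of np: returns the derivative order kept
## for each continuous predictor, with NA where the request is unavailable.
gradient_availability <- function(degree, gradient.order = 1L) {
  stopifnot(length(gradient.order) %in% c(1L, length(degree)))
  ord <- rep_len(as.integer(gradient.order), length(degree))
  stopifnot(all(ord >= 1L))
  available <- ord <= degree          # order must not exceed the fitted degree
  if (!any(available))
    stop("no requested continuous derivative is available")
  if (!all(available))
    warning("orders above the fitted degree are returned as NA")
  ifelse(available, ord, NA_integer_)
}

## Degrees 2 and 1: a second-derivative request is available only for the
## first predictor, so the second component is NA (with a warning).
res <- suppressWarnings(gradient_availability(c(2L, 1L), 2L))
```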

Standard errors are the asymptotic standard errors from that final transformed-response regression, conditional on the selected bandwidths, pilot scale, and \delta. For vector tau, per-quantile fits are independent, so estimated quantile curves are not monotonized and may cross in finite samples.
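
Because crossing is possible, it can be useful to check fitted quantiles after a vector-tau fit. The helper below is a minimal base-R sketch; `crossing_rows` is a hypothetical name, and the input is assumed to be a matrix with one row per evaluation point and one column per tau, ordered by increasing tau.

```r
## Hypothetical helper, not part of np: returns the indices of evaluation
## points at which a higher quantile falls below a lower one.
crossing_rows <- function(q) {
  q <- as.matrix(q)
  if (ncol(q) < 2L) return(integer(0))
  d <- q[, -1, drop = FALSE] - q[, -ncol(q), drop = FALSE]
  which(apply(d < 0, 1, any))
}

## Toy fitted quantiles at tau = 0.25, 0.50, 0.75 for two evaluation points
q <- rbind(c(1.0, 1.5, 2.0),
           c(0.8, 0.7, 1.1))         # the median sits below the first quartile
crossed <- crossing_rows(q)
```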

Value

nplsqreg returns an object of class lsqregression. The generic functions fitted, quantile, se, predict, residuals, and gradients extract estimated conditional quantiles, asymptotic standard errors from the transformed-response regression, predictions, residuals when requested, and gradients or categorical effects. The functions summary, print, and plot support objects of this type. Local-polynomial fit objects retain the compact continuous-coordinate derivative order in $gradient.order.

Usage Issues

With data of mixed types, construct the input data with the data.frame function rather than cbind. cbind typically returns a matrix, which coerces every column to a single type, so factor and ordered-factor information is lost.
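
The difference is easy to see in base R. In this small sketch the variable names are illustrative only.

```r
x_num <- c(1.5, 2.0, 3.25)
x_fac <- factor(c("a", "b", "a"))
x_ord <- ordered(c("low", "high", "low"), levels = c("low", "high"))

## cbind builds a numeric matrix: factors collapse to their integer codes
bad <- cbind(x_num, x_fac, x_ord)

## data.frame keeps each column's own type
good <- data.frame(x_num, x_fac, x_ord)

bad_is_numeric_matrix <- is.matrix(bad) && is.numeric(bad)
good_keeps_types <- is.numeric(good$x_num) && is.factor(good$x_fac) &&
  is.ordered(good$x_ord)
```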

When reusing a bandwidth object without explicit training or evaluation data, the required data are recovered from object-retained state or from the stored call environment when available. This is a convenience for ordinary same-session use, not a guarantee that a saved object is self-contained. For reproducible replay after saveRDS() and readRDS() in a fresh R session, supply the required data arguments explicitly.

Author(s)

Tristen Hayfield tristen.hayfield@gmail.com, Jeffrey S. Racine racinej@mcmaster.ca

References

Fan, J. and Q. Yao (1998), “Efficient Estimation of Conditional Variance Functions in Stochastic Regression,” Biometrika, 85, 645-660. doi:10.1093/biomet/85.3.645

Racine, J.S. and K. Li (2017), “Nonparametric conditional quantile estimation: A locally weighted quantile kernel approach,” Journal of Econometrics, 201, 72-94. doi:10.1016/j.jeconom.2017.06.020

Racine, J.S. and I. Van Keilegom (2020), “A smooth nonparametric, multivariate, mixed-data location-scale test,” Journal of Business & Economic Statistics, 38, 784-795. doi:10.1080/07350015.2019.1574227

Examples

## Not run: 
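## Italian GDP panel shipped with np
data("Italy")

## Quartile fits with bandwidths computed internally; year is an ordered factor
model.q <- nplsqreg(gdp ~ ordered(year), data = Italy,
                    tau = c(0.25, 0.50, 0.75))
plot(model.q)

## Median fit, then reuse of its bandwidth object through $bws
model.med <- nplsqreg(gdp ~ ordered(year), data = Italy, tau = 0.50)
model.q2 <- nplsqreg(bws = model.med$bws, tau = 0.50)
plot(model.med, gradient = TRUE)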

## End(Not run)

np documentation built on July 15, 2026, 1:07 a.m.
