npquantile | R Documentation |
npquantile computes smooth quantiles from a univariate unconditional
kernel cumulative distribution estimate given data and, optionally, a
bandwidth specification (i.e. a dbandwidth object), using the bandwidth
selection method of Li, Li and Racine (2017).
npquantile(x = NULL,
tau = c(0.01,0.05,0.25,0.50,0.75,0.95,0.99),
num.eval = 10000,
bws = NULL,
f = 1,
...)
x |
a univariate vector of type |
tau |
an optional vector containing the probabilities for quantile(s) to
be estimated (must contain numbers in |
num.eval |
an optional integer specifying the length of the grid on which the
quasi-inverse is computed. Defaults to |
bws |
an optional |
f |
an optional argument fed to |
... |
additional arguments supplied to specify the bandwidth type, kernel
types, bandwidth selection methods, and so on. See
|
Typical usage is
x <- rchisq(100,df=10)
npquantile(x)
The quantile function q_\tau is defined to be the left-continuous
inverse of the distribution function F(x), i.e.
q_\tau = \inf\{x: F(x) \ge \tau\}.
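To make the definition concrete, the following minimal sketch applies it to the empirical distribution function using base R only. The helper name q_inf is hypothetical and introduced purely for illustration; the result should agree with quantile(x, tau, type = 1), which R documents as the inverse of the empirical distribution function.

```r
## Left-continuous inverse of the empirical distribution function:
## q_tau = inf{x : F(x) >= tau}
set.seed(42)
x <- rchisq(100, df = 10)
Fn <- ecdf(x)        # empirical CDF (a step function)
xs <- sort(x)

## Smallest observation at which the empirical CDF reaches p
## (hypothetical helper, for illustration only)
q_inf <- function(p) xs[which(Fn(xs) >= p)[1]]

tau <- c(0.05, 0.25, 0.50, 0.75, 0.95)
q_manual <- sapply(tau, q_inf)

## Compare with the type 1 sample quantile
q_type1 <- quantile(x, tau, type = 1, names = FALSE)
cbind(tau, q_manual, q_type1)
```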
A traditional estimator of q_\tau is the \tau th sample quantile.
However, these estimates suffer from lack of efficiency arising from
the variability of individual order statistics; see Sheather and Marron
(1990) and Hyndman and Fan (1996) for methods that interpolate/smooth
the order statistics. Each of the methods discussed in the latter can
be invoked through quantile via type=j, j=1,...,9.
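As a quick illustration of the nine sample quantile definitions, the sketch below tabulates them side by side with base R. The object name q_by_type is an assumption made for this example.

```r
set.seed(42)
x <- rchisq(100, df = 10)
tau <- c(0.05, 0.50, 0.95)

## One row per Hyndman and Fan (1996) definition, type = 1, ..., 9
q_by_type <- t(sapply(1:9, function(j) {
  quantile(x, tau, type = j, names = FALSE)
}))
dimnames(q_by_type) <- list(paste0("type", 1:9), paste0("tau=", tau))
q_by_type
```

The row labelled type7 corresponds to the default used by quantile.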
The function npquantile implements a method for estimating smooth
quantiles based on the quasi-inverse of an npudist object, where F(x)
is replaced with its kernel estimator and bandwidth selection is that
appropriate for such objects; see Definition 2.3.6, page 21, Nelsen
(2006) for a definition of the quasi-inverse of F(x).
For construction of the quasi-inverse we create a grid of evaluation
points based on the function extendrange along with the sample
quantiles themselves computed from invocation of quantile. The
coarseness of the grid defined by extendrange (which has been passed
the option f=1) is controlled by num.eval.
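The grid construction and inversion can be sketched in base R as follows. This is one possible reading of the description, not the package's actual implementation: the Gaussian kernel, the rule-of-thumb bandwidth h, the choice of which sample quantiles join the grid, and the helper quasi_inverse are all assumptions made for illustration, whereas npquantile itself relies on npudist and its own bandwidth selection.

```r
set.seed(42)
x <- rchisq(100, df = 10)
tau <- c(0.01, 0.05, 0.25, 0.50, 0.75, 0.95, 0.99)
num.eval <- 1000   # smaller than the default of 10000, for speed

## Evaluation grid: extended data range plus sample quantiles
er <- extendrange(x, f = 1)
grid <- sort(c(seq(er[1], er[2], length.out = num.eval),
               quantile(x, seq(0, 1, length.out = num.eval),
                        names = FALSE)))

## Gaussian-kernel CDF estimate on the grid
## (h is an assumed rule-of-thumb bandwidth, not a cross-validated one)
h <- 1.587 * sd(x) * length(x)^(-1/3)
F_hat <- sapply(grid, function(g) mean(pnorm((g - x) / h)))

## Quasi-inverse on the grid: first grid point where F_hat reaches p,
## falling back to the largest grid point if F_hat never reaches p
quasi_inverse <- function(p) {
  idx <- which(F_hat >= p)
  if (length(idx) == 0) grid[length(grid)] else grid[idx[1]]
}

q_smooth <- sapply(tau, quasi_inverse)
cbind(tau, q_smooth, q_sample = quantile(x, tau, names = FALSE))
```

A finer grid (larger num.eval) reduces the discretization error of the inversion at the cost of more kernel evaluations.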
Note that for any value of \tau less/greater than the smallest/largest
value of F(x) computed for the evaluation data (i.e. that outlined in
the paragraph above), the quantile returned for such values is that
associated with the smallest/largest value of F(x), respectively.
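This boundary behaviour can be illustrated with a deliberately narrow grid on which the distribution function reaches neither 0 nor 1. In the sketch below pnorm stands in for a kernel estimate of F(x), and the helper clamp_quantile is hypothetical.

```r
## A narrow grid: F stays well inside (0, 1) on it
grid <- seq(-1, 1, by = 0.5)
F_grid <- pnorm(grid)   # stand-in for a kernel CDF estimate
range(F_grid)

## Grid-based inverse that returns the end points for out-of-range tau
clamp_quantile <- function(p) {
  idx <- which(F_grid >= p)
  if (length(idx) == 0) max(grid) else grid[idx[1]]
}

## tau = 0.01 lies below min(F_grid), tau = 0.99 above max(F_grid)
q_clamped <- sapply(c(0.01, 0.40, 0.99), clamp_quantile)
q_clamped   # -1  0  1
```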
npquantile returns a vector of quantiles corresponding to tau.
Cross-validated bandwidth selection is used by default (npudistbw).
For large datasets this can be computationally demanding. In such
cases one might instead consider a rule-of-thumb bandwidth
(bwmethod="normal-reference") or, alternatively, use kd-trees
(options(np.tree=TRUE)) along with a bounded kernel
(ckertype="epanechnikov"), both of which will reduce the computational
burden appreciably.
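A hedged sketch of the two alternatives follows. It assumes the np package is installed and that bwmethod and ckertype are passed through the ... argument of npquantile as described in the arguments section; the object names q_nr and q_tree are illustrative only.

```r
q_nr <- NULL
q_tree <- NULL

if (requireNamespace("np", quietly = TRUE)) {
  suppressPackageStartupMessages(library(np))
  set.seed(42)
  x <- rchisq(500, df = 10)
  tau <- c(0.25, 0.50, 0.75)

  ## Option 1: rule-of-thumb bandwidth instead of cross-validation
  q_nr <- npquantile(x, tau = tau, bwmethod = "normal-reference")

  ## Option 2: kd-trees combined with a bounded kernel
  old <- options(np.tree = TRUE)
  q_tree <- npquantile(x, tau = tau, ckertype = "epanechnikov")
  options(old)   # restore the previous setting

  print(rbind(q_nr, q_tree))
}
```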
Tristen Hayfield tristen.hayfield@gmail.com, Jeffrey S. Racine racinej@mcmaster.ca
Cheng, M.-Y. and Sun, S. (2006), “Bandwidth selection for kernel quantile estimation,” Journal of the Chinese Statistical Association, 44, 271-295.
Hyndman, R.J. and Fan, Y. (1996), “Sample quantiles in statistical packages,” American Statistician, 50, 361-365.
Li, Q. and J.S. Racine (2017), “Smooth Unconditional Quantile Estimation,” Manuscript.
Li, C. and H. Li and J.S. Racine (2017), “Cross-Validated Mixed Datatype Bandwidth Selection for Nonparametric Cumulative Distribution/Survivor Functions,” Econometric Reviews, 36, 970-987.
Nelsen, R.B. (2006), An Introduction to Copulas, Second Edition, Springer-Verlag.
Sheather, S. and J.S. Marron (1990), “Kernel quantile estimators,” Journal of the American Statistical Association, 85, 410-416.
Yang, S.-S. (1985), “A Smooth Nonparametric Estimator of a Quantile Function,” Journal of the American Statistical Association, 80, 1004-1011.
quantile for various types of sample quantiles; ecdf for empirical
distributions of which quantile is an inverse; boxplot.stats and
fivenum for computing other versions of quartiles; qlogspline for
logspline density quantiles; qkde for alternative kernel quantiles,
etc.
## Not run:
## Simulate data from a chi-square distribution
df <- 50
x <- rchisq(100,df=df)
## Vector of quantiles desired
tau <- c(0.01,0.05,0.25,0.50,0.75,0.95,0.99)
## Compute kernel smoothed sample quantiles
npquantile(x,tau)
## Compute sample quantiles using the default method in R (Type 7)
quantile(x,tau)
## True quantiles based on known distribution
qchisq(tau,df=df)
## End(Not run)