| np.kernels | R Documentation |
Summary of continuous, unordered-categorical, and ordered-categorical kernels used by np (including higher-order continuous kernels and compact-support variants used in C-level code paths).
Documentation guide: see np.options for global options
and plot for plotting options.
Kernel option names used in np:
Continuous kernels: ckertype (and ckerorder, ckerbound where applicable).
Unordered kernels: ukertype.
Ordered kernels: okertype.
Conditional density/distribution bandwidth objects split kernel
choices by response and regressor blocks:
cykertype/cxkertype,
uykertype/uxkertype,
oykertype/oxkertype
(with matching order/bound options for continuous kernels).
Let u=(x_i-x)/h for continuous variables, where x_i is a sample observation, x is the evaluation point, and h>0 is the bandwidth.
Continuous kernels (called via ckertype):
K_{G,2}(u)=\phi(u)
K_{G,4}(u)=\left(\frac{3}{2}-\frac{1}{2}u^2\right)\phi(u)
K_{G,6}(u)=\left(\frac{15}{8}-\frac{5}{4}u^2+\frac{1}{8}u^4\right)\phi(u)
K_{G,8}(u)=\left(\frac{35}{16}-\frac{35}{16}u^2+\frac{7}{16}u^4-\frac{1}{48}u^6\right)\phi(u)
where \phi(u) is the standard normal density.
These are selected via ckertype="gaussian" with ckerorder=2,4,6,8.
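As a quick sanity check on the Gaussian formulas above, the following base-R sketch codes them directly and verifies two defining properties numerically: each kernel integrates to one, and the higher-order kernels have vanishing low-order moments. The helper names gauss_kernel and kernel_moment are hypothetical and are not part of np; the package evaluates its kernels in C.

```r
# Higher-order Gaussian kernels, transcribed from the formulas above.
# gauss_kernel() and kernel_moment() are illustrative names, not np functions.
gauss_kernel <- function(u, order = 2) {
  poly <- switch(as.character(order),
    "2" = 1,
    "4" = 3/2 - u^2 / 2,
    "6" = 15/8 - 5 * u^2 / 4 + u^4 / 8,
    "8" = 35/16 - 35 * u^2 / 16 + 7 * u^4 / 16 - u^6 / 48,
    stop("order must be 2, 4, 6, or 8"))
  poly * dnorm(u)
}

# j-th moment of the kernel, computed by numerical quadrature.
kernel_moment <- function(j, order) {
  integrate(function(u) u^j * gauss_kernel(u, order), -Inf, Inf)$value
}

# Every kernel should integrate to one.
mass <- sapply(c(2, 4, 6, 8), function(nu) kernel_moment(0, nu))

# A kernel of order nu has zero moments of order 1, ..., nu - 1.
m2_order4 <- kernel_moment(2, 4)
m4_order6 <- kernel_moment(4, 6)
```

The second moment of the fourth-order kernel and the fourth moment of the sixth-order kernel should both be zero up to quadrature error, which is what distinguishes these kernels from the second-order \phi(u).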
The compact-support Epanechnikov-family kernels implemented in C use
support |u|<\sqrt{5}:
K_{E,2}(u)=\frac{3}{4\sqrt{5}}\left(1-\frac{u^2}{5}\right)\mathbf{1}(|u|<\sqrt{5})
K_{E,4}(u)=0.008385254916(-15+7u^2)(-5+u^2)\mathbf{1}(u^2<5)
K_{E,6}(u)=0.33541019662496845446\left(2.734375-3.28125u^2+0.721875u^4\right)\left(1-0.2u^2\right)\mathbf{1}(u^2<5)
K_{E,8}(u)=0.33541019662496845446\left(3.5888671875-7.8955078125u^2+4.1056640625u^4-0.5865234375u^6\right)\left(1-0.2u^2\right)\mathbf{1}(u^2<5)
These are selected via ckertype="epanechnikov" with ckerorder=2,4,6,8.
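The Epanechnikov-family formulas above can be checked the same way. The sketch below is a plain base-R transcription (the name epan_kernel is hypothetical, not an np function) that confirms each kernel integrates to one on |u|<\sqrt{5}, that the second-order kernel has unit variance on this support, and that the fourth-order kernel has a zero second moment.

```r
# Compact-support Epanechnikov-family kernels, transcribed from the formulas
# above. epan_kernel() is an illustrative name, not part of np.
epan_kernel <- function(u, order = 2) {
  c0 <- 0.33541019662496845446   # equals 3 / (4 * sqrt(5))
  k <- switch(as.character(order),
    "2" = c0 * (1 - u^2 / 5),
    "4" = 0.008385254916 * (-15 + 7 * u^2) * (-5 + u^2),
    "6" = c0 * (2.734375 - 3.28125 * u^2 + 0.721875 * u^4) * (1 - 0.2 * u^2),
    "8" = c0 * (3.5888671875 - 7.8955078125 * u^2 + 4.1056640625 * u^4 -
                0.5865234375 * u^6) * (1 - 0.2 * u^2),
    stop("order must be 2, 4, 6, or 8"))
  k * (u^2 < 5)                  # indicator of the support
}

# j-th moment over the support [-sqrt(5), sqrt(5)].
epan_moment <- function(j, order) {
  integrate(function(u) u^j * epan_kernel(u, order),
            lower = -sqrt(5), upper = sqrt(5))$value
}

mass_e      <- sapply(c(2, 4, 6, 8), function(nu) epan_moment(0, nu))
var_order2  <- epan_moment(2, 2)   # unit variance for the second-order kernel
m2_order4_e <- epan_moment(2, 4)   # zero second moment for the fourth-order kernel
```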
Uniform (rectangular) kernel:
K_U(u)=\frac{1}{2}\mathbf{1}(|u|<1)
via ckertype="uniform" (ckerorder is ignored).
Truncated-Gaussian (second-order) kernel via
ckertype="truncated gaussian":
K_{TG,2}(u)=\left[\alpha\phi(u)-c_0\right]\mathbf{1}(|u|<b)
with default b=3; the constants \alpha and c_0 are calibrated internally in the C code.
Bounded continuous-kernel normalization (ckerbound and, for
conditional objects, cxkerbound/cykerbound) reuses the
selected continuous kernel and renormalizes it on the declared support.
For a base kernel K and support [a,b], the bounded kernel is
K_{[a,b]}(u;x,h)=\frac{K(u)}{\int_{(a-x)/h}^{(b-x)/h}K(t)dt}
with u=(x_i-x)/h. Option ckerbound="range" uses
sample bounds for a,b; ckerbound="fixed" uses user-supplied
bounds via ckerlb/ckerub (or the corresponding
cx*/cy* arguments). Infinite bounds recover the unbounded
kernel. This support-normalization strategy follows the same finite-support
normalization principle as the Racine-Li-Yan ordered kernel described below and
is useful when the data have non-negligible probability mass near the boundaries.
Typical bounded-kernel calls:
## Illustrative data on [0,1] (an assumption; substitute your own x and y)
library(np)
set.seed(42)
x <- runif(100)
y <- x^2 + rnorm(100, sd=0.1)
## Unconditional density on [0,1]
bw <- npudensbw(dat=data.frame(x),
                ckertype="gaussian",
                ckerbound="fixed", ckerlb=0, ckerub=1)
## Regression with automatic sample-range bounds
bw <- npregbw(xdat=data.frame(x), ydat=y, ckerbound="range")
## Conditional density with separate x/y support controls
bw <- npcdensbw(xdat=data.frame(x), ydat=data.frame(y),
                cxkerbound="fixed", cxkerlb=0, cxkerub=1,
                cykerbound="range")
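To make the renormalization formula for K_{[a,b]} concrete, here is a minimal base-R sketch for a second-order Gaussian base kernel. It does not call np; bounded_gauss is a hypothetical helper, and the evaluation point and bandwidth are arbitrary illustrative values. It shows that, viewed as a function of x_i, the rescaled kernel K_{[a,b]}/h integrates to one over [a,b], whereas the unbounded kernel loses the mass that spills past the boundary.

```r
# Bounded Gaussian kernel, following the formula above: the base kernel K(u)
# is divided by the base-kernel mass on [(a - x)/h, (b - x)/h].
# bounded_gauss() is an illustrative name, not part of np.
bounded_gauss <- function(xi, x, h, a = -Inf, b = Inf) {
  u <- (xi - x) / h
  mass_on_support <- pnorm((b - x) / h) - pnorm((a - x) / h)
  dnorm(u) / mass_on_support
}

x0 <- 0.05   # hypothetical evaluation point close to the lower bound a = 0
h  <- 0.2    # hypothetical bandwidth

# With bounds [0, 1], the kernel weight integrates to one over the support.
mass_bounded <- integrate(
  function(xi) bounded_gauss(xi, x0, h, a = 0, b = 1) / h,
  lower = 0, upper = 1)$value

# With the default infinite bounds the kernel is the plain Gaussian, and
# part of its mass falls outside [0, 1].
mass_unbounded <- integrate(
  function(xi) bounded_gauss(xi, x0, h) / h,
  lower = 0, upper = 1)$value
```

Here mass_unbounded is well below one because roughly forty percent of the Gaussian centred at 0.05 lies below zero; the bounded version reallocates that mass to the declared support.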
Unordered-categorical kernels (called via ukertype; c is the number of
categories and \lambda is the bandwidth):
L_{AA}(x_i,x;\lambda)=\mathbf{1}(x_i=x)(1-\lambda)+\mathbf{1}(x_i\neq x)\frac{\lambda}{c-1}
(Aitchison-Aitken)
via ukertype="aitchisonaitken".
L_{LR,u}(x_i,x;\lambda)=\mathbf{1}(x_i=x)+\mathbf{1}(x_i\neq x)\lambda
(Li-Racine unordered kernel)
via ukertype="liracine".
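The two unordered kernels above are simple enough to write out directly. The base-R sketch below uses hypothetical helper names (not np functions) and a made-up three-level variable to show how the weights behave: the Aitchison-Aitken weights sum to one over the categories, while the Li-Racine weights equal one on a match and \lambda otherwise.

```r
# Unordered-categorical kernels, transcribed from the formulas above.
# Function names are illustrative and not part of np.
aitchison_aitken <- function(xi, x, lambda, n_cat) {
  ifelse(xi == x, 1 - lambda, lambda / (n_cat - 1))
}
li_racine_unordered <- function(xi, x, lambda) {
  ifelse(xi == x, 1, lambda)
}

cats <- c("a", "b", "c")   # hypothetical category labels, so c = 3

# Weight assigned to each possible x_i when the evaluation point is "a".
aa_weights <- aitchison_aitken(cats, "a", lambda = 0.3, n_cat = length(cats))
lr_weights <- li_racine_unordered(cats, "a", lambda = 0.3)

# lambda = 0 reduces both kernels to an indicator of equality.
aa_indicator <- aitchison_aitken(cats, "a", lambda = 0, n_cat = length(cats))

# lambda = (c - 1) / c makes the Aitchison-Aitken weights uniform (1 / c each).
aa_uniform <- aitchison_aitken(cats, "a", lambda = 2 / 3, n_cat = length(cats))
```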
Ordered-categorical kernels (called via okertype):
L_{WvR}(x_i,x;\lambda)=
\begin{cases}
1-\lambda, & x_i=x\\
\frac{1-\lambda}{2}\lambda^{|x_i-x|}, & x_i\neq x
\end{cases}
(Wang-van Ryzin)
via okertype="wangvanryzin".
L_{LR,o}(x_i,x;\lambda)=\lambda^{|x_i-x|}
(Li-Racine ordered kernel)
via okertype="liracine".
L_{NLR,o}(x_i,x;\lambda)=\lambda^{|x_i-x|}\frac{1-\lambda}{1+\lambda}
(normalized Li-Racine ordered kernel; used internally)
L_{RLY}(x_i,x;\lambda)=\frac{\lambda^{|x_i-x|}}{\sum_{z\in\mathcal{S}(x)}\lambda^{|x_i-z|}}
(Racine-Li-Yan ordered kernel, normalized on support \mathcal{S}(x)),
exposed as okertype="racineliyan".
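The ordered kernels above differ mainly in how they are normalized, which the following base-R sketch makes visible. The function names and the small integer support are hypothetical and the code is a direct transcription of the formulas, not the np implementation. The Wang-van Ryzin and normalized Li-Racine weights sum to one over all integers, the plain Li-Racine weights do not, and the Racine-Li-Yan weights sum to one over a finite support.

```r
# Ordered-categorical kernels, transcribed from the formulas above.
# Function names are illustrative and not part of np.
wang_van_ryzin <- function(xi, x, lambda) {
  d <- abs(xi - x)
  ifelse(d == 0, 1 - lambda, 0.5 * (1 - lambda) * lambda^d)
}
li_racine_ordered <- function(xi, x, lambda) {
  lambda^abs(xi - x)
}
li_racine_normalized <- function(xi, x, lambda) {
  lambda^abs(xi - x) * (1 - lambda) / (1 + lambda)
}
# xi is a single observation; x may be a vector of evaluation points.
racine_li_yan <- function(xi, x, lambda, support) {
  lambda^abs(xi - x) / sum(lambda^abs(xi - support))
}

lambda <- 0.4
wide   <- -200:200   # stands in for "all integers"; the tails are negligible

sum_wvr <- sum(wang_van_ryzin(wide, 0, lambda))
sum_nlr <- sum(li_racine_normalized(wide, 0, lambda))
sum_lr  <- sum(li_racine_ordered(wide, 0, lambda))   # (1 + lambda) / (1 - lambda)

support <- 0:4       # hypothetical finite support
sum_rly <- sum(racine_li_yan(xi = 1, x = support, lambda, support))
```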
These univariate kernels are combined as generalized product kernels over mixed data types in the estimators and cross-validation criteria.
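As an illustration of the product-kernel idea, the sketch below builds a joint density estimate for one continuous and one unordered variable by multiplying a second-order Gaussian kernel with an Aitchison-Aitken kernel and averaging over the sample. The toy data, the bandwidths, and the name mixed_density are all assumptions made for this example; in practice the np estimators handle the kernel products and bandwidth selection.

```r
# Hypothetical toy sample: one continuous and one unordered variable.
xc   <- c(-1.2, -0.4, 0.1, 0.3, 0.9, 1.5)
zu   <- c("a", "b", "a", "c", "b", "a")
cats <- c("a", "b", "c")

# Product-kernel joint density estimate at (x0, z0): Gaussian kernel for xc
# times Aitchison-Aitken kernel for zu, averaged over the sample.
# The bandwidths h and lambda are arbitrary illustrative values.
mixed_density <- function(x0, z0, h = 0.5, lambda = 0.2) {
  k_cont  <- dnorm((xc - x0) / h) / h
  k_unord <- ifelse(zu == z0, 1 - lambda, lambda / (length(cats) - 1))
  mean(k_cont * k_unord)
}

# Total mass: sum over the categories and a Riemann sum over a grid in x.
step <- 0.01
grid <- seq(-8, 8, by = step)
total_mass <- sum(sapply(cats, function(z0) {
  sum(sapply(grid, function(x0) mixed_density(x0, z0))) * step
}))
```

Because each univariate kernel is a proper probability weight in its own argument, the product estimate has total mass one; further variables, including ordered ones, enter as additional factors in the same product.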
Aitchison, J. and Aitken, C. G. G. (1976), “Multivariate binary discrimination by the kernel method,” Biometrika, 63, 413–420.
Wang, M. C. and Van Ryzin, J. (1981), “A class of smooth estimators for discrete distributions,” Biometrika, 68, 301–309.
Li, Q. and Racine, J. S. (2007), Nonparametric Econometrics: Theory and Practice, Princeton University Press.
Racine, J. S. and Li, Q. (2004), “Nonparametric estimation of regression functions with both categorical and continuous data,” Journal of Econometrics, 119, 99–130.
Racine, J. S., Li, Q., and Yan, K. X. (2020), “Kernel Smoothed Probability Mass Functions for Ordered Datatypes,” Journal of Nonparametric Statistics, 32(3), 563–586. doi:10.1080/10485252.2020.1759595
Hall, P., Racine, J. S., and Li, Q. (2004), “Cross-validation and the estimation of conditional probability densities,” Journal of the American Statistical Association, 99, 1015–1026.
See also: np.options, plot, npregbw, npudensbw, npudistbw, npcdensbw, npcdistbw, npksum.