np.kernels: Kernel Functions Used In 'np'

np.kernels R Documentation

Kernel Functions Used In np

Description

Summary of continuous, unordered-categorical, and ordered-categorical kernels used by np (including higher-order continuous kernels and compact-support variants used in C-level code paths).

Details

Documentation guide: see np.options for global options and plot for plotting options.

Kernel option names used in np:

  • Continuous kernels: ckertype (and ckerorder, ckerbound where applicable).

  • Unordered kernels: ukertype.

  • Ordered kernels: okertype.

  • Conditional density/distribution bandwidth objects split kernel choices by response and regressor blocks: cykertype/cxkertype, uykertype/uxkertype, oykertype/oxkertype (with matching order/bound options for continuous kernels).

Let u = (x_i-x)/h for continuous variables.

Continuous kernels (called via ckertype):

K_{G,2}(u)=\phi(u)

K_{G,4}(u)=\left(\frac{3}{2}-\frac{1}{2}u^2\right)\phi(u)

K_{G,6}(u)=\left(\frac{15}{8}-\frac{5}{4}u^2+\frac{1}{8}u^4\right)\phi(u)

K_{G,8}(u)=\left(\frac{35}{16}-\frac{35}{16}u^2+\frac{7}{16}u^4-\frac{1}{48}u^6\right)\phi(u)

where \phi(u) is the standard normal density.

ckertype="gaussian" with ckerorder=2,4,6,8.
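The Gaussian-family formulas above can be checked numerically in base R. The following is a minimal sketch, not the package's C implementation: gauss_kernel is a hypothetical helper name introduced here for illustration, and the checks use only dnorm() and integrate().

```r
# Gaussian-family kernels of order 2, 4, 6, 8, transcribed from the formulas above.
# `gauss_kernel` is a hypothetical helper, not a function exported by np.
gauss_kernel <- function(u, order = 2) {
  poly <- switch(as.character(order),
    "2" = 1,
    "4" = 3/2 - u^2 / 2,
    "6" = 15/8 - 5/4 * u^2 + u^4 / 8,
    "8" = 35/16 - 35/16 * u^2 + 7/16 * u^4 - u^6 / 48,
    stop("order must be 2, 4, 6, or 8"))
  poly * dnorm(u)
}

# Each kernel should integrate to one over the real line.
mass <- sapply(c(2, 4, 6, 8), function(k)
  integrate(gauss_kernel, -Inf, Inf, order = k)$value)

# A fourth-order kernel has a vanishing second moment.
m2_order4 <- integrate(function(u) u^2 * gauss_kernel(u, order = 4),
                       -Inf, Inf)$value
```

Higher-order kernels take negative values for some u, which is why the resulting estimates are not guaranteed to be nonnegative.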

The compact-support Epanechnikov-family kernels implemented in C use support |u|<\sqrt{5}:

K_{E,2}(u)=\frac{3}{4\sqrt{5}}\left(1-\frac{u^2}{5}\right)\mathbf{1}(|u|<\sqrt{5})

K_{E,4}(u)=0.008385254916(-15+7u^2)(-5+u^2)\mathbf{1}(u^2<5)

K_{E,6}(u)=0.33541019662496845446\left(2.734375-3.28125u^2+0.721875u^4\right)\left(1-0.2u^2\right)\mathbf{1}(u^2<5)

K_{E,8}(u)=0.33541019662496845446\left(3.5888671875-7.8955078125u^2+4.1056640625u^4-0.5865234375u^6\right)\left(1-0.2u^2\right)\mathbf{1}(u^2<5)

ckertype="epanechnikov" with ckerorder=2,4,6,8.
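As a hedged illustration, the Epanechnikov-family formulas above can be transcribed into base R and checked for unit mass on |u|<\sqrt{5}. The helper name epan_kernel is hypothetical and is not part of np; the fourth-order case is written in the algebraically equivalent form (15/8 - 7u^2/8) times the second-order kernel.

```r
# Epanechnikov-family kernels on |u| < sqrt(5), transcribed from the formulas above.
# `epan_kernel` is a hypothetical helper, not a function exported by np.
epan_kernel <- function(u, order = 2) {
  # Second-order kernel: 3 / (4 * sqrt(5)) = 0.33541019662...
  base <- 3 / (4 * sqrt(5)) * (1 - u^2 / 5)
  poly <- switch(as.character(order),
    "2" = 1,
    # Same as 0.008385254916 * (-15 + 7 u^2) * (-5 + u^2) after factoring.
    "4" = 15/8 - 7/8 * u^2,
    "6" = 2.734375 - 3.28125 * u^2 + 0.721875 * u^4,
    "8" = 3.5888671875 - 7.8955078125 * u^2 +
          4.1056640625 * u^4 - 0.5865234375 * u^6,
    stop("order must be 2, 4, 6, or 8"))
  ifelse(u^2 < 5, poly * base, 0)
}

# Each kernel should integrate to one over its support.
mass <- sapply(c(2, 4, 6, 8), function(k)
  integrate(epan_kernel, -sqrt(5), sqrt(5), order = k)$value)

# The fourth-order kernel should have a vanishing second moment.
m2_order4 <- integrate(function(u) u^2 * epan_kernel(u, order = 4),
                       -sqrt(5), sqrt(5))$value
```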

Uniform (rectangular) kernel:

K_U(u)=\frac{1}{2}\mathbf{1}(|u|<1)

via ckertype="uniform" (order ignored).

Truncated-Gaussian (second-order) kernel via ckertype="truncated gaussian":

K_{TG,2}(u)=\left[\alpha\phi(u)-c_0\right]\mathbf{1}(|u|<b)

with default b=3 and internal constants \alpha and c_0 calibrated in C.

Bounded continuous-kernel normalization (ckerbound and, for conditional objects, cxkerbound/cykerbound) reuses the selected continuous kernel and renormalizes it on the declared support. For a base kernel K and support [a,b], the bounded kernel is

K_{[a,b]}(u;x,h)=\frac{K(u)}{\int_{(a-x)/h}^{(b-x)/h}K(t)dt}

with u=(x_i-x)/h. Option ckerbound="range" takes a and b from the sample range; ckerbound="fixed" uses user-supplied bounds via ckerlb/ckerub (or the corresponding cx*/cy* arguments). Infinite bounds recover the unbounded kernel. This support normalization follows the same finite-support principle as the Racine-Li-Yan ordered kernel and is useful when the data have non-negligible probability mass near the boundaries.
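The renormalization formula can be sketched in base R for the second-order Gaussian kernel, where the denominator has a closed form in terms of pnorm(). This is an illustration of the formula only, under the assumption that the kernel enters with the usual 1/h scaling; bounded_gauss is a hypothetical helper and does not reproduce the C code.

```r
# Bounded second-order Gaussian kernel on [a, b].
# `bounded_gauss` is a hypothetical helper, not a function exported by np.
bounded_gauss <- function(xi, x, h, a = -Inf, b = Inf) {
  u <- (xi - x) / h
  # Integral of phi(t) over [(a - x) / h, (b - x) / h].
  denom <- pnorm((b - x) / h) - pnorm((a - x) / h)
  dnorm(u) / denom
}

# Infinite bounds recover the ordinary Gaussian kernel.
same_as_unbounded <- all.equal(bounded_gauss(0.3, x = 0.1, h = 0.2),
                               dnorm((0.3 - 0.1) / 0.2))

# At the boundary point x = 0, the scaled kernel viewed as a function of x_i
# still carries unit mass on [0, 1] (assumed 1/h scaling).
h <- 0.2
mass_at_edge <- integrate(function(xi)
  bounded_gauss(xi, x = 0, h = h, a = 0, b = 1) / h, 0, 1)$value
```

Without the renormalization, roughly half of the kernel mass at x = 0 would fall outside [0,1], which is the source of boundary bias that the bounded kernel is meant to address.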

Typical bounded-kernel calls:

  ## Unconditional density on [0,1]
  bw <- npudensbw(dat=data.frame(x),
                  ckertype="gaussian",
                  ckerbound="fixed", ckerlb=0, ckerub=1)

  ## Regression with automatic sample-range bounds
  bw <- npregbw(xdat=data.frame(x), ydat=y, ckerbound="range")

  ## Conditional density with separate x/y support controls
  bw <- npcdensbw(xdat=data.frame(x), ydat=data.frame(y),
                  cxkerbound="fixed", cxkerlb=0, cxkerub=1,
                  cykerbound="range")
  

Unordered-categorical kernels (called via ukertype; for category count c):

L_{AA}(x_i,x;\lambda)=\mathbf{1}(x_i=x)(1-\lambda)+\mathbf{1}(x_i\neq x)\frac{\lambda}{c-1}

(Aitchison-Aitken) via ukertype="aitchisonaitken".

L_{LR,u}(x_i,x;\lambda)=\mathbf{1}(x_i=x)+\mathbf{1}(x_i\neq x)\lambda

(Li-Racine unordered kernel) via ukertype="liracine".
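A minimal base R sketch of the two unordered kernels follows. The function names aa_kernel and lr_unordered and the argument name ncat (the category count c) are hypothetical and chosen for illustration; they are not part of np.

```r
# Aitchison-Aitken kernel: weight 1 - lambda on a match,
# lambda / (ncat - 1) otherwise, where ncat is the category count c.
aa_kernel <- function(xi, x, lambda, ncat) {
  ifelse(xi == x, 1 - lambda, lambda / (ncat - 1))
}

# Li-Racine unordered kernel: weight 1 on a match, lambda otherwise.
lr_unordered <- function(xi, x, lambda) {
  ifelse(xi == x, 1, lambda)
}

cats <- c("a", "b", "c")

# Weights assigned to each category when evaluating at x = "b".
aa_weights <- aa_kernel(cats, "b", lambda = 0.3, ncat = length(cats))
lr_weights <- lr_unordered(cats, "b", lambda = 0.3)

# At lambda = (c - 1) / c the Aitchison-Aitken weights are uniform.
aa_uniform <- aa_kernel(cats, "b", lambda = 2 / 3, ncat = length(cats))
```

The Aitchison-Aitken weights sum to one across categories for any \lambda, whereas the Li-Racine weights do not; in both cases \lambda=0 reduces the kernel to the indicator \mathbf{1}(x_i=x).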

Ordered-categorical kernels (called via okertype):

L_{WvR}(x_i,x;\lambda)= \begin{cases} 1-\lambda, & x_i=x\\ \frac{1-\lambda}{2}\lambda^{|x_i-x|}, & x_i\neq x \end{cases}

(Wang-van Ryzin) via okertype="wangvanryzin".

L_{LR,o}(x_i,x;\lambda)=\lambda^{|x_i-x|}

(Li-Racine ordered kernel) via okertype="liracine".

L_{NLR,o}(x_i,x;\lambda)=\lambda^{|x_i-x|}\frac{1-\lambda}{1+\lambda}

(normalized Li-Racine ordered kernel; used internally).

L_{RLY}(x_i,x;\lambda)=\frac{\lambda^{|x_i-x|}}{\sum_{z\in\mathcal{S}(x)}\lambda^{|x_i-z|}}

(Racine-Li-Yan ordered kernel, normalized on support \mathcal{S}(x)), exposed as okertype="racineliyan".
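The ordered kernels can be sketched in base R as follows. All function names are hypothetical, and the Racine-Li-Yan helper takes the support as an explicit integer vector, which is an assumption made for this illustration rather than a description of how np determines \mathcal{S}(x).

```r
# Wang-van Ryzin kernel.
wvr_kernel <- function(xi, x, lambda) {
  d <- abs(xi - x)
  ifelse(d == 0, 1 - lambda, (1 - lambda) / 2 * lambda^d)
}

# Li-Racine ordered kernel and its normalized variant.
lr_ordered  <- function(xi, x, lambda) lambda^abs(xi - x)
nlr_ordered <- function(xi, x, lambda) {
  lambda^abs(xi - x) * (1 - lambda) / (1 + lambda)
}

# Racine-Li-Yan kernel for a scalar xi; `support` is an assumed
# explicit vector of support points.
rly_kernel <- function(xi, x, lambda, support) {
  lambda^abs(xi - x) / sum(lambda^abs(xi - support))
}

# On a finite support the Racine-Li-Yan weights sum to one.
support <- 0:4
rly_weights <- sapply(support, function(x)
  rly_kernel(xi = 1, x = x, lambda = 0.4, support = support))

# On the integers (truncated here to -200..200) the Wang-van Ryzin and
# normalized Li-Racine kernels also carry unit mass.
wvr_mass <- sum(wvr_kernel(-200:200, 0, lambda = 0.4))
nlr_mass <- sum(nlr_ordered(-200:200, 0, lambda = 0.4))
```

The unnormalized Li-Racine ordered kernel equals one at x_i=x for every \lambda, so its weights sum to more than one whenever \lambda>0.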

These univariate kernels are combined as generalized product kernels over mixed data types in the estimators and cross-validation criteria.
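As a rough sketch of the product construction, the weight for an observation with one continuous and one unordered component is the product of the two univariate kernels. The helper product_weight is hypothetical, uses the second-order Gaussian and Li-Racine unordered kernels, and assumes the usual 1/h scaling of the continuous kernel; it is not the package's implementation.

```r
# Product kernel weight for one continuous and one unordered variable.
# `product_weight` is a hypothetical helper, not a function exported by np.
product_weight <- function(xc_i, xc, h, xu_i, xu, lambda) {
  cont  <- dnorm((xc_i - xc) / h) / h        # Gaussian kernel, assumed 1/h scaling
  unord <- ifelse(xu_i == xu, 1, lambda)     # Li-Racine unordered kernel
  cont * unord
}

# Three observations, evaluated at the point (xc = 0.5, xu = "a").
xc_i <- c(0.1, 0.4, 0.9)
xu_i <- c("a", "b", "a")

# lambda = 0: observations in other categories receive zero weight.
w_split  <- product_weight(xc_i, 0.5, h = 0.3, xu_i, "a", lambda = 0)

# lambda = 1: the categorical variable is smoothed out entirely.
w_pooled <- product_weight(xc_i, 0.5, h = 0.3, xu_i, "a", lambda = 1)
```

The two extremes show why the categorical bandwidth matters: \lambda=0 amounts to estimating separately within each category, while \lambda=1 pools all categories.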

References

Aitchison, J. and Aitken, C. G. G. (1976), “Multivariate binary discrimination by the kernel method,” Biometrika, 63, 413–420.

Wang, M. C. and Van Ryzin, J. (1981), “A class of smooth estimators for discrete distributions,” Biometrika, 68, 301–309.

Li, Q. and Racine, J. S. (2007), Nonparametric Econometrics: Theory and Practice, Princeton University Press.

Racine, J. S. and Li, Q. (2004), “Nonparametric estimation of regression functions with both categorical and continuous data,” Journal of Econometrics, 119, 99–130.

Racine, J. S., Li, Q., and Yan, K. X. (2020), “Kernel Smoothed Probability Mass Functions for Ordered Datatypes,” Journal of Nonparametric Statistics, 32(3), 563–586. doi:10.1080/10485252.2020.1759595

Hall, P., Racine, J. S., and Li, Q. (2004), “Cross-validation and the estimation of conditional probability densities,” Journal of the American Statistical Association, 99, 1015–1026.

See Also

np.options, plot, npregbw, npudensbw, npudistbw, npcdensbw, npcdistbw, npksum.


np documentation built on May 3, 2026, 1:07 a.m.