np.kernels: Kernel Functions Used In 'np'
In np: Nonparametric Kernel Smoothing Methods for Mixed Data Types

Description

Summary of continuous, unordered-categorical, and ordered-categorical kernels used by np (including higher-order continuous kernels and compact-support variants used in C-level code paths).

Details

Documentation guide: see np.options for global options and plot for plotting options.

Kernel option names used in np:

Continuous kernels: ckertype (and ckerorder, ckerbound where applicable).
Unordered kernels: ukertype.
Ordered kernels: okertype.
Conditional density/distribution bandwidth objects split kernel choices by response and regressor blocks: cykertype/cxkertype, uykertype/uxkertype, oykertype/oxkertype (with matching order/bound options for continuous kernels).

Let u = (x_i-x)/h for continuous variables.

Continuous kernels (called via ckertype):

K_{G,2}(u)=\phi(u)

K_{G,4}(u)=\left(\frac{3}{2}-\frac{1}{2}u^2\right)\phi(u)

K_{G,6}(u)=\left(\frac{15}{8}-\frac{5}{4}u^2+\frac{1}{8}u^4\right)\phi(u)

K_{G,8}(u)=\left(\frac{35}{16}-\frac{35}{16}u^2+\frac{7}{16}u^4-\frac{1}{48}u^6\right)\phi(u)

where \phi(u) is the standard normal density.

Selected via ckertype="gaussian" with ckerorder=2, 4, 6, or 8.
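As an informal check of the Gaussian formulas above, the following base-R sketch integrates each kernel numerically. It is not part of np: the helper names gauss_kernel and kernel_moment are hypothetical. Each kernel should have unit mass, and the second moment should vanish for orders above two.

```r
# Hypothetical helper (not an np function): Gaussian kernels of order 2, 4, 6, 8,
# transcribed from the formulas above.
gauss_kernel <- function(u, order = 2) {
  poly <- switch(as.character(order),
    "2" = 1,
    "4" = 3/2 - u^2 / 2,
    "6" = 15/8 - 5/4 * u^2 + u^4 / 8,
    "8" = 35/16 - 35/16 * u^2 + 7/16 * u^4 - u^6 / 48,
    stop("order must be 2, 4, 6, or 8"))
  poly * dnorm(u)
}

# p-th moment of the kernel, by numerical integration over the real line.
kernel_moment <- function(order, p) {
  integrate(function(u) u^p * gauss_kernel(u, order), -Inf, Inf)$value
}

orders <- c(2, 4, 6, 8)
mass <- sapply(orders, kernel_moment, p = 0)  # each should be close to 1
m2   <- sapply(orders, kernel_moment, p = 2)  # close to 1 for order 2, else close to 0
print(round(cbind(order = orders, mass = mass, second_moment = m2), 6))
```

The vanishing second moment is what distinguishes the higher-order kernels from the second-order one; it also means they must take negative values somewhere.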

The compact-support Epanechnikov-family kernels implemented in C use support |u|<\sqrt{5}:

K_{E,2}(u)=\frac{3}{4\sqrt{5}}\left(1-\frac{u^2}{5}\right)\mathbf{1}(|u|<\sqrt{5})

K_{E,4}(u)=0.008385254916(-15+7u^2)(-5+u^2)\mathbf{1}(u^2<5)

K_{E,6}(u)=0.33541019662496845446\left(2.734375-3.28125u^2+0.721875u^4\right)\left(1-0.2u^2\right)\mathbf{1}(u^2<5)

K_{E,8}(u)=0.33541019662496845446\left(3.5888671875-7.8955078125u^2+4.1056640625u^4-0.5865234375u^6\right)\left(1-0.2u^2\right)\mathbf{1}(u^2<5)

Selected via ckertype="epanechnikov" with ckerorder=2, 4, 6, or 8.
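The Epanechnikov-family constants above can be sanity-checked the same way. This base-R sketch transcribes the four formulas into a hypothetical helper, epan_kernel (not an np function), and integrates over the support |u|<\sqrt{5}.

```r
# Hypothetical helper (not an np function): Epanechnikov-family kernels on
# |u| < sqrt(5), transcribed from the formulas above.
epan_kernel <- function(u, order = 2) {
  c0 <- 0.33541019662496845446  # equals 3 / (4 * sqrt(5))
  k <- switch(as.character(order),
    "2" = c0 * (1 - u^2 / 5),
    "4" = 0.008385254916 * (-15 + 7 * u^2) * (-5 + u^2),
    "6" = c0 * (2.734375 - 3.28125 * u^2 + 0.721875 * u^4) * (1 - 0.2 * u^2),
    "8" = c0 * (3.5888671875 - 7.8955078125 * u^2 + 4.1056640625 * u^4 -
                0.5865234375 * u^6) * (1 - 0.2 * u^2),
    stop("order must be 2, 4, 6, or 8"))
  k * (u^2 < 5)  # indicator of the compact support
}

# p-th moment over the support.
epan_moment <- function(order, p) {
  integrate(function(u) u^p * epan_kernel(u, order), -sqrt(5), sqrt(5))$value
}

orders <- c(2, 4, 6, 8)
epan_mass <- sapply(orders, epan_moment, p = 0)  # each should be close to 1
epan_m2   <- sapply(orders, epan_moment, p = 2)  # close to 1 for order 2, else close to 0
print(round(cbind(order = orders, mass = epan_mass, second_moment = epan_m2), 6))
```

The support \sqrt{5} is the scaling under which the second-order kernel has unit variance, which is why its second moment comes out as one rather than 1/5.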

Uniform (rectangular) kernel:

K_U(u)=\frac{1}{2}\mathbf{1}(|u|<1)

via ckertype="uniform" (order ignored).

Truncated-Gaussian (second-order) kernel via ckertype="truncated gaussian":

K_{TG,2}(u)=\left[\alpha\phi(u)-c_0\right]\mathbf{1}(|u|<b)

with default b=3 and the internal constants \alpha and c_0 calibrated in C.

Bounded continuous-kernel normalization (ckerbound and, for conditional objects, cxkerbound/cykerbound) reuses the selected continuous kernel and renormalizes it on the declared support. For a base kernel K and support [a,b], the bounded kernel is

K_{[a,b]}(u;x,h)=\frac{K(u)}{\int_{(a-x)/h}^{(b-x)/h}K(t)dt}

with u=(x_i-x)/h. Option ckerbound="range" uses the sample range (minimum and maximum) for a and b; ckerbound="fixed" uses user-supplied bounds via ckerlb/ckerub (or the corresponding cx*/cy* arguments). Infinite bounds recover the unbounded kernel. This strategy follows the same finite-support normalization principle as the Racine-Li-Yan ordered kernel and is useful when the data have non-negligible probability mass near the boundaries.
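To make the renormalization concrete, here is a minimal base-R sketch for the second-order Gaussian kernel. The function name bounded_gauss is hypothetical, and dividing by h to put the kernel on the density scale is an assumption made for the check, not something taken from the np source.

```r
# Hypothetical helper (not an np function): Gaussian kernel renormalized on [a, b].
# Implements K(u) / integral_{(a-x)/h}^{(b-x)/h} K(t) dt with u = (xi - x) / h.
bounded_gauss <- function(xi, x, h, a, b) {
  u <- (xi - x) / h
  mass <- pnorm((b - x) / h) - pnorm((a - x) / h)  # kernel mass inside the support
  dnorm(u) / mass
}

x0 <- 0.1  # evaluation point close to the lower boundary
h0 <- 0.2

# On the density scale (divided by h, an assumption of this sketch) the bounded
# kernel integrates to one over x_i in [a, b]; the plain kernel does not.
bounded_total <- integrate(function(xi) bounded_gauss(xi, x0, h0, 0, 1) / h0, 0, 1)$value
plain_total   <- integrate(function(xi) dnorm((xi - x0) / h0) / h0, 0, 1)$value
print(c(bounded = bounded_total, plain = plain_total))

# Infinite bounds recover the unbounded kernel.
unbounded <- bounded_gauss(0.3, x0, h0, -Inf, Inf)
print(c(unbounded = unbounded, plain_kernel = dnorm((0.3 - x0) / h0)))
```

The gap between the two totals is the kernel mass that leaks past the boundary, which is what the renormalization restores.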

Computationally, let D(x,h) denote the normalizing integral in the denominator above, and let H_K(r)=\int_0^r K(t)dt and S_K(r)=H_K(r)/r, with S_K(0)=K(0) defined by continuity. For finite bounds the implementation evaluates the scaled normalization mass directly as

hD(x,h)=(x-a)S_K((x-a)/h)+(b-x)S_K((b-x)/h).

Each supported kernel uses its analytic centered primitive. This avoids subtracting nearly equal whole-kernel CDF values and avoids constructing a normalization mass that vanishes numerically as h grows. Consequently, on finite [a,b] the bounded density kernel converges to the uniform density 1/(b-a) as h\to\infty. One-sided infinite bounds use the corresponding exact half-mass term; two infinite bounds retain the ordinary unbounded kernel path.
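The scaled-mass formula can be illustrated for the Gaussian kernel in base R. All names here are hypothetical. Writing the centered primitive H_K through the chi-square CDF is a choice made for this sketch, using P(|Z|<r)=P(Z^2<r^2); it is not necessarily how the C code computes it.

```r
# Centered primitive H_K(r) = integral of dnorm from 0 to r, written through the
# chi-square CDF so that small r never requires computing pnorm(r) - 0.5.
H_gauss <- function(r) sign(r) * 0.5 * pchisq(r^2, df = 1)

# S_K(r) = H_K(r) / r, with the limiting value K(0) at r = 0.
S_gauss <- function(r) ifelse(r == 0, dnorm(0), H_gauss(r) / r)

# Scaled normalization mass h * D(x, h) on [a, b].
scaled_mass <- function(x, h, a, b) {
  (x - a) * S_gauss((x - a) / h) + (b - x) * S_gauss((b - x) / h)
}

# Direct form for comparison: h times a difference of whole-kernel CDF values.
naive_scaled_mass <- function(x, h, a, b) {
  h * (pnorm((b - x) / h) - pnorm((a - x) / h))
}

# Moderate bandwidth: the two forms agree.
print(c(centered = scaled_mass(0.3, 0.2, 0, 1),
        naive    = naive_scaled_mass(0.3, 0.2, 0, 1)))

# Very large bandwidth: h * D approaches (b - a) * K(0), so the bounded density
# kernel K(u) / (h * D) approaches the uniform density 1 / (b - a).
print(c(centered = scaled_mass(0.3, 1e8, 0, 1),
        naive    = naive_scaled_mass(0.3, 1e8, 0, 1),
        limit    = (1 - 0) * dnorm(0)), digits = 15)
```

At the large bandwidth the direct form subtracts two CDF values that are both close to 0.5, so it can be expected to lose several significant digits, while the centered form stays close to the limit.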

For distribution estimation, finite bounds use the distinct observation-centered truncated-kernel CDF

B_i(y;h)=\frac{G((y-Y_i)/h)-G((a-Y_i)/h)} {G((b-Y_i)/h)-G((a-Y_i)/h)},\quad y\in[a,b].

Here G denotes the CDF of the kernel K. Thus B_i(a;h)=0, B_i(b;h)=1, and B_i(y;h)\to(y-a)/(b-a) as h\to\infty. The implementation evaluates centered analytic interval masses rather than subtracting nearly equal whole-kernel CDF values. This distribution operator is not the integral of the evaluation-centered bounded density kernel above. For nonnegative second-order kernels each contribution is monotone in y and lies in [0,1]; higher-order kernels, which take negative values, may still be nonmonotone in the interior or overshoot [0,1].
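The endpoint and large-bandwidth properties of B_i can be checked with a short base-R sketch for the Gaussian kernel, taking G to be pnorm. The function name bounded_cdf is hypothetical. For readability the sketch subtracts CDF values directly, which is the form the implementation is described as avoiding.

```r
# Hypothetical helper (not an np function): observation-centered truncated
# Gaussian CDF contribution B_i(y; h) on [a, b].
bounded_cdf <- function(y, Yi, h, a, b) {
  (pnorm((y - Yi) / h) - pnorm((a - Yi) / h)) /
    (pnorm((b - Yi) / h) - pnorm((a - Yi) / h))
}

Yi <- 0.4  # one observation inside [0, 1]

# Endpoints: B_i(a) = 0 and B_i(b) = 1 for any bandwidth.
at_bounds <- bounded_cdf(c(0, 1), Yi, h = 0.1, a = 0, b = 1)
print(at_bounds)

# Large bandwidth: B_i(y) approaches the uniform CDF (y - a) / (b - a).
y_grid <- c(0.25, 0.5, 0.75)
large_h <- bounded_cdf(y_grid, Yi, h = 1e4, a = 0, b = 1)
print(rbind(y = y_grid, B = large_h))

# Monotonicity in y for this nonnegative second-order kernel.
fine_grid <- seq(0, 1, by = 0.01)
increments <- diff(bounded_cdf(fine_grid, Yi, h = 0.1, a = 0, b = 1))
print(all(increments >= 0))
```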

Typical bounded-kernel calls:

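  library(np)

  ## Simulated data (illustrative only): x on [0,1], y a noisy function of x
  set.seed(42)
  x <- runif(100)
  y <- sin(2*pi*x) + rnorm(100, sd=0.25)

  ## Unconditional density on [0,1]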
  bw <- npudensbw(dat=data.frame(x),
                  ckertype="gaussian",
                  ckerbound="fixed", ckerlb=0, ckerub=1)

  ## Regression with automatic sample-range bounds
  bw <- npregbw(xdat=data.frame(x), ydat=y, ckerbound="range")

  ## Conditional density with separate x/y support controls
  bw <- npcdensbw(xdat=data.frame(x), ydat=data.frame(y),
                  cxkerbound="fixed", cxkerlb=0, cxkerub=1,
                  cykerbound="range")

Unordered-categorical kernels (called via ukertype; for category count c):

L_{AA}(x_i,x;\lambda)=\mathbf{1}(x_i=x)(1-\lambda)+\mathbf{1}(x_i\neq x)\frac{\lambda}{c-1}

(Aitchison-Aitken) via ukertype="aitchisonaitken".

L_{LR,u}(x_i,x;\lambda)=\mathbf{1}(x_i=x)+\mathbf{1}(x_i\neq x)\lambda

(Li-Racine unordered kernel) via ukertype="liracine".
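A minimal base-R transcription of the two unordered kernels may help fix ideas. The function names and the argument ncat (the category count c) are hypothetical and are not np identifiers.

```r
# Aitchison-Aitken kernel: weight 1 - lambda on a match, and lambda spread
# evenly over the remaining ncat - 1 categories.
aa_kernel <- function(xi, x, lambda, ncat) {
  ifelse(xi == x, 1 - lambda, lambda / (ncat - 1))
}

# Li-Racine unordered kernel: weight 1 on a match, lambda otherwise.
lr_unordered <- function(xi, x, lambda) {
  ifelse(xi == x, 1, lambda)
}

categories <- c("a", "b", "c")
aa_weights <- aa_kernel(categories, "a", lambda = 0.3, ncat = 3)
lr_weights <- lr_unordered(categories, "a", lambda = 0.3)
print(rbind(aa = aa_weights, lr = lr_weights))

# The Aitchison-Aitken weights sum to one over the categories; the Li-Racine
# weights sum to 1 + (ncat - 1) * lambda.
print(c(aa_sum = sum(aa_weights), lr_sum = sum(lr_weights)))
```

With lambda=0 both kernels reduce to the indicator \mathbf{1}(x_i=x). The Aitchison-Aitken kernel gives equal weight 1/c to every category at lambda=(c-1)/c, and the Li-Racine kernel gives equal weight 1 at lambda=1.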

Ordered-categorical kernels (called via okertype):

L_{WvR}(x_i,x;\lambda)= \begin{cases} 1-\lambda, & x_i=x\\ \frac{1-\lambda}{2}\lambda^{|x_i-x|}, & x_i\neq x \end{cases}

(Wang-van Ryzin) via okertype="wangvanryzin".

L_{LR,o}(x_i,x;\lambda)=\lambda^{|x_i-x|}

(Li-Racine ordered kernel) via okertype="liracine".

L_{NLR,o}(x_i,x;\lambda)=\lambda^{|x_i-x|}\frac{1-\lambda}{1+\lambda}

(normalized Li-Racine ordered kernel; used internally)

L_{RLY}(x_i,x;\lambda)=\frac{\lambda^{|x_i-x|}}{\sum_{z\in\mathcal{S}(x)}\lambda^{|x_i-z|}}

(Racine-Li-Yan ordered kernel, normalized on support \mathcal{S}(x)), exposed as okertype="racineliyan".
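The ordered kernels above can be transcribed into base R as follows. This is a sketch with hypothetical function names, assuming integer-coded ordered values; support stands for \mathcal{S}(x).

```r
# Wang-van Ryzin kernel.
wvr_kernel <- function(xi, x, lambda) {
  ifelse(xi == x, 1 - lambda, 0.5 * (1 - lambda) * lambda^abs(xi - x))
}

# Li-Racine ordered kernel and its normalized variant.
lr_ordered  <- function(xi, x, lambda) lambda^abs(xi - x)
nlr_ordered <- function(xi, x, lambda) lambda^abs(xi - x) * (1 - lambda) / (1 + lambda)

# Racine-Li-Yan kernel for a single observation xi, normalized over a finite support.
rly_kernel <- function(xi, x, lambda, support) {
  lambda^abs(xi - x) / sum(lambda^abs(xi - support))
}

lambda <- 0.4

# Over a wide integer grid the Wang-van Ryzin and normalized Li-Racine weights
# sum to (approximately) one; the raw Li-Racine weights do not.
grid <- -200:200
grid_sums <- c(wvr = sum(wvr_kernel(grid, 0, lambda)),
               lr  = sum(lr_ordered(grid, 0, lambda)),
               nlr = sum(nlr_ordered(grid, 0, lambda)))
print(grid_sums)

# On a finite support the Racine-Li-Yan weights sum to exactly one, even for an
# observation at the edge of the support.
support <- 0:4
rly_edge <- rly_kernel(xi = 0, x = support, lambda = lambda, support = support)
print(rly_edge)
print(sum(rly_edge))
```

The last check shows the point of normalizing on the support: with xi at the edge, the normalized Li-Racine weights restricted to 0:4 would sum to less than one, whereas the Racine-Li-Yan weights do not lose mass.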

These univariate kernels are combined as generalized product kernels over mixed data types in the estimators and cross-validation criteria.
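As a rough illustration of a generalized product kernel, the sketch below multiplies one kernel of each type for a single observation. It is base R with hypothetical names. The choice of a second-order Gaussian, Aitchison-Aitken, and Li-Racine ordered kernel, and the 1/h scaling of the continuous factor, are assumptions made for the example.

```r
# Hypothetical helper (not an np function): product of one continuous, one
# unordered, and one ordered kernel for a single observation.
product_kernel <- function(xc_i, xc, h,                   # continuous value, evaluation point, bandwidth
                           xu_i, xu, lambda_u, ncat,      # unordered value, evaluation point, bandwidth, category count
                           xo_i, xo, lambda_o) {          # ordered value, evaluation point, bandwidth
  k_cont  <- dnorm((xc_i - xc) / h) / h                          # Gaussian, density scale
  k_unord <- ifelse(xu_i == xu, 1 - lambda_u, lambda_u / (ncat - 1))  # Aitchison-Aitken
  k_ord   <- lambda_o^abs(xo_i - xo)                             # Li-Racine ordered
  k_cont * k_unord * k_ord
}

# One observation evaluated at a point that matches in the continuous and
# unordered components and is two steps away in the ordered component.
w <- product_kernel(xc_i = 0, xc = 0, h = 1,
                    xu_i = "a", xu = "a", lambda_u = 0.2, ncat = 3,
                    xo_i = 1, xo = 3, lambda_o = 0.5)
print(w)  # equals dnorm(0) * 0.8 * 0.25
```

Each factor has its own bandwidth, so a categorical bandwidth at its upper bound flattens that factor to a constant and effectively removes the variable from the product.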

References

Aitchison, J. and Aitken, C. G. G. (1976), “Multivariate binary discrimination by the kernel method,” Biometrika, 63, 413–420.

Wang, M. C. and Van Ryzin, J. (1981), “A class of smooth estimators for discrete distributions,” Biometrika, 68, 301–309.

Li, Q. and Racine, J. S. (2007), Nonparametric Econometrics: Theory and Practice, Princeton University Press.

Racine, J. S. and Li, Q. (2004), “Nonparametric estimation of regression functions with both categorical and continuous data,” Journal of Econometrics, 119, 99–130.

Racine, J. S., Li, Q., and Yan, K. X. (2020), “Kernel Smoothed Probability Mass Functions for Ordered Datatypes,” Journal of Nonparametric Statistics, 32(3), 563–586. doi:10.1080/10485252.2020.1759595

Hall, P., Racine, J. S., and Li, Q. (2004), “Cross-validation and the estimation of conditional probability densities,” Journal of the American Statistical Association, 99, 1015–1026.
