Description Usage Arguments Details Value Note Author(s) References See Also Examples
The (S3) generic function dkde
computes the r'th
derivative of kernel density estimator for one-dimensional
data. Its default method does so with the given kernel
and bandwidth h for one-dimensional observations.
1 2 3 4 5 |
x |
the data from which the estimate is to be computed. |
y |
the points of the grid at which the density derivative is to be estimated; the defaults are τ * h outside of range(x), where τ = 4. |
deriv.order |
derivative order (scalar). |
h |
the smoothing bandwidth to be used, can also be a character
string giving a rule to choose the bandwidth, see |
kernel |
a character string giving the smoothing kernel to be used, with default
|
... |
further arguments for (non-default) methods. |
A simple estimator for the density derivative can be obtained by taking the derivative of the kernel density estimate. If the kernel K(x) is differentiable r times then the r'th density derivative estimate can be written as:
hat(f)(x;r) = n^-1 h^-(r+1) sum(K(x-X(i)/h),i=1...n)
where,
K(x;r) = d^r /d x^r K(x)
for r = 0, 1, 2, …
The following assumptions on the density f(x;r), the bandwidth h, and the kernel K(x):
The (r+2) derivative f(x;r+2) is continuous, square integrable and ultimately monotone.
lim_(n -- > Inf) h = 0 and lim_(n --> Inf) nh^(2r+1) = Inf i.e., as the number of samples n is increased h approaches zero at a rate slower than 1/n^{2r+1}.
K(x) >= 0 and int K(x) dx = 1. The kernel function is assumed to be symmetric about the origin i.e., int x k(x;r) dx = 0 for even r and has finite second moment i.e., mu(K(x)) = int x^2 K(x) dx < Inf.
Some theoretical properties of the estimator hat(f)(x;r) have been investigated, among others, by Bhattacharya (1967), Schuster (1969). Let us now turn to the statistical properties of estimator. We are interested in the mean squared error since it combines squared bias and variance.
The bias can be written as:
E[hat(f)(x;r)] - f(x;r) = 0.5 h^2 mu(K(x)) f(x;r+2) + o(h^2)
The variance of the estimator can be written as:
VAR(hat(f)(x;r)) = f(x) R(K(x;r)) / n h^(2r+1) + o(1/nh^(2r+1))
with, R(K(x;r)) = int K(x;r)^2 dx.
The MSE (Mean Squared Error) for kernel density derivative estimators can be written as:
MSE(hat(f)(x;r),f(x;r)) = f(x) R(K(x;r)) / nh^(2r+1) + 1/4 h^4 mu(K(x))^2 f(x;r+1)^2 + o(h^4 + 1/ nh^(2r+1))
It follows that the MSE-optimal bandwidth for estimating hat(f)(x;r), is of order n^(-1/2r+5). Therefore,
the estimation of hat(f)(x;1) requires a bandwidth of order n^-1/7 compared to the optimal n^-1/5
for estimating f(x) itself. It reveals the increasing difficulty in problems of estimating higher derivatives.
The MISE (Mean Integrated Squared Error) can be written as:
MISE(hat(f)(x;r),f(x;r))=AMISE(hat(f)(x;r),f(x;r)) + o(h^4 + 1/nh^(2r+1))
where,
AMISE(hat(f)(x;r),f(x;r)) = R(K(x;r))/n h^(2r+1) + 1/4 h^2 mu(K(x))^2 R(f(x;r+2))
with: R(f(x;r)) = int f(x;r)^2 dx.
The performance of kernel is measured by MISE or AMISE (Asymptotic MISE).
If the bandwidth h
is missing from dkde
, then the default bandwidth is
h.ucv(x,deriv.order,kernel)
(Unbiased cross-validation, see h.ucv
).
For more details see references.
x |
data points - same as input. |
data.name |
the deparsed name of the |
n |
the sample size after elimination of missing values. |
kernel |
name of kernel to use. |
deriv.order |
the derivative order to use. |
h |
the bandwidth value to use. |
eval.points |
the coordinates of the points where the density derivative is estimated. |
est.fx |
the estimated density derivative values. |
This function are available in other packages such as KernSmooth, sm,
np, GenKern and locfit if deriv.order=0
, and in ks package
for Gaussian kernel only if 0 <= deriv.order <= 10
.
Arsalane Chouaib Guidoum acguidoum@usthb.dz
Alekseev, V. G. (1972). Estimation of a probability density function and its derivatives. Mathematical notes of the Academy of Sciences of the USSR. 12 (5), 808–811.
Alexandre, B. T. (2009). Introduction to Nonparametric Estimation. Springer-Verlag, New York.
Bowman, A. W. and Azzalini, A. (1997). Applied Smoothing Techniques for Data Analysis: the Kernel Approach with S-Plus Illustrations. Oxford University Press, Oxford.
Bhattacharya, P. K. (1967). Estimation of a probability density function and Its derivatives. Sankhya: The Indian Journal of Statistics, Series A, 29, 373–382.
Jeffrey, S. S. (1996). Smoothing Methods in Statistics. Springer-Verlag, New York.
Radhey, S. S. (1987). MISE of kernel estimates of a density and its derivatives. Statistics and Probability Letters, 5, 153–159.
Scott, D. W. (1992). Multivariate Density Estimation. Theory, Practice and Visualization. New York: Wiley.
Schuster, E. F. (1969) Estimation of a probability density function and its derivatives. The Annals of Mathematical Statistics, 40 (4), 1187–1195.
Silverman, B. W. (1986). Density Estimation for Statistics and Data Analysis. Chapman & Hall/CRC. London.
Stoker, T. M. (1993). Smoothing bias in density derivative estimation. Journal of the American Statistical Association, 88, 855–863.
Venables, W. N. and Ripley, B. D. (2002). Modern Applied Statistics with S. New York: Springer.
Wand, M. P. and Jones, M. C. (1995). Kernel Smoothing. Chapman and Hall, London.
Wolfgang, H. (1991). Smoothing Techniques, With Implementation in S. Springer-Verlag, New York.
plot.dkde
, see density
in package "stats" if deriv.order = 0
, and kdde
in package ks.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 | ## EXAMPLE 1: Simple example of a Gaussian density derivative
x <- rnorm(100)
dkde(x,deriv.order=0) ## KDE of f
dkde(x,deriv.order=1) ## KDDE of d/dx f
dkde(x,deriv.order=2) ## KDDE of d^2/x^2 f
dkde(x,deriv.order=3) ## KDDE of d^3/x^3 f
dev.new()
par(mfrow=c(2,2))
plot(dkde(x,deriv.order=0))
plot(dkde(x,deriv.order=1))
plot(dkde(x,deriv.order=2))
plot(dkde(x,deriv.order=3))
## EXAMPLE 2: Bimodal Gaussian density derivative
## show the kernels in the dkde parametrization
fx <- function(x) 0.5 * dnorm(x,-1.5,0.5) + 0.5 * dnorm(x,1.5,0.5)
fx1 <- function(x) 0.5 *(-4*x-6)* dnorm(x,-1.5,0.5) + 0.5 *(-4*x+6) *
dnorm(x,1.5,0.5)
## 'h = 0.3' ; 'Derivative order = 0'
kernels <- eval(formals(dkde.default)$kernel)
dev.new()
plot(dkde(bimodal,h=0.3),sub=paste("Derivative order = 0",";",
"Bandwidth =0.3 "),ylim=c(0,0.5), main = "Bimodal Gaussian Density")
for(i in 2:length(kernels))
lines(dkde(bimodal, h = 0.3, kernel = kernels[i]), col = i)
curve(fx,add=TRUE,lty=8)
legend("topright", legend = c(TRUE,kernels), col = c("black",seq(kernels)),
lty = c(8,rep(1,length(kernels))),cex=0.7, inset = .015)
## 'h = 0.6' ; 'Derivative order = 1'
kernels <- eval(formals(dkde.default)$kernel)[-3]
dev.new()
plot(dkde(bimodal,deriv.order=1,h=0.6),main = "Bimodal Gaussian Density Derivative",sub=paste
("Derivative order = 1",";","Bandwidth =0.6"),ylim=c(-0.6,0.6))
for(i in 2:length(kernels))
lines(dkde(bimodal,deriv.order=1, h = 0.6, kernel = kernels[i]), col = i)
curve(fx1,add=TRUE,lty=8)
legend("topright", legend = c(TRUE,kernels), col = c("black",seq(kernels)),
lty = c(8,rep(1,length(kernels))),cex=0.7, inset = .015)
|
Data: x (100 obs.); Kernel: gaussian
Derivative order: 0; Bandwidth 'h' = 0.7463
eval.points est.fx
Min. :-5.608003 Min. :2.160e-06
1st Qu.:-2.807706 1st Qu.:1.215e-03
Median :-0.007408 Median :4.169e-02
Mean :-0.007408 Mean :8.910e-02
3rd Qu.: 2.792889 3rd Qu.:1.832e-01
Max. : 5.593186 Max. :2.604e-01
Data: x (100 obs.); Kernel: gaussian
Derivative order: 1; Bandwidth 'h' = 0.9993
eval.points est.fx
Min. :-6.619841 Min. :-9.282e-02
1st Qu.:-3.313625 1st Qu.:-2.670e-02
Median :-0.007408 Median : 9.230e-06
Mean :-0.007408 Mean : 4.000e-08
3rd Qu.: 3.298808 3rd Qu.: 2.655e-02
Max. : 6.605024 Max. : 9.304e-02
Data: x (100 obs.); Kernel: gaussian
Derivative order: 2; Bandwidth 'h' = 1.241
eval.points est.fx
Min. :-7.585840 Min. :-6.376e-02
1st Qu.:-3.796624 1st Qu.:-1.201e-03
Median :-0.007408 Median : 2.578e-03
Mean :-0.007408 Mean :-1.060e-06
3rd Qu.: 3.781807 3rd Qu.: 2.052e-02
Max. : 7.571023 Max. : 3.530e-02
Data: x (100 obs.); Kernel: gaussian
Derivative order: 3; Bandwidth 'h' = 1.468
eval.points est.fx
Min. :-8.495418 Min. :-3.636e-02
1st Qu.:-4.251413 1st Qu.:-8.903e-03
Median :-0.007408 Median : 4.610e-06
Mean :-0.007408 Mean : 2.100e-07
3rd Qu.: 4.236596 3rd Qu.: 9.049e-03
Max. : 8.480601 Max. : 3.639e-02
dev.new(): using pdf(file="Rplots1.pdf")
dev.new(): using pdf(file="Rplots2.pdf")
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.