h.bcv: Biased Cross-Validation for Bandwidth Selection


Description

The (S3) generic function h.bcv computes the biased cross-validation bandwidth selector for the r'th derivative of a one-dimensional kernel density estimator.

Usage

h.bcv(x, ...)
## Default S3 method:
h.bcv(x, whichbcv = 1, deriv.order = 0, lower = 0.1 * hos, upper = 2 * hos, 
         tol = 0.1 * lower, kernel = c("gaussian","epanechnikov",
         "triweight","tricube","biweight","cosine"), ...)

Arguments

x

vector of data values.

whichbcv

method selected, 1 = BCV1 or 2 = BCV2, see details.

deriv.order

derivative order (scalar).

lower, upper

range over which to minimize. The default is almost always satisfactory. hos (the oversmoothing bandwidth) is calculated internally from the kernel; see Details.

tol

the convergence tolerance for optimize.

kernel

a character string giving the smoothing kernel to be used, with default "gaussian".

...

further arguments for (non-default) methods.

Details

h.bcv implements biased cross-validation for choosing the bandwidth h of the r'th derivative of a kernel density estimator. If whichbcv = 1, BCV1 is used (Scott and Terrell 1987); if whichbcv = 2, BCV2 is used (Jones and Kappenman 1991).

Scott and Terrell (1987) suggest a method whose immediate target is the AMISE (e.g. Silverman 1986, section 3.3). Following Hall and Marron (1987) and Jones and Kappenman (1991), define the estimators $\hat{\theta}(h;r)$ and $\bar{\theta}(h;r)$ by:

$$\hat{\theta}(h;r) = \frac{(-1)^r}{n(n-1)\,h^{2r+1}} \sum_{i=1}^{n} \sum_{j=1,\, j \neq i}^{n} \left(K^{(r)} * K^{(r)}\right)\!\left(\frac{x_j - x_i}{h}\right)$$

and

$$\bar{\theta}(h;r) = \frac{(-1)^r}{n(n-1)\,h^{2r+1}} \sum_{i=1}^{n} \sum_{j=1,\, j \neq i}^{n} K^{(2r)}\!\left(\frac{x_j - x_i}{h}\right)$$

Scott and Terrell (1987) proposed using $\hat{\theta}(h;r)$ to estimate the integrated squared density derivative $\int f^{(r)}(x)^2\,dx$, which enters the AMISE with $r+2$ in place of $r$. Thus $h^{(r)}_{BCV1}$, say, is the h that minimises:

$$\mathrm{BCV1}(h;r) = \frac{R(K^{(r)})}{n\,h^{2r+1}} + \frac{1}{4}\,\mu(K)^2\,h^4\,\hat{\theta}(h;r+2)$$

and we define $h^{(r)}_{BCV2}$ as the minimiser of (Jones and Kappenman 1991):

$$\mathrm{BCV2}(h;r) = \frac{R(K^{(r)})}{n\,h^{2r+1}} + \frac{1}{4}\,\mu(K)^2\,h^4\,\bar{\theta}(h;r+2)$$

where $K^{(r)}$ is the r'th derivative of the kernel function $K$, $K^{(r)} * K^{(r)}$ is the convolution of $K^{(r)}$ with itself (see kernel.conv and kernel.fun), $R(K^{(r)}) = \int K^{(r)}(x)^2\,dx$ and $\mu(K) = \int x^2 K(x)\,dx$.
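As a rough illustration of the two criteria above, the following base-R sketch evaluates BCV1 and BCV2 for the Gaussian kernel with deriv.order = 0 only. It is not kedd's implementation: the helper name bcv_gauss, the simulated data and the search interval are assumptions made for the example. For the Gaussian kernel, R(K) = 1/(2 sqrt(pi)) and mu(K) = 1; the convolution of K'' with itself is the fourth derivative of the N(0, 2) density, and K'''' is the fourth derivative of the N(0, 1) density.

```r
# Sketch only: BCV1/BCV2 for a Gaussian kernel and deriv.order = 0.
# bcv_gauss is a hypothetical helper, not part of kedd.
bcv_gauss <- function(h, x, whichbcv = 1) {
  n <- length(x)
  d <- outer(x, x, "-") / h          # all scaled differences (x_i - x_j) / h
  d <- d[row(d) != col(d)]           # drop the diagonal (j != i)
  kern <- if (whichbcv == 1) {
    # (K'' * K'')(d): fourth derivative of the N(0, 2) density
    (d^4 - 12 * d^2 + 12) / 16 * dnorm(d, sd = sqrt(2))
  } else {
    # K''''(d): fourth derivative of the N(0, 1) density
    (d^4 - 6 * d^2 + 3) * dnorm(d)
  }
  theta <- sum(kern) / (n * (n - 1) * h^5)   # theta(h; r + 2) with r = 0
  1 / (2 * sqrt(pi) * n * h) + 0.25 * h^4 * theta
}

set.seed(1)
x <- rnorm(100)
rng <- c(0.1, 2)                     # assumed search interval, not kedd's default
fit1 <- optimize(bcv_gauss, interval = rng, x = x, whichbcv = 1)
fit2 <- optimize(bcv_gauss, interval = rng, x = x, whichbcv = 2)
c(BCV1 = fit1$minimum, BCV2 = fit2$minimum)  # minimising bandwidths
```

The two criteria differ only in the kernel applied to the pairwise differences, so a single helper with a switch keeps the shared variance and bias terms in one place.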

The range over which to minimize is derived from the oversmoothing bandwidth hos: by default the search runs from 0.1 * hos to 2 * hos, which is almost always satisfactory. See Terrell and Scott (1985), Terrell (1990), Scott (1992, p. 165), Wand and Jones (1995, p. 61).
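To make the default search range concrete, the sketch below computes lower = 0.1 * hos, upper = 2 * hos and tol = 0.1 * lower from a stand-in for hos. It assumes the Gaussian kernel with deriv.order = 0 and the oversmoothing formula 3 * (R(K) / (35 * mu(K)^2))^(1/5) * s * n^(-1/5) from Wand and Jones (1995, p. 61); the value kedd computes internally, in particular for other kernels or derivative orders, may differ. The name hos_approx is hypothetical.

```r
# Sketch only: stand-in for the oversmoothing bandwidth 'hos'
# (Gaussian kernel, deriv.order = 0); kedd's internal value may differ.
set.seed(1)
x <- rnorm(100)
n <- length(x)
RK  <- 1 / (2 * sqrt(pi))   # R(K) = int K(x)^2 dx for the Gaussian kernel
mu2 <- 1                    # mu(K) = int x^2 K(x) dx for the Gaussian kernel
hos_approx <- 3 * (RK / (35 * mu2^2))^(1/5) * sd(x) * n^(-1/5)

lower <- 0.1 * hos_approx   # default lower bound of the search range
upper <- 2 * hos_approx     # default upper bound of the search range
tol   <- 0.1 * lower        # default tolerance passed to optimize
c(lower = lower, upper = upper, tol = tol)
```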

Value

x

data points - same as input.

data.name

the deparsed name of the x argument.

n

the sample size after elimination of missing values.

kernel

name of the kernel used.

deriv.order

the derivative order used.

whichbcv

method selected.

h

value of bandwidth parameter.

min.bcv

the minimal BCV value.
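A short sketch of how these components might be read from a fitted object. It assumes that the kedd package is installed and that the value returned by h.bcv is list-like, so the components named above can be extracted with `$`; if kedd is not available the example does nothing.

```r
# Sketch only: assumes the object returned by h.bcv is list-like, so the
# components documented above can be read with `$`.
set.seed(1)
x <- rnorm(100)
fit <- if (requireNamespace("kedd", quietly = TRUE)) {
  kedd::h.bcv(x, whichbcv = 1, deriv.order = 0)
} else {
  NULL   # kedd not installed: nothing to inspect
}
if (!is.null(fit)) {
  cat("bandwidth h:", fit$h, "\n")
  cat("minimal BCV:", fit$min.bcv, "\n")
  cat("kernel:", fit$kernel, " n:", fit$n, "\n")
}
```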

Author(s)

Arsalane Chouaib Guidoum acguidoum@usthb.dz

References

Duong, T. (2007). ks: Kernel density estimation and kernel discriminant analysis for multivariate data in R. Journal of Statistical Software, 21(7), 1–16.

Hall, P. and Marron, J. S. (1987). Estimation of integrated squared density derivatives. Statistics and Probability Letters, 6, 109–115.

Härdle, W. (1991). Smoothing Techniques, With Implementation in S. Springer-Verlag, New York.

Jones, M. C. and Kappenman, R. F. (1991). On a class of kernel density estimate bandwidth selectors. Scandinavian Journal of Statistics, 19, 337–349.

Jones, M. C., Marron, J. S. and Sheather, S. J. (1996). A brief survey of bandwidth selection for density estimation. Journal of the American Statistical Association, 91, 401–407.

Scott, D. W. and Terrell, G. R. (1987). Biased and unbiased cross-validation in density estimation. Journal of the American Statistical Association, 82, 1131–1146.

Sheather, S. J. (2004). Density estimation. Statistical Science, 19, 588–597.

Wand, M. P. and Jones, M. C. (1995). Kernel Smoothing. Chapman and Hall, London.

See Also

plot.h.bcv; bw.bcv in package stats and bcv in package MASS (Gaussian kernel only, deriv.order = 0); Hbcv in package ks for bivariate data (Gaussian kernel only, deriv.order = 0); kdeb in package locfit (deriv.order = 0).

Examples

## EXAMPLE 1:

library(kedd)  # provides h.bcv and the kurtotic data set

x <- rnorm(100)
h.bcv(x, whichbcv = 1, deriv.order = 0)
h.bcv(x, whichbcv = 2, deriv.order = 0)

## EXAMPLE 2:

## Derivative order = 0

h.bcv(kurtotic, deriv.order = 0)

## Derivative order = 1

h.bcv(kurtotic, deriv.order = 1)

Example output

Call:		Biased Cross-Validation 1

Derivative order = 0
Data: x (100 obs.);	Kernel: gaussian
Min BCV = 0.00623253;	Bandwidth 'h' = 0.5477454


Call:		Biased Cross-Validation 2

Derivative order = 0
Data: x (100 obs.);	Kernel: gaussian
Min BCV = 0.004457027;	Bandwidth 'h' = 0.4619819


Call:		Biased Cross-Validation 1

Derivative order = 0
Data: kurtotic (200 obs.);	Kernel: gaussian
Min BCV = 0.01403079;	Bandwidth 'h' = 0.6190886


Call:		Biased Cross-Validation 1

Derivative order = 1
Data: kurtotic (200 obs.);	Kernel: gaussian
Min BCV = 0.02761328;	Bandwidth 'h' = 0.9717073

kedd documentation built on May 2, 2019, 7:32 a.m.