hcv: Cross-validatory choice of smoothing parameter


Description

This function uses the technique of cross-validation to select a smoothing parameter suitable for constructing a density estimate or nonparametric regression curve in one or two dimensions.

Usage

hcv(x, y = NA, hstart = NA, hend = NA, ...) 

Arguments

x

a vector, or two-column matrix of data. If y is missing these are observations to be used in the construction of a density estimate. If y is present, these are the covariate values for a nonparametric regression.

y

a vector of response values for nonparametric regression.

hstart

the smallest value of the grid points to be used in an initial grid search for the value of the smoothing parameter.

hend

the largest value of the grid points to be used in an initial grid search for the value of the smoothing parameter.

...

other optional parameters are passed to the sm.options function, through a mechanism which limits their effect only to this call of the function. Those specifically relevant for this function are the following: h.weights, ngrid, display, add; see the documentation of sm.options for their description.

Details

See Sections 2.4 and 4.5 of the reference below.

The two-dimensional case uses a smoothing parameter derived from a single value, scaled by the standard deviation of each component.
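As a rough illustration of this scaling, the sketch below multiplies a single base value by the standard deviation of each column. The base value `h0` and the simulated data are hypothetical, and the exact scaling used inside the package may differ in detail.

```r
# Hypothetical two-column data with very different spreads
set.seed(1)
x <- cbind(rnorm(100, sd = 1), rnorm(100, sd = 5))

h0 <- 0.4                      # hypothetical single smoothing value
h  <- h0 * apply(x, 2, sd)     # one smoothing parameter per component

print(h)
```

The component with the larger spread receives a proportionally larger smoothing parameter, so only the single value `h0` has to be chosen.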

This function does not employ a sophisticated algorithm and some adjustment of the search parameters may be required for different sets of data. An initial estimate of the value of h which minimises the cross-validatory criterion is located from a grid search using values which are equally spaced on a log scale between hstart and hend. A quadratic approximation is then used to refine this initial estimate.
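The search strategy described above can be sketched in base R for the density case. This is a minimal illustration and not the package's own code: it assumes a normal kernel and the least-squares cross-validation criterion, the helper `cv_density` and the values of `hstart`, `hend` and `ngrid` are hypothetical, and fitting the quadratic on the log scale is an assumption.

```r
# Least-squares cross-validation criterion for a normal-kernel density
# estimate (illustrative helper, not part of the sm package).
cv_density <- function(h, x) {
  n <- length(x)
  d <- outer(x, x, "-")                              # pairwise differences
  int_f2 <- sum(dnorm(d, sd = sqrt(2) * h)) / n^2    # integral of fhat^2
  k <- dnorm(d, sd = h)
  diag(k) <- 0                                       # leave each point out
  loo <- rowSums(k) / (n - 1)                        # fhat_{-i}(x_i)
  int_f2 - 2 * mean(loo)
}

set.seed(1)
x <- rnorm(50)

# Grid equally spaced on a log scale between hstart and hend
hstart <- 0.05
hend   <- 2
ngrid  <- 16
h_grid <- exp(seq(log(hstart), log(hend), length.out = ngrid))

cv <- vapply(h_grid, cv_density, numeric(1), x = x)
i  <- which.min(cv)

if (i == 1 || i == ngrid) {
  # Minimum at the edge of the grid: no refinement is possible
  warning("minimum at end of grid; consider changing hstart or hend")
  h_cv <- h_grid[i]
} else {
  # Quadratic through the three grid points around the minimum (in log h)
  idx <- (i - 1):(i + 1)
  lh  <- log(h_grid[idx])
  b   <- coef(lm(cv[idx] ~ lh + I(lh^2)))
  h_cv <- unname(exp(-b[2] / (2 * b[3])))            # vertex of the parabola
}

print(h_cv)
```

Because the quadratic interpolates three points whose middle value is the smallest, its vertex always lies between the two outer grid points, so the refinement cannot move outside the neighbourhood found by the grid search.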

Value

the value of the smoothing parameter which minimises the cross-validation criterion over the selected grid.

Side Effects

If the minimising value is located at the end of the grid of search positions, or if some values of the cross-validatory criterion cannot be evaluated, then a warning message is printed. In these circumstances altering the values of hstart and hend may improve performance.
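For instance, if the warning appears, the grid limits can be set explicitly. The sketch below assumes the sm package is installed; the particular values of hstart and hend are arbitrary choices for simulated standard normal data, not recommendations.

```r
if (requireNamespace("sm", quietly = TRUE)) {
  library(sm)
  set.seed(1)
  x <- rnorm(50)
  # Search over a wider, explicitly chosen range of smoothing parameters
  h1 <- hcv(x, hstart = 0.05, hend = 2, ngrid = 32)
  print(h1)
}
```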

Note

As of version 2.1 of the package, a similar effect can be obtained with the new function h.select, via h.select(x, method="cv"). Users are encouraged to adopt this route, since hcv might not be directly accessible in future releases of the package. When the sample size is large, hcv uses the raw data while h.select(x, method="cv") uses binning. The latter is likely to produce a more stable choice for h.
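The two routes can be compared side by side. This sketch assumes the sm package is installed and uses simulated data; for a sample this small the two values would be expected to be similar, though that is not guaranteed.

```r
if (requireNamespace("sm", quietly = TRUE)) {
  library(sm)
  set.seed(1)
  x <- rnorm(50)
  h_old <- hcv(x)                        # direct call
  h_new <- h.select(x, method = "cv")    # recommended route
  print(c(hcv = h_old, h.select = h_new))
}
```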

References

Bowman, A.W. and Azzalini, A. (1997). Applied Smoothing Techniques for Data Analysis: the Kernel Approach with S-Plus Illustrations. Oxford University Press, Oxford.

See Also

h.select, hsj, hnorm

Examples

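library(sm)

#  Density estimation

x <- rnorm(50)
par(mfrow=c(1,2))
# Left panel: cross-validation criterion over a grid of 32 values of h
h.cv <- hcv(x, display="lines", ngrid=32)
# Right panel: density estimate using a cross-validated h
sm.density(x, h=hcv(x))
par(mfrow=c(1,1))

#  Nonparametric regression

x <- seq(0, 1, length = 50)
y <- rnorm(50, sin(2 * pi * x), 0.2)
par(mfrow=c(1,2))
# Left panel: cross-validation criterion over a grid of 32 values of h
h.cv <- hcv(x, y, display="lines", ngrid=32)
# Right panel: regression curve using a cross-validated h
sm.regression(x, y, h=hcv(x, y))
par(mfrow=c(1,1))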

Example output

Package 'sm', version 2.2-5.6: type help(sm) for summary information
Warning message:
no DISPLAY variable so Tk is not available 

sm documentation built on May 1, 2019, 8:06 p.m.
