# hcv: Cross-validatory choice of smoothing parameter

In sm: Smoothing Methods for Nonparametric Regression and Density Estimation

## Description

This function uses the technique of cross-validation to select a smoothing parameter suitable for constructing a density estimate or nonparametric regression curve in one or two dimensions.

## Usage

```
hcv(x, y = NA, hstart = NA, hend = NA, ...)
```

## Arguments

| Argument | Description |
|---|---|
| `x` | a vector, or two-column matrix of data. If `y` is missing these are observations to be used in the construction of a density estimate. If `y` is present, these are the covariate values for a nonparametric regression. |
| `y` | a vector of response values for nonparametric regression. |
| `hstart` | the smallest value of the grid points to be used in an initial grid search for the value of the smoothing parameter. |
| `hend` | the largest value of the grid points to be used in an initial grid search for the value of the smoothing parameter. |
| `...` | other optional parameters are passed to the `sm.options` function, through a mechanism which limits their effect only to this call of the function. Those specifically relevant for this function are the following: `h.weights`, `ngrid`, `display`, `add`; see the documentation of `sm.options` for their description. |

## Details

See Sections 2.4 and 4.5 of the reference below.

The two-dimensional case uses a smoothing parameter derived from a single value, scaled by the standard deviation of each component.
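The scaling described above can be illustrated with a few lines of base R. This is only a sketch of the idea: the single value `h_single` and the simulated data are hypothetical, and the exact form used inside `sm` may differ.

```r
# Sketch: turn one smoothing value into a pair of smoothing parameters
# by scaling with the standard deviation of each column.
set.seed(1)
x <- cbind(rnorm(50, sd = 1), rnorm(50, sd = 5))  # two-column data matrix

h_single <- 0.4                       # hypothetical single value
h_vec <- h_single * apply(x, 2, sd)   # one smoothing parameter per component
h_vec
```

The component with the larger spread receives the proportionally larger smoothing parameter, so the amount of smoothing is comparable on both axes.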

This function does not employ a sophisticated algorithm and some adjustment of the search parameters may be required for different sets of data. An initial estimate of the value of h which minimises the cross-validatory criterion is located from a grid search using values which are equally spaced on a log scale between `hstart` and `hend`. A quadratic approximation is then used to refine this initial estimate.
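The two-stage search can be sketched in base R as follows. This is not the `sm` source code: it assumes a Gaussian kernel, the standard least-squares cross-validation criterion for density estimation, and a quadratic fitted on the log scale through the best grid point and its two neighbours. The function names `cv_density` and `grid_then_quadratic` are hypothetical.

```r
# Least-squares cross-validation criterion for a Gaussian kernel density
# estimate (textbook form, assumed here for illustration).
cv_density <- function(h, x) {
  n <- length(x)
  d <- outer(x, x, "-")                               # pairwise differences
  int_f2 <- sum(dnorm(d, sd = h * sqrt(2))) / n^2     # integral of fhat^2
  # average leave-one-out density at the data points (diagonal removed)
  loo <- (sum(dnorm(d, sd = h)) - n * dnorm(0, sd = h)) / (n * (n - 1))
  int_f2 - 2 * loo
}

grid_then_quadratic <- function(x, hstart, hend, ngrid = 32) {
  # Stage 1: grid equally spaced on the log scale between hstart and hend
  log_h <- seq(log(hstart), log(hend), length.out = ngrid)
  cv <- vapply(exp(log_h), cv_density, numeric(1), x = x)
  i <- which.min(cv)
  if (i == 1 || i == ngrid) {
    warning("minimum at the end of the grid; consider changing hstart/hend")
    return(exp(log_h[i]))
  }
  # Stage 2: quadratic through the minimising point and its two neighbours
  lh <- log_h[(i - 1):(i + 1)]
  b <- coef(lm(cv[(i - 1):(i + 1)] ~ lh + I(lh^2)))
  unname(exp(-b[2] / (2 * b[3])))                     # vertex of the parabola
}

set.seed(1)
x <- rnorm(50)
h <- grid_then_quadratic(x, hstart = 0.05, hend = 2)
h
```

Because the refinement only looks at three grid points, a coarse grid or a badly placed range can still give a poor answer, which is why `hstart`, `hend` and `ngrid` may need adjusting.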

## Value

the value of the smoothing parameter which minimises the cross-validation criterion over the selected grid.

## Side Effects

If the minimising value is located at the end of the grid of search positions, or if some values of the cross-validatory criterion cannot be evaluated, then a warning message is printed. In these circumstances altering the values of `hstart` and `hend` may improve performance.
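As an illustration, the search range can be set explicitly when the default grid triggers the warning. The values of `hstart` and `hend` below are hypothetical and should be chosen to suit the scale of the data; the sketch assumes the `sm` package is installed and that `display = "none"` is accepted via `sm.options`.

```r
# Sketch: re-run the search over an explicitly chosen range.
if (requireNamespace("sm", quietly = TRUE)) {
  library(sm)
  set.seed(1)
  x <- rnorm(50)
  # hypothetical range; widen or shift it if the warning persists
  h_wide <- hcv(x, hstart = 0.05, hend = 2, ngrid = 32, display = "none")
  h_wide
}
```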

## Note

As of version 2.1 of the package, a similar effect can be obtained with the new function `h.select`, via `h.select(x, method="cv")`. Users are encouraged to adopt this route, since `hcv` might not be directly accessible in future releases of the package. When the sample size is large, `hcv` uses the raw data while `h.select(x, method="cv")` uses binning. The latter is likely to produce a more stable choice for `h`.
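A minimal comparison of the two routes might look like this. It assumes the `sm` package is installed and that `hcv` is still exported in the installed version; the simulated data are for illustration only.

```r
# Sketch: the older hcv() call and the recommended h.select() route.
if (requireNamespace("sm", quietly = TRUE)) {
  library(sm)
  set.seed(1)
  x <- rnorm(50)
  h_old <- hcv(x, display = "none")        # direct call
  h_new <- h.select(x, method = "cv")      # recommended route
  c(hcv = h_old, h.select = h_new)
}
```

With a sample this small the two values should be close; differences are more likely with large samples, where `h.select` bins the data.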

## References

Bowman, A.W. and Azzalini, A. (1997). Applied Smoothing Techniques for Data Analysis: the Kernel Approach with S-Plus Illustrations. Oxford University Press, Oxford.

## See Also

`h.select`, `hsj`, `hnorm`

## Examples

```
# Density estimation
x <- rnorm(50)
par(mfrow=c(1,2))
h.cv <- hcv(x, display="lines", ngrid=32)
sm.density(x, h=hcv(x))
par(mfrow=c(1,1))

# Nonparametric regression
x <- seq(0, 1, length = 50)
y <- rnorm(50, sin(2 * pi * x), 0.2)
par(mfrow=c(1,2))
h.cv <- hcv(x, y, display="lines", ngrid=32)
sm.regression(x, y, h=hcv(x, y))
par(mfrow=c(1,1))
```

### Example output

```
Package 'sm', version 2.2-5.6: type help(sm) for summary information
Warning message:
no DISPLAY variable so Tk is not available
```

sm documentation built on May 1, 2019, 8:06 p.m.