cv.score: Leave-One-Curve-out Cross-Validation Score



Description

Compute the cross-validation score of Rice and Silverman (1991) for the local polynomial estimation of a mean function.

Usage

cv.score(bandwidth, x, y, degree = 1, gridsize = length(x))

Arguments

bandwidth

kernel bandwidth.

x

observation points. Missing values are not accepted.

y

matrix or data frame with functional observations (= curves) stored in rows. The number of columns of y must match the length of x. Missing values are not accepted.

degree

degree of the local polynomial fit.

gridsize

size of evaluation grid for the smoothed data.

Details

The cross-validation score is obtained by leaving each curve out in turn and computing the prediction error of the local polynomial smoother based on all other curves. For a bandwidth value h, this score is

CV(h) = ∑_{i=1}^{n} ∑_{j=1}^{p} (Y[ij] - μ^{-i}(x[j];h))^2 / (n*p)

where Y[ij] is the measurement of the i-th curve at location x[j] for i=1,…,n and j=1,…,p, and μ^{-i}(x[j];h) is the local polynomial estimator with bandwidth h based on all curves except the i-th.
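The definition above can be sketched in base R as follows. This is a minimal illustration, not the package's implementation: the helper names locpoly_matrix and cv_score_sketch are hypothetical, and the Gaussian kernel is an assumption. Because every curve is observed at the same x, the local polynomial fit to the pooled remaining curves coincides with the smoother applied to their pointwise average, which is what the loop uses.

```r
# Hypothetical helper (not part of SCBmeanfd): smoother matrix of a local
# polynomial fit at the design points x, using a Gaussian kernel (an assumption).
locpoly_matrix <- function(x, h, degree = 1) {
  p <- length(x)
  S <- matrix(0, p, p)
  for (j in seq_len(p)) {
    d <- x - x[j]                  # design centered at the target point
    w <- dnorm(d / h)              # kernel weights
    X <- outer(d, 0:degree, `^`)   # columns 1, d, d^2, ...
    XtW <- t(X * w)                # X'W without forming the diagonal matrix W
    # The intercept row of (X'WX)^{-1} X'W gives the weights of the fit at x[j]
    S[j, ] <- solve(XtW %*% X, XtW)[1, ]
  }
  S
}

# Hypothetical sketch of CV(h): leave each curve out in turn, smooth the
# average of the remaining curves, and accumulate the squared prediction error.
cv_score_sketch <- function(h, x, y, degree = 1) {
  y <- as.matrix(y)
  n <- nrow(y)
  S <- locpoly_matrix(x, h, degree)
  total <- colSums(y)
  sse <- 0
  for (i in seq_len(n)) {
    mean_minus_i <- (total - y[i, ]) / (n - 1)   # average without curve i
    mu_minus_i <- drop(S %*% mean_minus_i)       # leave-one-curve-out estimate
    sse <- sse + sum((y[i, ] - mu_minus_i)^2)
  }
  sse / (n * length(x))
}

# Usage on simulated curves: score a few bandwidths and keep the minimizer
set.seed(1)
x <- seq(0, 1, length.out = 50)
y <- matrix(x + rnorm(10 * 50, sd = .25), 10, 50, byrow = TRUE)
h_grid <- c(.02, .05, .1, .2)
scores <- sapply(h_grid, cv_score_sketch, x = x, y = y)
h_grid[which.min(scores)]
```

Building the smoother matrix once per bandwidth keeps the leave-one-out loop cheap, since each iteration is then a single matrix-vector product.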

If the x values are not equally spaced, the data are first smoothed and evaluated on a grid of length gridsize spanning the range of x. The smoothed data are then interpolated back to x.
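As a rough illustration of the grid step, the sketch below smooths one averaged curve on an equally spaced grid and linearly interpolates the result back to the original x with approx(). The function name smooth_on_grid, the Gaussian kernel, and the use of linear interpolation are assumptions made for this example; the package may proceed differently internally.

```r
# Hypothetical sketch (not part of SCBmeanfd): local polynomial fit evaluated
# on an equally spaced grid, then interpolated back to the observation points.
smooth_on_grid <- function(x, ybar, h, degree = 1, gridsize = length(x)) {
  grid <- seq(min(x), max(x), length.out = gridsize)
  fit <- sapply(grid, function(g) {
    d <- x - g                      # design centered at the grid point
    w <- dnorm(d / h)               # Gaussian kernel weights (an assumption)
    X <- outer(d, 0:degree, `^`)    # columns 1, d, d^2, ...
    XtW <- t(X * w)
    # Intercept of the weighted least squares fit = estimate at g
    solve(XtW %*% X, XtW %*% ybar)[1]
  })
  # Linear interpolation from the grid back to x; rule = 2 guards the endpoints
  approx(grid, fit, xout = x, rule = 2)$y
}

# Usage with unequally spaced observation points
x <- seq(0, 1, length.out = 30)^2
ybar <- x + .2 * sin(2 * pi * x)
mu_hat <- smooth_on_grid(x, ybar, h = .2, gridsize = 50)
```

A larger gridsize reduces the interpolation error at the cost of more local fits.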

Value

the cross-validation score.

References

Rice, J. A. and Silverman, B. W. (1991). Estimating the mean and covariance structure nonparametrically when the data are curves. Journal of the Royal Statistical Society. Series B (Methodological), 53, 233–243.

See Also

cv.select

Examples

## Artificial example
library(SCBmeanfd)
x <- seq(0, 1, length.out = 100)
mu <- x + .2 * sin(2 * pi * x)
## 20 noisy curves stored in rows
y <- matrix(mu + rnorm(2000, sd = .25), 20, 100, byrow = TRUE)
h <- c(.005, .01, .02, .05, .1, .15)
cv <- numeric(length(h))
for (i in seq_along(h)) cv[i] <- cv.score(h[i], x, y, 1)
plot(h, cv, type = "l")

## Plasma citrate data
## Compare cross-validation scores and bandwidths  
## for local linear and local quadratic smoothing
## Not run: 
data(plasma)
time <- 8:21
h1 <- seq(.5, 1.3, .05)
h2 <- seq(.75, 2, .05)
cv1 <- sapply(h1, cv.score, x = time, y = plasma, degree = 1)
cv2 <- sapply(h2, cv.score, x = time, y = plasma, degree = 2)
plot(h1, cv1, type = "l", xlim = range(c(h1,h2)), ylim = range(c(cv1, cv2)), 
  xlab = "Bandwidth (hour)", ylab = "CV score", 
  main = "Cross validation for local polynomial estimation")
lines(h2, cv2, col = 2)
legend("topleft", legend = c("Linear", "Quadratic"), lty = 1, 
  col = 1:2, cex = .9)

## Note: using local linear (resp. quadratic) smoothing 
## with a bandwidth smaller than .5 (resp. .75) can result 
## in non-definiteness or numerical instability of the estimator. 

## End(Not run)


SCBmeanfd documentation built on May 2, 2019, 4:19 a.m.