np.cv: Cross-validation bandwidth selection in nonparametric...
In PLRModels: Statistical Inference in Partial Linear Regression Models

np.cv

R Documentation

Cross-validation bandwidth selection in nonparametric regression models

Description

From a sample {(Y_i, t_i): i=1,...,n}, this routine computes, for each l_n considered, an optimal bandwidth for estimating m in the regression model

Y_i= m(t_i) + \epsilon_i.

The regression function m is smooth but unknown, and the random errors {\epsilon_i} may form a time series (that is, they may be dependent). The optimal bandwidth is selected by leave-(2l_n + 1)-out cross-validation, with kernel smoothing used to estimate m.
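The procedure described above can be sketched in base R. This is a minimal illustration, not the package's implementation: the helper names `epan` and `cv_score`, the kernel scaling, the division by n, and the assumption that observations are ordered by `t` (so that "around each point" means neighbouring indices) are all choices made for this sketch.

```r
# Epanechnikov ("quadratic") kernel; the exact scaling is an assumption
epan <- function(u) 0.75 * (1 - u^2) * (abs(u) <= 1)

# Hypothetical helper: leave-(2*ln + 1)-out CV score for a single bandwidth h,
# using a Nadaraya-Watson estimate and an indicator weight on [w[1], w[2]]
cv_score <- function(y, t, h, ln = 0, w = quantile(t, c(0.1, 0.9))) {
  n <- length(y)
  inside <- which(t >= w[1] & t <= w[2])     # points with weight 1
  sq_err <- vapply(inside, function(i) {
    keep <- abs(seq_len(n) - i) > ln         # drop i and its ln neighbours on each side
    k <- epan((t[keep] - t[i]) / h)
    if (sum(k) == 0) return(NA_real_)        # no observations in the kernel window
    (y[i] - sum(k * y[keep]) / sum(k))^2
  }, numeric(1))
  sum(sq_err) / n                            # normalisation by n is an assumption
}

# Illustrative data in the spirit of the simulated example in this page
set.seed(1)
n <- 100
t <- ((1:n) - 0.5) / n
y <- 0.25 * t * (1 - t) + rnorm(n, 0, 0.01)

h_grid <- seq(0.05, 0.25, length.out = 20)
cv <- sapply(h_grid, function(h) cv_score(y, t, h, ln = 1))
h_grid[which.min(cv)]                        # bandwidth minimising the sketch CV score
```

Setting `ln = 0` removes only the i-th observation and so reduces the sketch to ordinary leave-one-out cross-validation.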

Usage

np.cv(data = data, h.seq = NULL, num.h = 50, w = NULL, num.ln = 1, 
ln.0 = 0, step.ln = 2, estimator = "NW", kernel = "quadratic")

Arguments

`data`	`data[, 1]` contains the values of the response variable, `Y`; `data[, 2]` contains the values of the explanatory variable, `t`.
`h.seq`	sequence of considered bandwidths in the CV function. If `NULL` (the default), `num.h` equidistant values between zero and a quarter of the range of `t_i` are considered.
`num.h`	number of values used to build the sequence of considered bandwidths. If `h.seq` is not `NULL`, `num.h=length(h.seq)`. Otherwise, the default is 50.
`w`	support interval of the weight function in the CV function. If `NULL` (the default), `(q_{0.1}, q_{0.9})` is considered, where `q_p` denotes the quantile of order `p` of `{t_i}`.
`num.ln`	number of values for `l_n`: `2l_{n} + 1` observations around each point `t_i` are eliminated to estimate `m(t_i)` in the CV function. The default is 1.
`ln.0`	minimum value for `l_n`. The default is 0.
`step.ln`	distance between two consecutive values of `l_n`. The default is 2.
`estimator`	the estimator to use: "NW" (Nadaraya-Watson) or "LLP" (Local Linear Polynomial). The default is "NW".
`kernel`	the kernel to use: "gaussian", "quadratic" (Epanechnikov kernel), "triweight" or "uniform". The default is "quadratic".
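As a rough illustration of how the defaults described above fit together, the following base R sketch builds a bandwidth grid, the default support of the weight function, and the sequence of `l_n` values. It does not call the package; the illustrative design `t`, the name `ln.seq`, and the exclusion of zero from the bandwidth grid are assumptions of this sketch.

```r
# Illustrative design: n equally spaced points on (0, 1)
n <- 100
t <- ((1:n) - 0.5) / n

# num.h equidistant bandwidths up to a quarter of the range of t.
# Assumption: zero itself is dropped, since a zero bandwidth is degenerate.
num.h <- 50
h.seq <- seq(0, diff(range(t)) / 4, length.out = num.h + 1)[-1]

# Default support of the weight function: quantiles of order 0.1 and 0.9 of t
w <- unname(quantile(t, c(0.1, 0.9)))

# Values of l_n implied by ln.0 (minimum), step.ln (spacing) and num.ln (count)
ln.0 <- 1; step.ln <- 2; num.ln <- 3
ln.seq <- seq(ln.0, by = step.ln, length.out = num.ln)
ln.seq   # 1 3 5
```

With the function's own defaults (`ln.0 = 0`, `num.ln = 1`) the same construction gives the single value `l_n = 0`.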

Details

A weight function (specifically, the indicator function 1_{[w[1] , w[2]]}) is introduced in the CV function to allow elimination (or at least significant reduction) of boundary effects from the estimate of m(t_i).

For more details, see Chu and Marron (1991).
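For orientation, the CV function with the indicator weight can be written along the following lines. This is a sketch in the notation of this page; the 1/n normalisation and the symbol \hat{m}_{h, l_n, -i} (the kernel estimate computed without the observations j such that |j - i| <= l_n) are assumptions made here, not taken from the package source.

```latex
CV_{l_n}(h) = \frac{1}{n} \sum_{i=1}^{n}
  \left( Y_i - \hat{m}_{h, l_n, -i}(t_i) \right)^2 \,
  1_{[w[1],\, w[2]]}(t_i)
```

Only the points t_i inside [w[1], w[2]] contribute to the sum, which is how the weight function keeps boundary effects out of the criterion.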

Value

`h.opt`	data frame containing, for each `ln` considered, the selected value for the bandwidth.
`CV.opt`	`CV.opt[k]` is the minimum value of the CV function when the k-th value of `ln` is considered.
`CV`	matrix containing the values of the CV function for each bandwidth and `ln` considered.
`w`	support interval of the weight function in the CV function.
`h.seq`	sequence of considered bandwidths in the CV function.
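The relationship between these components can be illustrated without calling the package. The sketch below uses a mock list with made-up values; it assumes, as the examples in this page suggest, that `CV` has one row per bandwidth in `h.seq` and one column per value of `ln`. The names `res`, `cv_min` and `h_min` are hypothetical.

```r
# Mock of the documented return structure (values are made up for illustration)
h.seq <- seq(0.01, 0.25, length.out = 50)
CV <- cbind((h.seq - 0.10)^2 + 1,    # column 1: CV curve for the first ln
            (h.seq - 0.15)^2 + 2)    # column 2: CV curve for the second ln
res <- list(h.seq = h.seq, CV = CV)

# Column-wise minimum of the CV function (what CV.opt is described as holding)
cv_min <- apply(res$CV, 2, min)

# Bandwidth at which each column attains its minimum
h_min <- res$h.seq[apply(res$CV, 2, which.min)]
```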

Author(s)

German Aneiros Perez ganeiros@udc.es

Ana Lopez Cheda ana.lopez.cheda@udc.es

References

Chu, C-K and Marron, J.S. (1991) Comparison of two bandwidth selectors with dependent errors. The Annals of Statistics 19, 1906-1918.

Examples

# EXAMPLE 1: REAL DATA
data <- matrix(10,120,2)
data(barnacles1)
barnacles1 <- as.matrix(barnacles1)
data[,1] <- barnacles1[,1]
data <- diff(data, 12)
data[,2] <- 1:nrow(data)

aux <- np.cv(data, ln.0=1,step.ln=1, num.ln=2)
aux$h.opt
plot.ts(aux$CV)

par(mfrow=c(2,1))
plot(aux$h.seq,aux$CV[,1], xlab="h", ylab="CV", type="l", main="ln=1")
plot(aux$h.seq,aux$CV[,2], xlab="h", ylab="CV", type="l", main="ln=2")



# EXAMPLE 2: SIMULATED DATA
## Example 2a: independent data

set.seed(1234)
# We generate the data
n <- 100
t <- ((1:n)-0.5)/n
m <- function(t) {0.25*t*(1-t)}
f <- m(t)

epsilon <- rnorm(n, 0, 0.01)
y <-  f + epsilon
data_ind <- matrix(c(y,t),nrow=n)

# We apply the function
a <- np.cv(data_ind)
a$CV.opt

CV <- a$CV
h <- a$h.seq
plot(h,CV,type="l")

PLRModels documentation built on Aug. 19, 2023, 5:10 p.m.
