# kare: Kernel Adaptive Regression Estimator In kader: Kernel Adaptive Density Estimation and Regression

## Description

Wrapper function that performs some preparatory calculations and then calls the actual “workhorse” functions which do the main computations for the kernel adaptive regression estimator of Eichner & Stute (2012). Finally, it structures and returns the obtained results. Summarizing information and technical details can be found in Eichner (2017).

## Usage

```r
kare(x.points, data, kernel = c("gaussian", "epanechnikov", "rectangular"),
     Sigma = seq(0.01, 10, length = 51), h = NULL, theta = NULL)
```

## Arguments

- `x.points`: Vector of location(s) at which the regression estimate is to be computed.
- `data`: Data frame or list with one component named `x`, which contains the vector of regressor values x_1, …, x_n, and one named `y`, which holds the vector of pertaining response values y_1, …, y_n (in the corresponding order) of the data from which the estimate is to be computed at the values given in `x.points`. Pairs (x_i, y_i) with NA or an infinite value in at least one of their elements are removed (and a warning is issued).
- `kernel`: A character string naming the kernel to be used for the adaptive estimator. This must partially match one of `"gaussian"`, `"rectangular"` or `"epanechnikov"`, with default `"gaussian"`, and may be abbreviated to a unique prefix. (Currently, this kernel is also used for the initial, non-adaptive Nadaraya-Watson regression estimator which enters into the estimators of bias and variance as described in the references.)
- `Sigma`: Vector of value(s) of the scale parameter σ. If of length 1, no adaptation is performed. Otherwise it is considered as the grid over which the optimization of the adaptive method will be performed. Defaults to `seq(0.01, 10, length = 51)`.
- `h`: Numeric scalar for the bandwidth h. Defaults to `NULL` and is then internally set to n^{-1/5}.
- `theta`: Numeric scalar for the value of the location parameter θ. Defaults to `NULL` and is then internally set to the arithmetic mean of x_1, …, x_n.
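As a quick illustration of the argument interface, a minimal call might look as follows (a sketch only, assuming the kader package is installed; the simulated data and all parameter values here are arbitrary):

```r
library(kader)

set.seed(1)
x <- runif(50, min = 0, max = 10)    # regressor values x_1, ..., x_50
y <- sin(x) + rnorm(50, sd = 0.3)    # pertaining response values y_1, ..., y_50

# Estimate m at a single point with the default Gaussian kernel; h and
# theta are left at NULL, so they default to n^(-1/5) and mean(x):
fit <- kare(x.points = 5, data = list(x = x, y = y),
            kernel = "gaussian", Sigma = seq(0.01, 10, length = 51))
fit$y           # estimated value of m(5)
fit$sigma.adap  # adaptive value of sigma found on the grid
```

Passing `data` as a list with components `x` and `y` (as here) or as a data frame with those column names is equivalent.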

## Value

If `length(x.points) == 1`, a list of eight components with the following names and meanings:

- `x`: Scalar x-value in `x.points` at which the regression estimator was computed.
- `y`: Estimated scalar value of m(x) at the point in `x.points`.
- `sigma.adap`: The found scalar minimizer of the MSE-estimator, i.e., the adaptive smoothing parameter value.
- `msehat.min`: The found scalar minimum of the MSE-estimator.
- `Sigma`: Vector with the σ-grid on which the minimization process was performed.
- `Bn`: Vector with the estimator of the bias on that σ-grid.
- `Vn2`: Ditto for the variance.
- `MSE`: Ditto for the MSE.

If `length(x.points) > 1`, a list with the same component names as above, but then:

- `x`: Vector `x.points` with the x-values at which the regression estimator was computed.
- `y`: Vector of estimated values of m(x) at the x-values in `x.points`.
- `sigma.adap`: Vector of the found minimizers of the MSE-estimator, i.e., the adaptive smoothing parameter values.
- `msehat.min`: Vector of the found minima of the MSE-estimator.
- `Sigma`: Vector with the σ-grid on which the minimization process was performed.
- `Bn`: (`length(Sigma)` by `length(x.points)`)-matrix with the estimated values of the bias on the σ-grid in its columns (which correspond to the x-values).
- `Vn2`: Ditto for the variance.
- `MSE`: Ditto for the MSE.
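To make the layout of the returned matrices concrete, here is a sketch for several evaluation points (again assuming kader is installed; the data are simulated and purely illustrative):

```r
library(kader)

set.seed(2)
d <- data.frame(x = rnorm(80), y = rnorm(80))
xs <- c(-1, 0, 1)                # three evaluation points
fit <- kare(x.points = xs, data = d)

# Bn, Vn2 and MSE each have one row per sigma-grid value and one
# column per evaluation point:
dim(fit$MSE)    # length(fit$Sigma) rows, length(xs) columns

# sigma.adap and msehat.min hold one minimizer/minimum per x-value:
length(fit$sigma.adap)
```

Column j of `MSE` is thus the estimated MSE curve over the σ-grid for the j-th element of `x.points`, and `sigma.adap[j]` is the minimizer found on that curve.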

## References

Eichner & Stute (2012) and Eichner (2017): see kader.

## Examples

```r
require(stats)

# Regression function:
m <- function(x, x1 = 0, x2 = 8, a = 0.01, b = 0) {
  a * (x - x1) * (x - x2)^3 + b
}
# Note: For a few details on m() see examples in ?nadwat.

x0 <- 5  # The point x_0 at which the MSE-optimal kernel adjusted
         # nonparametric estimation of m should take place. (Recall: for m's
         # default values a minimum is at 2, a point of inflection at 4, and
         # a saddle point at 8; an "arbitrary" point would, e.g., be at 5.)
n <- 100     # Sample size.
sdeps <- 1   # Std. dev. of the \epsilon_i: \sqrt(Var(Y|X=x))
             # (here: constant in x).
design.ctr <- x0 + 0.5  # "centre" and "scale" of the design, i.e.,
design.scl <- 1         # in the normal scenario below, expected value and
                        # std. dev. of the distribution of the x_i's.

set.seed(42)  # To guarantee reproducibility.
x <- rnorm(n, mean = design.ctr, sd = design.scl)  # x_1, ..., x_n
Y <- m(x) + rnorm(length(x), sd = sdeps)           # Y_1, ..., Y_n
data <- data.frame(x = x, y = Y)

# Computing the kernel adaptive regression estimator values
#**********************************************************
x.points <- seq(-3.3 * design.scl, 3.3 * design.scl, length = 101) +
  design.ctr  # x-grid on which to draw and estimate the regr. fct. m.
Sigma <- seq(0.01, 10, length = 51)  # \sigma-grid for minimization.
fit <- kare(x.points = x0, data = data, Sigma = Sigma)

## Not run:
# Graphical display for the current data set
#*******************************************
# Storing the current settings of the graphics device
# and changing its layout for the three plots to come:
op <- par(mfrow = c(3, 1), mar = c(3, 3, 2, 0.1) + 0.1,
          mgp = c(1.5, 0.5, 0), tcl = -0.3, cex.main = 2)

# The scatter plot of the "raw data":
plot(y ~ x, data = data, xlim = range(data$x, x.points),
     ylim = range(data$y, fit$y, na.rm = TRUE),
     main = bquote(n == .(n)), xlab = "x", ylab = "y")

# The "true" regression function m:
lines(x.points, m(x.points), lty = 2)

# The MSE-optimal kernel adjusted nonparametric regression estimator
# at x_0, i.e., the point (x_0, \hat m_n(x_0)):
points(fit$x, fit$y, col = "red", pch = 4, cex = 2)

# The legend for the "true" regression function m and for the point
# (x_0, \hat m_n(x_0)):
legend("topleft", lty = c(2, NA), pch = c(NA, 4),
       col = c("black", "red"), bty = "n", cex = 1.2,
       legend = c(as.expression(bquote(paste("m with ",
                    sigma(paste(Y, "|", X == x)) == .(sdeps)))),
                  as.expression(bquote(paste(hat(m)[n](x[0]),
                    " at ", x[0] == .(x0))))))

# Visualizing the estimators of (Bias_n(sigma))^2 and
# Var_n(sigma) at x0 on the sigma-grid:
with(fit, matplot(Sigma, cbind(Bn * Bn, Vn2), type = "l", lty = 1:2,
                  col = c("black", "red"), xlab = expression(sigma),
                  ylab = ""))

# The legend for (Bias_n(sigma))^2 and Var_n(sigma):
legend("topleft", lty = 1:2, col = c("black", "red"), bty = "n",
       legend = c(expression(paste(widehat(plain(Bias))[n]^2, (sigma))),
                  expression(widehat(plain(Var))[n](sigma))),
       cex = 1.2)

# Visualizing the estimator of MSE_n(sigma) at x0 on the sigma-grid
# together with the point indicating the detected minimum, and a legend:
plot(fit$Sigma, fit$MSE, type = "l", xlab = expression(sigma), ylab = "")
points(fit$sigma.adap, fit$msehat.min, pch = 4, col = "red", cex = 2)
legend("topleft", lty = c(1, NA), pch = c(NA, 4),
       col = c("black", "red"), bty = "n", cex = 1.2,
       legend = c(expression(widehat(plain(MSE))[n](sigma)),
                  substitute(group("(", list(plain(Minimizer), plain(Minimum)), ")")
                               == group("(", list(x, y), ")"),
                             list(x = signif(fit$sigma.adap, 4),
                                  y = signif(fit$msehat.min, 4)))))

par(op)  # Restoring the previous settings of the graphics device.
## End(Not run)
```

kader documentation built on May 1, 2019, 10:13 p.m.