local_average: Local average estimator


View source: R/nw.R

Description

Estimates a regression function m(x) by local averaging: the estimate at a point x is the average of the response values Y_i whose design points x_i lie within distance h of x. The quantity h is known as the bandwidth.
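As an illustration of the idea, the following base R sketch computes a local average at a single point. The helper name local_average_at is hypothetical and not part of the package, and treating the neighbourhood boundary as inclusive (distance <= h) is an assumption.

```r
# Hypothetical helper (not part of the package): local average at one point x0.
local_average_at <- function(x, y, x0, h, empty_nhood = NaN) {
  in_nhood <- abs(x - x0) <= h            # design points within distance h of x0
  if (!any(in_nhood)) return(empty_nhood) # nothing to average
  mean(y[in_nhood])                       # average the matching responses
}

x <- c(0.1, 0.2, 0.3, 0.9)
y <- c(1, 2, 3, 10)

# Averages the responses at x = 0.1, 0.2 and 0.3
est_near <- local_average_at(x, y, x0 = 0.2, h = 0.15)

# No design point lies within 0.1 of 0.6, so empty_nhood is returned
est_far <- local_average_at(x, y, x0 = 0.6, h = 0.1)
```

A small bandwidth averages only a few nearby responses and gives a wiggly estimate; a large bandwidth averages many and gives a smoother one.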

Usage

local_average(data, h, t = NULL, empty_nhood = NaN)

Arguments

data

the data used to fit the estimator. Must be a data frame with columns x and y, where x contains the design points x_1,…,x_n and y contains the response values Y_1,…,Y_n

h

a scalar giving the user-specified bandwidth (N.B. the cross-validation bandwidth can be computed using find_hcv)

t

(optional) a vector of points at which the estimator is evaluated. If unspecified, a sequence of 200 points is created that spans the range of the x-values in the data.

empty_nhood

a scalar specifying a custom value to be returned at locations where the estimator is undefined (as occurs when there are no nearby data points to average). Default is NaN.

Details

The function calls nw using the uniform kernel.
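To see why a uniform kernel yields a local average, here is a minimal sketch of a kernel-weighted (Nadaraya-Watson type) estimate at one point. The names uniform_kernel and nw_sketch are hypothetical; the package's own nw function may differ in signature, kernel scaling, and handling of empty neighbourhoods.

```r
# Hypothetical sketch; not the package's nw implementation.
uniform_kernel <- function(u) 0.5 * (abs(u) <= 1)  # constant weight on [-1, 1]

nw_sketch <- function(x, y, x0, h, kernel = uniform_kernel) {
  w <- kernel((x - x0) / h)  # kernel weight of each design point
  sum(w * y) / sum(w)        # weighted average; 0/0 gives NaN if all weights are zero
}

x <- c(0.1, 0.2, 0.3, 0.9)
y <- c(1, 2, 3, 10)

nw_value   <- nw_sketch(x, y, x0 = 0.2, h = 0.15)
plain_mean <- mean(y[abs(x - 0.2) <= 0.15])  # local average over the same neighbourhood
nw_empty   <- nw_sketch(x, y, x0 = 0.6, h = 0.1)
```

Because the uniform kernel gives every point in the neighbourhood the same weight, the constant cancels in the ratio and the weighted average reduces to the plain mean.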

Value

An object of class npfit, which is a list with 5 items:

t

the vector of evaluation points

h

the bandwidth used

mhat

evaluations of the estimator at the points in t, i.e. \hat{m}(t_1),…, \hat{m}(t_k), where k is the length of t

data

the data used to fit the estimator

A

the smoother matrix, such that \hat{m}=AY.

Specialised print, plot, and lines methods are available for these objects, to facilitate analysis. See examples below.
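The relationship \hat{m}=AY for the smoother matrix A can be illustrated with a small base R sketch. This builds such a matrix by hand for a local average; the layout assumed here (one row per evaluation point, one column per design point) and the variable names are assumptions, not taken from the package source.

```r
x      <- c(0.1, 0.2, 0.3, 0.9)  # design points
y      <- c(1, 2, 3, 10)         # responses
t_eval <- c(0.2, 0.9)            # evaluation points
h      <- 0.15                   # bandwidth

# Indicator matrix: entry [j, i] is TRUE when x[i] is within h of t_eval[j]
in_nhood <- abs(outer(t_eval, x, "-")) <= h

# Divide each row by its neighbourhood size, so each row holds equal weights
# that sum to 1 (a row with no neighbours would be 0/0 = NaN)
A <- in_nhood / rowSums(in_nhood)

# The fitted values are a linear function of the responses
mhat <- drop(A %*% y)
```

Each row of A lists the weights given to the responses when estimating at one evaluation point, which is why the estimator is called a linear smoother.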

Examples

  # simulate and plot some data
  m <- function(x) (x^2+1)*sin(2*pi*x*((1-x) + 4*x))
  x <- sort(runif(100))
  y <- m(x) + rnorm(length(x), sd=0.1)
  simdata <- data.frame(x=x,y=y)
  plot(simdata)

  # calculate the estimator at x=0.1, with bandwidth 0.02
  local_average(simdata,h=0.02,t=0.1)

  # a specialised print method formats the result for display;
  # the underlying numbers can still be accessed directly, e.g.
  fit <- local_average(simdata,h=0.02,t=0.1)
  fit$mhat
  print(fit) # the same output as before

  # plot the estimator with bandwidth 0.02
  plot(local_average(simdata,h=0.02))

  # add a line for the estimator with bandwidth 0.4
  lines(local_average(simdata,h=0.4), col=2)

  # NB the first plot is equivalent to the following:
  fit <- local_average(simdata,h=0.02)
  plot(fit$data)
  lines(fit$t,fit$mhat)

  # get smoother matrix
  fit$A

timwaite/nprtw documentation built on Jan. 25, 2021, 1:50 a.m.