nw: Nadaraya-Watson estimator


View source: R/nw.R

Description

Estimate a regression function with the Nadaraya-Watson estimator at a user-specified bandwidth h.
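
For reference, the textbook form of the Nadaraya-Watson estimator is given below. The symbol K for the kernel and the scaling of its argument by h follow the usual convention; that the package uses exactly this scaling is an assumption, not something stated on this page.

```latex
\hat{m}_h(t) \;=\; \frac{\sum_{i=1}^{n} K\!\left(\frac{t - x_i}{h}\right) Y_i}
                        {\sum_{i=1}^{n} K\!\left(\frac{t - x_i}{h}\right)}
```

The estimate at t is a weighted average of the responses Y_i, with weights that decay as x_i moves away from t. When the denominator is zero the ratio is undefined.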

Usage

nw(data, h, t = NULL, kernel = biweight, empty_nhood = NaN)

Arguments

data

the data used to fit the estimator. Must be a data frame with columns x and y, where x contains the design points x_1,…,x_n and y contains the response values Y_1,…,Y_n

h

a scalar giving the user-specified bandwidth (N.B. the cross-validation bandwidth can be computed using find_hcv)

t

(optional) a vector of points at which the estimator is evaluated. If unspecified, a sequence of 200 points is created that spans the range of the x-values in the data.

kernel

a kernel function. The package supplies uniform, gauss, epanechnikov and biweight (the default). If the kernel has bounded support, scale it to [-1,1] so that any discontinuities are plotted correctly.

empty_nhood

a scalar specifying a custom value to be returned at locations where the estimator is undefined (as occurs when there are no nearby data points to average). Default is NaN.
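
The following base-R sketch illustrates one way an undefined value can arise with a bounded-support kernel, and how a replacement value in the spirit of empty_nhood could be substituted. It does not use the package: biweight_sketch is a hypothetical stand-in for the package's biweight kernel, and the scaling (t - x)/h is an assumption.

```r
# Hypothetical biweight kernel on [-1, 1]; not taken from the package source.
biweight_sketch <- function(u) ifelse(abs(u) <= 1, (15 / 16) * (1 - u^2)^2, 0)

x  <- c(0.1, 0.2, 0.8)   # design points
y  <- c(1, 2, 3)         # responses
h  <- 0.05               # bandwidth, small relative to the gaps in x
t0 <- 0.5                # evaluation point with no data within distance h

w   <- biweight_sketch((t0 - x) / h)   # every weight is zero here
raw <- sum(w * y) / sum(w)             # 0 / 0, which R evaluates to NaN

# Substitute a custom value where the neighbourhood is empty
# (NA_real_ is chosen only for illustration).
empty_nhood <- NA_real_
est <- if (sum(w) == 0) empty_nhood else raw
```

With an unbounded kernel such as the Gaussian, the weights are never exactly zero in theory, so this situation is mainly a concern for kernels with bounded support.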

Value

An object of class npfit, which is a list with 5 items:

t

the vector of evaluation points

h

the bandwidth used

mhat

evaluations of the estimator \hat{m}(t_j) at each evaluation point t_j in t

data

the data used to fit the estimator

A

the smoother matrix, such that \hat{m}=AY.

Specialised print, plot, and lines methods are available for these objects, to facilitate analysis. See examples below.
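
To make the relation \hat{m}=AY concrete, here is a minimal base-R sketch of how a smoother matrix of this kind can be built by hand. It is not the package's implementation: biweight_sketch and nw_smoother_sketch are hypothetical names, and the kernel scaling (t - x)/h is an assumption.

```r
# Hypothetical biweight kernel on [-1, 1]; not taken from the package source.
biweight_sketch <- function(u) ifelse(abs(u) <= 1, (15 / 16) * (1 - u^2)^2, 0)

# Smoother matrix with entries
#   A[j, i] = K((t_j - x_i) / h) / sum_k K((t_j - x_k) / h)
nw_smoother_sketch <- function(x, t, h, kernel = biweight_sketch) {
  W <- kernel(outer(t, x, "-") / h)  # length(t) x length(x) kernel weights
  W / rowSums(W)                     # normalise rows; all-zero rows become NaN
}

set.seed(1)
x <- sort(runif(50))
y <- sin(2 * pi * x) + rnorm(50, sd = 0.1)
t <- seq(0.1, 0.9, length.out = 5)

A    <- nw_smoother_sketch(x, t, h = 0.2)
mhat <- drop(A %*% y)                # fitted values as a linear map of y
```

Each row of A holds the weights applied to the responses at one evaluation point, so every row with at least one nonzero weight sums to 1.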

Examples

  # simulate and plot some data
  m <- function(x) (x^2+1)*sin(2*pi*x*((1-x) + 4*x))
  x <- sort(runif(100))
  y <- m(x) + rnorm(length(x), sd=0.1)
  simdata <- data.frame(x=x,y=y)
  plot(simdata)

  # calculate the estimator at x=0.1, with bandwidth 0.02
  nw(simdata,h=0.02,t=0.1)

  # a specialised print method has been provided to make life easier
  # however, we can still access the underlying numbers e.g.

  fit <- nw(simdata,h=0.02,t=0.1)
  fit$mhat
  print(fit) # the same output as before

  # plot the estimator with bandwidth 0.02 using default biweight kernel
  plot(nw(simdata,h=0.02))

  # add a line for the estimator with bandwidth 0.4
  lines(nw(simdata,h=0.4), col=2)

  # add a line for the estimator using Gaussian kernel
  lines(nw(simdata,h=0.02,kernel=gauss), col=4)

  # NB the first plot is equivalent to the following:
  fit <- nw(simdata,h=0.02)
  plot(fit$data)
  lines(fit$t,fit$mhat)

  # get smoother matrix
  fit$A

timwaite/nprtw documentation built on Jan. 25, 2021, 1:50 a.m.