local_poly: Local polynomial estimation

View source: R/local_poly.R

Description

Estimate a regression function (or its derivative) using local polynomial estimation, essentially estimating a local Taylor series using locally weighted least squares. The function requires the user to specify a bandwidth, h.
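
As a rough illustration of the idea (a base-R sketch under stated assumptions, not the package's source), the fit at a single point t0 can be written as weighted least squares on powers of (x - t0), with the fitted intercept serving as the estimate of m(t0). The helper names `biweight_k` and `local_poly_at` are hypothetical, and both the kernel form and the weighting convention K((x - t0)/h) are assumptions.

```r
# Hypothetical biweight kernel supported on [-1, 1] (assumed form)
biweight_k <- function(u) ifelse(abs(u) <= 1, (15 / 16) * (1 - u^2)^2, 0)

# Sketch: local polynomial fit of degree p at a single point t0
local_poly_at <- function(x, y, t0, h, p = 1, kernel = biweight_k) {
  w <- kernel((x - t0) / h)            # kernel weights for each observation
  keep <- w > 0                        # only points inside the window contribute
  if (sum(keep) <= p) return(NaN)      # fewer than p + 1 nearby points: undefined
  X <- outer(x[keep] - t0, 0:p, `^`)   # columns 1, (x - t0), ..., (x - t0)^p
  beta <- lm.wfit(X, y[keep], w[keep])$coefficients
  unname(beta[1])                      # intercept estimates m(t0)
}

set.seed(1)
x <- sort(runif(200))
y <- 2 * x + 1                         # noiseless linear data
est <- local_poly_at(x, y, t0 = 0.5, h = 0.2)
```

On noiseless linear data a local linear fit reproduces the line, so `est` here agrees with m(0.5) = 2 up to floating-point error. A smaller h uses fewer points and tracks the data more closely; a larger h smooths more.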

Usage

local_poly(
  data,
  h,
  t = NULL,
  kernel = biweight,
  degree = 1,
  deriv = 0,
  empty_nhood = NaN
)

Arguments

data

the data used to fit the estimator. Must be a data frame with columns x and y, where x contains the design points x_1,…,x_n and y contains the response values Y_1,…,Y_n

h

a scalar giving the user-specified bandwidth (N.B. the cross-validation bandwidth can be computed using find_hcv)

t

(optional) a vector of points at which the estimator is evaluated. If unspecified, a sequence of 200 points is created that spans the range of the x-values in the data.

kernel

a kernel function. The package supplies uniform, gauss, epanechnikov and biweight (the default). If the kernel has bounded support, scale it to [-1,1] so that any discontinuities are plotted correctly.

degree

the degree p of local polynomial to use. Defaults to p=1 for local linear estimation.

deriv

if set to a positive integer r, the function estimates the rth derivative of the regression function, m^{(r)}(x). Defaults to zero, so that m(x) is estimated.

empty_nhood

a scalar specifying a custom value to be returned at locations where the estimator is undefined (as occurs when there are no nearby data points to average). Default is NaN.
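
To make the roles of degree, deriv and empty_nhood concrete, here is a minimal base-R sketch, not the package's implementation. It relies on the standard result that the rth derivative estimate is r! times the fitted coefficient of (x - t0)^r, which requires degree >= deriv. The function `local_deriv_at`, its argument names, the biweight form and the rule for deciding when the fit is undefined are all assumptions made for illustration.

```r
# Sketch: r-th derivative estimate at t0 from a degree-p local fit.
# `local_deriv_at` and its arguments are hypothetical names.
local_deriv_at <- function(x, y, t0, h, p = 2, r = 1, empty = NaN) {
  stopifnot(r <= p)                      # cannot read off m^(r) from a lower-degree fit
  u <- (x - t0) / h
  w <- ifelse(abs(u) <= 1, (15 / 16) * (1 - u^2)^2, 0)  # biweight weights (assumed form)
  keep <- w > 0
  if (sum(keep) <= p) return(empty)      # too few nearby points: return the custom value
  X <- outer(x[keep] - t0, 0:p, `^`)     # columns 1, (x - t0), ..., (x - t0)^p
  beta <- lm.wfit(X, y[keep], w[keep])$coefficients
  unname(factorial(r) * beta[r + 1])     # r! times the coefficient of (x - t0)^r
}

x <- seq(0, 1, length.out = 201)
y <- x^2                                 # m'(x) = 2x, m''(x) = 2
d1 <- local_deriv_at(x, y, t0 = 0.3, h = 0.2)              # first derivative at 0.3
d2 <- local_deriv_at(x, y, t0 = 0.3, h = 0.2, r = 2)       # second derivative at 0.3
gap <- local_deriv_at(x, y, t0 = 5, h = 0.2, empty = NA)   # no data near t = 5
```

Because the data are an exact quadratic and the local fit is quadratic, `d1` and `d2` match 0.6 and 2 up to floating-point error, while `gap` takes the custom value since no observations fall within h of t = 5.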

Examples

  #  simulate and plot some data
  m <- function(x) (x^2+1)*sin(2*pi*x*((1-x) + 4*x))
  x <- sort(runif(100))
  y <- m(x) + rnorm(length(x), sd=0.1)
  simdata <- data.frame(x=x,y=y)
  plot(simdata)

  # calculate the estimator at x=0.1, with bandwidth 0.02
  local_poly(simdata,h=0.02,t=0.1)

  # a specialised print method has been provided to make life easier
  # however, we can still access the underlying numbers e.g.

  fit <- local_poly(simdata,h=0.02,t=0.1)
  fit$mhat
  print(fit) # the same output as before

  # plot the estimator with bandwidth 0.02 using default biweight kernel
  plot(local_poly(simdata,h=0.02))

  # add a line for the estimator with bandwidth 0.4
  lines(local_poly(simdata,h=0.4), col=2)

  # add a line for the estimator using Gaussian kernel
  lines(local_poly(simdata,h=0.02,kernel=gauss), col=4)

timwaite/nprtw documentation built on Jan. 25, 2021, 1:50 a.m.