kreg: Kernel regression


Description

Calculates a kernel regression estimate (univariate or multivariate).

Usage

kreg(x, y, bandwidth = NULL, grid = TRUE, kernel = "biweight",
     product = TRUE, sort = TRUE)

Arguments

x

n x d matrix, data

y

n x 1 vector, responses

bandwidth

scalar or 1 x d, bandwidth(s)

grid

logical or m x d matrix; the points at which the regression is calculated

kernel

text string, see kernel.function

product

logical (relevant if d>1); TRUE for a product kernel, FALSE for a spherical kernel

sort

logical, TRUE if data need to be sorted
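
A hedged sketch of how these arguments combine in a call (bivariate data, an explicit bandwidth vector, a user-supplied evaluation grid and a Gaussian product kernel; all values are arbitrary and only the arguments documented above are used):

  ## hedged sketch: bivariate data, explicit bandwidths, own evaluation grid
  library(gplm)
  n  <- 200
  x  <- cbind(runif(n), runif(n))
  y  <- sin(2*pi*x[,1]) + x[,2] + rnorm(n, sd = 0.3)
  xg <- as.matrix(expand.grid(seq(0, 1, by = 0.1), seq(0, 1, by = 0.1)))
  fit <- kreg(x, y, bandwidth = c(0.15, 0.2), grid = xg,
              kernel = "gaussian", product = TRUE)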

Details

The estimator is calculated by Nadaraya-Watson kernel regression. An extension to local linear (d>1) or local polynomial (d=1) estimation is planned. The default bandwidth is computed by Scott's rule of thumb for kernel density estimation (kde), adapted to the chosen kernel function.
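
For intuition, the Nadaraya-Watson estimate at a point x0 is a kernel-weighted average of the responses near x0. A minimal univariate sketch, using a Gaussian kernel and a Scott-type rule of thumb purely for illustration (kreg itself defaults to the biweight kernel and also handles d>1):

  ## illustrative Nadaraya-Watson estimator, not the internal code of kreg()
  nw <- function(x, y, xg = sort(x), h = sd(x) * length(x)^(-1/5)) {
    sapply(xg, function(x0) {
      w <- dnorm((x - x0) / h)   ## kernel weights centered at x0
      sum(w * y) / sum(w)        ## locally weighted mean of the responses
    })
  }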

Value

List with components:

x

m x d matrix, where regression has been calculated

y

m x 1 vector, regression estimates

bandwidth

bandwidth used for calculation

df.residual

approximate degrees of freedom (residuals)

rearrange

if sort=TRUE, an index vector that rearranges x and y back to their original order.
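
A hedged sketch of working with the returned components; it assumes that grid = FALSE evaluates the regression at the data points themselves and that rearrange restores the input order, as described above:

  ## hedged sketch: accessing the components of the returned list
  n <- 500; x <- rnorm(n); y <- sin(x) + rnorm(n)
  fit <- kreg(x, y, grid = FALSE)   ## assumed: evaluate at the data points
  fit$bandwidth                     ## bandwidth that was actually used
  fit$df.residual                   ## approximate residual degrees of freedom
  yhat <- fit$y[fit$rearrange]      ## fitted values back in the input order
  res  <- y - yhat                  ## residuals aligned with the original y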

Author(s)

Marlene Mueller

See Also

kernel.function, convol, kde

Examples

  n <- 1000
  x <- rnorm(n)
  m <- sin(x)
  y <- m + rnorm(n)
  plot(x,y,col="gray")
  o <- order(x); lines(x[o],m[o],col="green")
  lines(kreg(x,y),lwd=2)

  ## two-dimensional
  n <- 100
  x <- 6*cbind(runif(n), runif(n))-3
  m <- function(x1,x2){ 4*sin(x1) + x2 }
  y <- m(x[,1],x[,2]) + rnorm(n)
  mh <- kreg(x,y)##,bandwidth=1)

  grid1 <- unique(mh$x[,1])
  grid2 <- unique(mh$x[,2])
  est.m  <- t(matrix(mh$y,length(grid1),length(grid2)))
  orig.m <- outer(grid1,grid2,m)
  par(mfrow=c(1,2))
  persp(grid1,grid2,orig.m,main="Original Function",
        theta=30,phi=30,expand=0.5,col="lightblue",shade=0.5)
  persp(grid1,grid2,est.m,main="Estimated Function",
        theta=30,phi=30,expand=0.5,col="lightblue",shade=0.5)
  par(mfrow=c(1,1))
  
  ## now with normal x, note the boundary problem,
  ## which can be somewhat reduced by a gaussian kernel
  n <- 1000
  x <- cbind(rnorm(n), rnorm(n))
  m <- function(x1,x2){ 4*sin(x1) + x2 }
  y <- m(x[,1],x[,2]) + rnorm(n)
  mh <- kreg(x,y)##,kernel="gaussian")

  grid1 <- unique(mh$x[,1])
  grid2 <- unique(mh$x[,2])
  est.m  <- t(matrix(mh$y,length(grid1),length(grid2)))
  orig.m <- outer(grid1,grid2,m)
  par(mfrow=c(1,2))
  persp(grid1,grid2,orig.m,main="Original Function",
        theta=30,phi=30,expand=0.5,col="lightblue",shade=0.5)
  persp(grid1,grid2,est.m,main="Estimated Function",
        theta=30,phi=30,expand=0.5,col="lightblue",shade=0.5)
  par(mfrow=c(1,1))
