SBFitting: Iterative Smooth Backfitting Algorithm

View source: R/SBFitting.R


Iterative Smooth Backfitting Algorithm

Description

Smooth backfitting procedure for nonparametric additive models

Usage

SBFitting(Y, x, X, h = NULL, K = "epan", supp = NULL)

Arguments

Y

An n-dimensional vector whose elements consist of scalar responses.

x

An N by d matrix whose j-th column gives the N estimation points at which the j-th component function is evaluated.

X

An n by d matrix whose row vectors consist of multivariate predictors.

h

A d-dimensional vector of bandwidths for the kernel smoothing used to estimate each component function.

K

A function object representing the kernel to be used in the smooth backfitting (default is 'epan', the Epanechnikov kernel).

supp

A d by 2 matrix whose row vectors consist of the lower and upper limits of the estimation intervals for each component function (default is the d-dimensional unit cube [0,1]^d).
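
For instance, a minimal sketch of shaping these inputs for d = 2 (the object names n, d, and N below are illustrative choices, not part of the API):

n <- 100; d <- 2; N <- 51
X <- matrix(runif(n*d), nrow=n, ncol=d)               # n by d predictor matrix
Y <- rnorm(n)                                         # n-dimensional response vector
x <- matrix(rep(seq(0, 1, length.out=N), d), N, d)    # N by d estimation points
h <- rep(0.1, d)                                      # d bandwidths
supp <- matrix(rep(c(0, 1), each=d), nrow=d, ncol=2)  # d by 2 matrix of support limits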

Details

SBFitting fits component functions of additive models for a scalar response and a multivariate predictor based on the smooth backfitting algorithm proposed by Mammen et al. (1999); see also Mammen and Park (2006), Yu et al. (2008), Lee et al. (2010, 2012) and others. SBFitting implements only the locally constant smooth backfitting estimator for the multivariate predictor case. Note that in the special case of a univariate predictor the fit coincides with that of a local constant kernel regression estimator (Nadaraya-Watson estimator). The local polynomial approach can be extended similarly (currently omitted). The support of the multivariate predictor is assumed to be a product of closed intervals. Users can designate an estimation support for the additive component functions when modeling is restricted to subintervals of the domain (see Han et al., 2016). If the observed predictor matrix X is also supplied as the estimation points x, SBFitting returns the estimated conditional mean responses at the observed predictors.
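
In each iteration, the locally constant update replaces the j-th component estimate by the Nadaraya-Watson marginal regression estimator minus the overall mean and minus the projections of the other components, f_j(x_j) = NW_j(x_j) - mY - sum_{k != j} integral of f_k(x_k) p_jk(x_j, x_k) / p_j(x_j) dx_k. Below is a minimal sketch of one such pass, written against the quantities SBFitting returns (NW, mgnDens, jntDens, mY); the helper name sbfUpdateOnce, the trapezoidal integration weights, and the assumed index order of jntDens (first index running over the grid of component j) are illustrative assumptions, not part of the package API.

sbfUpdateOnce <- function(fCurr, NW, mgnDens, jntDens, x, mY) {
  N <- nrow(x); d <- ncol(x)
  fNew <- fCurr
  for (j in seq_len(d)) {
    adj <- rep(0, N)
    for (k in seq_len(d)[-j]) {
      dx <- diff(x[, k])             # grid spacings for component k
      w  <- c(dx/2, 0) + c(0, dx/2)  # trapezoidal integration weights
      # integral of f_k(x_k) p_jk(x_j, x_k) dx_k, divided pointwise by p_j(x_j)
      adj <- adj + as.vector(jntDens[, , j, k] %*% (w * fNew[, k])) / mgnDens[, j]
    }
    fNew[, j] <- NW[, j] - mY - adj  # locally constant SBF update
  }
  fNew
}

Iterating this map from, say, zero initial functions and monitoring the maximum L2 change between successive updates corresponds to the stopping information that SBFitting reports as itemNum and itemErr.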

Value

A list containing the following fields:

SBFit

An N by d matrix whose column vectors consist of the smooth backfitting component function estimators at the given estimation points.

mY

A scalar giving the centered part (overall mean) of the regression model.

NW

An N by d matrix whose column vectors consist of the Nadaraya-Watson marginal regression function estimators for each predictor component at the given estimation points.

mgnDens

An N by d matrix whose column vectors consist of the marginal kernel density estimators for each predictor component at the given estimation points.

jntDens

An N by N by d by d array of the 2-dimensional joint kernel density estimators for all pairs of predictor components at the given estimation grid. For example, the [, , j, k] slice gives the 2-dimensional joint kernel density estimator for the (j,k)-th pair of predictor components, evaluated on the corresponding N by N estimation grid.

itemNum

The number of iterations after which the smooth backfitting algorithm stopped.

itemErr

The final iteration error of the smooth backfitting algorithm, given by the maximum L2 distance between component function estimates in the last two successive updates.
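
As a usage sketch, the returned fields can be inspected as follows (Y, x, X, and h refer to objects as constructed in the Examples below):

fit <- SBFitting(Y, x, X, h)
fit$itemNum                            # iterations until the algorithm stopped
fit$itemErr                            # final maximum L2 change between updates
yHat <- fit$mY + rowSums(fit$SBFit)    # fitted values at the estimation points
jnt12 <- fit$jntDens[, , 1, 2]         # joint density of the (1,2) pair on the N by N grid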

References

Mammen, E., Linton, O. and Nielsen, J. (1999), "The existence and asymptotic properties of a backfitting projection algorithm under weak conditions", Annals of Statistics, Vol.27, No.5, p.1443-1490.

Mammen, E. and Park, B. U. (2006), "A simple smooth backfitting method for additive models", Annals of Statistics, Vol.34, No.5, p.2252-2271.

Yu, K., Park, B. U. and Mammen, E. (2008), "Smooth backfitting in generalized additive models", Annals of Statistics, Vol.36, No.1, p.228-260.

Lee, Y. K., Mammen, E. and Park, B. U. (2010), "Backfitting and smooth backfitting for additive quantile models", Annals of Statistics, Vol.38, No.5, p.2857-2883.

Lee, Y. K., Mammen, E. and Park, B. U. (2012), "Flexible generalized varying coefficient regression models", Annals of Statistics, Vol.40, No.3, p.1906-1933.

Han, K., Müller, H.-G. and Park, B. U. (2016), "Smooth backfitting for additive modeling with small errors-in-variables, with an application to additive functional regression for multiple predictor functions", Bernoulli (accepted).

Examples

set.seed(100)

# correlated predictors mapped to [0,1]^2 via the normal CDF
n <- 100
d <- 2
X <- pnorm(matrix(rnorm(n*d),nrow=n,ncol=d)%*%matrix(c(1,0.6,0.6,1),nrow=2,ncol=2))

# true additive component functions
f1 <- function(t) 2*(t-0.5)
f2 <- function(t) sin(2*pi*t)

# additive response with Gaussian noise
Y <- f1(X[,1])+f2(X[,2])+rnorm(n,0,0.1)

# component function estimation on an equally spaced grid of N points
N <- 101
x <- matrix(rep(seq(0,1,length.out=N),d),nrow=N,ncol=d)
h <- c(0.12,0.08)
  
sbfEst <- SBFitting(Y,x,X,h)
fFit <- sbfEst$SBFit

op <- par(mfrow=c(1,2))
plot(x[,1],f1(x[,1]),type='l',lwd=2,col=2,lty=4,xlab='X1',ylab='Y')
points(x[,1],fFit[,1],type='l',lwd=2,col=1)
points(X[,1],Y,cex=0.3,col=8)
legend('topleft',legend=c('SBF','true'),col=c(1,2),lwd=2,lty=c(1,4),horiz=FALSE,bty='n')
abline(h=0,col=8)

plot(x[,2],f2(x[,2]),type='l',lwd=2,col=2,lty=4,xlab='X2',ylab='Y')
points(x[,2],fFit[,2],type='l',lwd=2,col=1)
points(X[,2],Y,cex=0.3,col=8)
legend('topright',legend=c('SBF','true'),col=c(1,2),lwd=2,lty=c(1,4),horiz=FALSE,bty='n')
abline(h=0,col=8)
par(op)

# prediction at the observed predictors: use X itself as the estimation points
x <- X
h <- c(0.12,0.08)

sbfPred <- SBFitting(Y,x,X,h)
fPred <- sbfPred$mY + rowSums(sbfPred$SBFit)

op <- par(mfrow=c(1,1))
plot(fPred,Y,cex=0.5,xlab='SBFitted values',ylab='Y')
abline(coef=c(0,1),col=8)
par(op)
