boot: Bootstrap a Fit Smooth

boot    R Documentation

Bootstrap a Fit Smooth

Description

Bootstraps a fitted nonparametric regression model to form confidence intervals (BCa or percentile) and standard error estimates.

Usage

## S3 method for class 'ss'
boot(object, statistic, ..., R = 9999, level = 0.95, bca = TRUE, 
     method = c("cases", "resid", "param"), fix.lambda = TRUE, cov.mat = FALSE, 
     boot.dist = FALSE, verbose = TRUE, parallel = FALSE, cl = NULL)

## S3 method for class 'sm'
boot(object, statistic, ..., R = 9999, level = 0.95, bca = TRUE, 
     method = c("cases", "resid", "param"), fix.lambda = TRUE, 
     fix.thetas = TRUE, cov.mat = FALSE, boot.dist = FALSE, 
     verbose = TRUE, parallel = FALSE, cl = NULL)
     
## S3 method for class 'gsm'
boot(object, statistic, ..., R = 9999, level = 0.95, bca = TRUE, 
     method = c("cases", "resid", "param"), fix.lambda = TRUE, 
     fix.thetas = TRUE, cov.mat = FALSE, boot.dist = FALSE, 
     verbose = TRUE, parallel = FALSE, cl = NULL)     

Arguments

object

a fit from ss (smoothing spline), sm (smooth model), or gsm (generalized smooth model)

statistic

a function to compute the statistic (see Details)

...

additional arguments to statistic function (optional)

R

number of bootstrap resamples used to form bootstrap distribution

level

confidence level for bootstrap confidence intervals

bca

logical indicating whether to calculate BCa (default) or percentile intervals

method

resampling method used to form bootstrap distribution

fix.lambda

logical indicating whether the smoothing parameter should be fixed (default) or re-estimated for each bootstrap sample

fix.thetas

logical indicating whether the "extra" smoothing parameters should be fixed (default) or re-estimated for each bootstrap sample. Only applicable to sm and gsm objects with multiple penalized terms.

cov.mat

logical indicating whether the bootstrap estimate of the covariance matrix should be returned

boot.dist

logical indicating whether the bootstrap distribution should be returned

verbose

logical indicating whether the bootstrap progress bar should be printed

parallel

logical indicating if the parallel package should be used for parallel computing (of the bootstrap distribution). Defaults to FALSE, which implements sequential computing.

cl

cluster for parallel computing, which is used when parallel = TRUE. Note that if parallel = TRUE and cl = NULL, then the cluster is defined as makeCluster(detectCores()).
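
The three method options correspond to the usual bootstrap resampling schemes for regression (see Davison & Hinkley, 1997). The sketch below illustrates how one bootstrap sample could be drawn under each scheme. It uses base R only, with stats::smooth.spline as a stand-in smoother; the variable names and the residual-variance estimate are assumptions made for illustration, not npreg's internal code.

```r
# Illustrative sketch only: one bootstrap sample under each resampling scheme.
# smooth.spline() is a base-R stand-in for the fitted smoother (an assumption).
set.seed(1)
n <- 100
x <- seq(0, 1, length.out = n)
y <- 2 + 3 * x + sin(2 * pi * x) + rnorm(n, sd = 0.5)

fit <- smooth.spline(x, y, df = 6)
fitted.y <- predict(fit, x)$y
res <- y - fitted.y
sigma <- sqrt(sum(res^2) / (n - fit$df))   # rough residual standard deviation

# "cases": resample (x, y) pairs with replacement
idx <- sample.int(n, replace = TRUE)
cases <- data.frame(x = x[idx], y = y[idx])

# "resid": keep x fixed, add resampled (centered) residuals to the fitted values
resid.y <- fitted.y + sample(res - mean(res), n, replace = TRUE)

# "param": keep x fixed, simulate new errors from a normal distribution
param.y <- fitted.y + rnorm(n, sd = sigma)
```

In each scheme the model would then be refit to the new sample and the statistic recomputed; repeating this R times gives the bootstrap distribution.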

Details

The statistic function must satisfy the following two requirements:

(1) the first input must be the object of class ss, sm, or gsm

(2) the output must be a scalar or vector calculated from the object

In most applications, the statistic function will be the model predictions at some user-specified newdata, which can be passed to statistic using the ... argument.

If statistic is not provided, then the function is internally defined to be the model predictions at an equidistant sequence (for ss objects) or the training data predictor scores (for sm and gsm objects).
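
As an illustration of the two requirements, the sketch below defines a scalar statistic: the difference between the model predictions at two predictor values. The function name difffun and its arguments x0 and x1 are hypothetical and are passed through the ... argument; R = 199 is chosen only to keep the run short.

```r
library(npreg)

# simulate data and fit a smooth model (same setup as the Examples section)
set.seed(1)
n <- 100
x <- seq(0, 1, length.out = n)
y <- 2 + 3 * x + sin(2 * pi * x) + rnorm(n, sd = 0.5)
smfit <- sm(y ~ x, knots = 10)

# hypothetical statistic: first input is the fitted object,
# output is a scalar computed from the object
difffun <- function(object, x0, x1) {
  pred <- predict(object, newdata = data.frame(x = c(x0, x1)))
  unname(pred[2] - pred[1])
}

# x0 and x1 are forwarded to difffun via the ... argument
set.seed(0)
boot.diff <- boot(smfit, difffun, x0 = 0.25, x1 = 0.75,
                  R = 199, verbose = FALSE)

boot.diff$t0   # observed difference
boot.diff$se   # bootstrap standard error
boot.diff$ci   # bootstrap confidence interval
```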

Value

Produces an object of class 'boot.ss', 'boot.sm', or 'boot.gsm', with the following elements:

t0

Observed statistic, computed using statistic(object, ...)

se

Bootstrap estimate of the standard error

bias

Bootstrap estimate of the bias

cov

Bootstrap estimate of the covariance (if cov.mat = TRUE)

ci

Bootstrap estimate of the confidence interval

boot.dist

Bootstrap distribution of statistic (if boot.dist = TRUE)

bias.correct

Bias correction factor for BCa confidence interval.

acceleration

Acceleration parameter for BCa confidence interval.

The output list also contains the elements object, R, level, bca, method, fix.lambda, and fix.thetas, all of which are the same as the corresponding input arguments.
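
To make the returned quantities concrete, the sketch below computes textbook versions of se, bias, the percentile interval, and the BCa ingredients (bias correction factor and acceleration) following Efron & Tibshirani (1994). It uses a toy sample and the sample mean as the statistic; all names are assumptions for illustration, and npreg's exact computations (for example, its quantile conventions) may differ.

```r
# Illustrative sketch only: textbook bootstrap summaries for a toy statistic.
set.seed(1)
y <- rexp(50)                  # toy sample (not a fitted smooth)
stat <- function(d) mean(d)    # toy scalar statistic
t0 <- stat(y)                  # observed statistic

R <- 999
tboot <- replicate(R, stat(sample(y, replace = TRUE)))  # bootstrap distribution

se <- sd(tboot)                # bootstrap standard error
bias <- mean(tboot) - t0       # bootstrap bias estimate

# percentile interval
alpha <- 1 - 0.95
ci.perc <- quantile(tboot, c(alpha / 2, 1 - alpha / 2))

# BCa interval: bias correction factor z0 and jackknife acceleration
z0 <- qnorm(mean(tboot < t0))
jack <- sapply(seq_along(y), function(i) stat(y[-i]))
acc <- sum((mean(jack) - jack)^3) / (6 * sum((mean(jack) - jack)^2)^1.5)
z <- qnorm(c(alpha / 2, 1 - alpha / 2))
probs <- pnorm(z0 + (z0 + z) / (1 - acc * (z0 + z)))  # adjusted quantile levels
ci.bca <- quantile(tboot, probs)
```

When z0 and acc are both zero, the adjusted levels reduce to alpha/2 and 1 - alpha/2, so the BCa interval coincides with the percentile interval.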

Note

For gsm objects, requesting method = "resid" uses a variant of the one-step technique described in Moulton and Zeger (1991), which forms the bootstrap estimates of the coefficients without refitting the model.

As a result, when bootstrapping gsm objects with method = "resid":

(1) it is necessary to set fix.lambda = TRUE and fix.thetas = TRUE

(2) any sensible statistic must depend on the model coefficients, e.g., through the model predictions.

Author(s)

Nathaniel E. Helwig <helwig@umn.edu>

References

Davison, A. C., & Hinkley, D. V. (1997). Bootstrap Methods and Their Application. Cambridge University Press. doi: 10.1017/CBO9780511802843

Efron, B., & Tibshirani, R. J. (1994). An Introduction to the Bootstrap. Chapman & Hall/CRC. doi: 10.1201/9780429246593

Moulton, L. H., & Zeger, S. L. (1991). Bootstrapping generalized linear models. Computational Statistics & Data Analysis, 11(1), 53-63. doi: 10.1016/0167-9473(91)90052-4

See Also

ss for fitting "ss" (smoothing spline) objects

sm for fitting "sm" (smooth model) objects

gsm for fitting "gsm" (generalized smooth model) objects

Examples

## Not run: 

# load the package that provides ss(), sm(), gsm(), boot(), and plotci()
library(npreg)

##########   EXAMPLE 1   ##########
### smoothing spline

# generate data
set.seed(1)
n <- 100
x <- seq(0, 1, length.out = n)
fx <- 2 + 3 * x + sin(2 * pi * x)
y <- fx + rnorm(n, sd = 0.5)

# fit smoothing spline
ssfit <- ss(x, y, nknots = 10)

# nonparametric bootstrap cases
set.seed(0)
boot.cases <- boot(ssfit)

# nonparametric bootstrap residuals
set.seed(0)
boot.resid <- boot(ssfit, method = "resid")

# parametric bootstrap residuals
set.seed(0)
boot.param <- boot(ssfit, method = "param")

# plot results
par(mfrow = c(1, 3))
plot(boot.cases, main = "Cases")
plot(boot.resid, main = "Residuals")
plot(boot.param, main = "Parametric")



##########   EXAMPLE 2   ##########
### smooth model

# generate data
set.seed(1)
n <- 100
x <- seq(0, 1, length.out = n)
fx <- 2 + 3 * x + sin(2 * pi * x)
y <- fx + rnorm(n, sd = 0.5)

# fit smooth model
smfit <- sm(y ~ x, knots = 10)

# define statistic (to be equivalent to boot.ss default)
newdata <- data.frame(x = seq(0, 1, length.out = 201))
statfun <- function(object, newdata) predict(object, newdata)

# nonparametric bootstrap cases
set.seed(0)
boot.cases <- boot(smfit, statfun, newdata = newdata)

# nonparametric bootstrap residuals
set.seed(0)
boot.resid <- boot(smfit, statfun, newdata = newdata, method = "resid")

# parametric bootstrap residuals
set.seed(0)
boot.param <- boot(smfit, statfun, newdata = newdata, method = "param")
                   
# plot results
par(mfrow = c(1, 3))
plotci(newdata$x, boot.cases$t0, ci = boot.cases$ci, main = "Cases")
plotci(newdata$x, boot.resid$t0, ci = boot.resid$ci, main = "Residuals")
plotci(newdata$x, boot.param$t0, ci = boot.param$ci, main = "Parametric")



##########   EXAMPLE 3   ##########
### generalized smooth model

# generate data
set.seed(1)
n <- 100
x <- seq(0, 1, length.out = n)
fx <- 2 + 3 * x + sin(2 * pi * x)
y <- fx + rnorm(n, sd = 0.5)

# fit generalized smooth model
gsmfit <- gsm(y ~ x, knots = 10)

# define statistic (to be equivalent to boot.ss default)
newdata <- data.frame(x = seq(0, 1, length.out = 201))
statfun <- function(object, newdata) predict(object, newdata)

# nonparametric bootstrap cases
set.seed(0)
boot.cases <- boot(gsmfit, statfun, newdata = newdata)

# nonparametric bootstrap residuals
set.seed(0)
boot.resid <- boot(gsmfit, statfun, newdata = newdata, method = "resid")

# parametric bootstrap residuals
set.seed(0)
boot.param <- boot(gsmfit, statfun, newdata = newdata, method = "param")
                   
# plot results
par(mfrow = c(1, 3))
plotci(newdata$x, boot.cases$t0, ci = boot.cases$ci, main = "Cases")
plotci(newdata$x, boot.resid$t0, ci = boot.resid$ci, main = "Residuals")
plotci(newdata$x, boot.param$t0, ci = boot.param$ci, main = "Parametric")

## End(Not run)


npreg documentation built on July 21, 2022, 1:06 a.m.
