boot: Bootstrap a Fit Smooth In npreg: Nonparametric Regression via Smoothing Splines

 boot R Documentation

Bootstrap a Fit Smooth

Description

Bootstraps a fitted nonparametric regression model to form confidence intervals (BCa or percentile) and standard error estimates.

Usage

```
## S3 method for class 'ss'
boot(object, statistic, ..., R = 9999, level = 0.95, bca = TRUE,
method = c("cases", "resid", "param"), fix.lambda = TRUE, cov.mat = FALSE,
boot.dist = FALSE, verbose = TRUE, parallel = FALSE, cl = NULL)

## S3 method for class 'sm'
boot(object, statistic, ..., R = 9999, level = 0.95, bca = TRUE,
method = c("cases", "resid", "param"), fix.lambda = TRUE,
fix.thetas = TRUE, cov.mat = FALSE, boot.dist = FALSE,
verbose = TRUE, parallel = FALSE, cl = NULL)

## S3 method for class 'gsm'
boot(object, statistic, ..., R = 9999, level = 0.95, bca = TRUE,
method = c("cases", "resid", "param"), fix.lambda = TRUE,
fix.thetas = TRUE, cov.mat = FALSE, boot.dist = FALSE,
verbose = TRUE, parallel = FALSE, cl = NULL)
```

Arguments

`object` a fit from `ss` (smoothing spline), `sm` (smooth model), or `gsm` (generalized smooth model)

`statistic` a function to compute the statistic (see Details)

`...` additional arguments to `statistic` function (optional)

`R` number of bootstrap resamples used to form bootstrap distribution

`level` confidence level for bootstrap confidence intervals

`bca` logical indicating whether to calculate BCa (default) or percentile intervals

`method` resampling method used to form bootstrap distribution

`fix.lambda` logical indicating whether the smoothing parameter should be fixed (default) or re-estimated for each bootstrap sample

`fix.thetas` logical indicating whether the "extra" smoothing parameters should be fixed (default) or re-estimated for each bootstrap sample. Only applicable to `sm` and `gsm` objects with multiple penalized terms.

`cov.mat` logical indicating whether the bootstrap estimate of the covariance matrix should be returned

`boot.dist` logical indicating whether the bootstrap distribution should be returned

`verbose` logical indicating whether the bootstrap progress bar should be printed

`parallel` logical indicating if the `parallel` package should be used for parallel computing (of the bootstrap distribution). Defaults to FALSE, which implements sequential computing.

`cl` cluster for parallel computing, which is used when `parallel = TRUE`. Note that if `parallel = TRUE` and `cl = NULL`, then the cluster is defined as `makeCluster(detectCores())`.
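The three `method` values correspond to standard resampling schemes (see Davison & Hinkley, 1997). The sketch below illustrates the general idea of each scheme in base R, using `lm` as a stand-in model. The helper `resample` and all variable names are hypothetical, and the package's own implementation may differ in its details (for example, in how residuals are adjusted before resampling).

```r
set.seed(1)
n <- 100
x <- seq(0, 1, length.out = n)
y <- 2 + 3 * x + rnorm(n, sd = 0.5)
dat <- data.frame(x = x, y = y)
fit <- lm(y ~ x, data = dat)   # stand-in model (assumption); not an npreg fit

# Hypothetical helper: draw one bootstrap sample under a given scheme
resample <- function(fit, dat, method = c("cases", "resid", "param")) {
  method <- match.arg(method)
  n <- nrow(dat)
  if (method == "cases") {
    # cases: resample (x, y) pairs with replacement
    return(dat[sample.int(n, replace = TRUE), , drop = FALSE])
  }
  yhat <- fitted(fit)
  if (method == "resid") {
    # resid: keep x fixed, add resampled residuals to the fitted values
    err <- sample(residuals(fit), size = n, replace = TRUE)
  } else {
    # param: keep x fixed, draw Gaussian errors with the estimated error sd
    err <- rnorm(n, mean = 0, sd = sigma(fit))
  }
  out <- dat
  out$y <- unname(yhat + err)
  out
}

# Bootstrap distribution of the slope under each scheme (R kept small for speed)
R <- 200
slopes <- sapply(c("cases", "resid", "param"), function(m) {
  replicate(R, coef(lm(y ~ x, data = resample(fit, dat, m)))[["x"]])
})
apply(slopes, 2, sd)   # bootstrap standard errors of the slope, one per scheme
```

The "cases" scheme makes no assumption about the error distribution but changes the design points in each resample, whereas "resid" and "param" hold the predictors fixed and rely on the fitted mean function being adequate.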

Details

The `statistic` function must satisfy the following two requirements:

(1) the first input must be the `object` of class `ss`, `sm`, or `gsm`

(2) the output must be a scalar or vector calculated from the `object`

In most applications, the `statistic` function will be the model predictions at some user-specified `newdata`, which can be passed to `statistic` using the `...` argument.

If `statistic` is not provided, then the function is internally defined to be the model predictions at an equidistant sequence (for `ss` objects) or the training data predictor scores (for `sm` and `gsm` objects).
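As a minimal sketch of the two requirements, the hypothetical function `stat_summary` below takes the fitted object as its first input and returns a length-two numeric vector computed from the model predictions. The function name, the choice of summaries, and the settings `R = 199` and `bca = FALSE` are illustrative assumptions chosen for speed; the `sm`, `predict`, and `boot` calls mirror those in the Examples section.

```r
# Hypothetical statistic: first input is the fit, output is a numeric vector
stat_summary <- function(object, newdata) {
  fit <- as.numeric(predict(object, newdata))  # model predictions at newdata
  c(mean(fit), max(fit) - min(fit))            # average level and range of the fit
}

# Usage sketch (runs only if the npreg package is installed)
if (requireNamespace("npreg", quietly = TRUE)) {
  library(npreg)
  set.seed(1)
  n <- 100
  x <- seq(0, 1, length.out = n)
  y <- 2 + 3 * x + sin(2 * pi * x) + rnorm(n, sd = 0.5)
  smfit <- sm(y ~ x, knots = 10)
  newdata <- data.frame(x = seq(0, 1, length.out = 201))
  # newdata reaches stat_summary through the ... argument of boot
  set.seed(0)
  b <- boot(smfit, stat_summary, newdata = newdata,
            R = 199, bca = FALSE, verbose = FALSE)
  b$ci   # bootstrap confidence intervals for the statistic
}
```

Because the statistic is recomputed for every bootstrap sample, it is worth keeping the function cheap and free of side effects.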

Value

Produces an object of class 'boot.ss', 'boot.sm', or 'boot.gsm', with the following elements:

`t0` Observed statistic, computed using `statistic(object, ...)`

`se` Bootstrap estimate of the standard error

`bias` Bootstrap estimate of the bias

`cov` Bootstrap estimate of the covariance (if `cov.mat = TRUE`)

`ci` Bootstrap estimate of the confidence interval

`boot.dist` Bootstrap distribution of statistic (if `boot.dist = TRUE`)

`bias.correct` Bias correction factor for BCa confidence interval.

`acceleration` Acceleration parameter for BCa confidence interval.

The output list also contains the elements `object`, `R`, `level`, `bca`, `method`, `fix.lambda`, and `fix.thetas`, all of which are the same as the corresponding input arguments.
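To make the returned quantities concrete, the base R sketch below computes a standard error, bias, percentile interval, and BCa interval from a bootstrap distribution, following the textbook definitions in Efron & Tibshirani (1994). The toy data, the scalar statistic, and all variable names are assumptions for illustration; this is not the package's internal code, which may differ in detail.

```r
set.seed(1)
y <- rexp(60)                       # toy data (assumption)
statistic <- function(v) mean(v)    # scalar statistic for simplicity
n <- length(y)
t0 <- statistic(y)                  # observed statistic

# Bootstrap distribution from resampling cases
R <- 1999
boot_dist <- replicate(R, statistic(sample(y, replace = TRUE)))

se   <- sd(boot_dist)               # bootstrap standard error
bias <- mean(boot_dist) - t0        # bootstrap estimate of bias

# Percentile interval: quantiles of the bootstrap distribution
level <- 0.95
alpha <- c((1 - level) / 2, 1 - (1 - level) / 2)
ci_perc <- quantile(boot_dist, probs = alpha, names = FALSE)

# BCa interval: adjust the quantile levels using two correction terms
z0 <- qnorm(mean(boot_dist < t0))   # bias correction factor
jack <- vapply(seq_len(n), function(i) statistic(y[-i]), numeric(1))
d <- mean(jack) - jack              # centered jackknife values
acc <- sum(d^3) / (6 * sum(d^2)^1.5)  # acceleration parameter
z <- qnorm(alpha)
alpha_bca <- pnorm(z0 + (z0 + z) / (1 - acc * (z0 + z)))
ci_bca <- quantile(boot_dist, probs = alpha_bca, names = FALSE)

rbind(percentile = ci_perc, BCa = ci_bca)
```

When `z0` and `acc` are both zero, the adjusted levels reduce to `alpha` and the BCa interval coincides with the percentile interval.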

Note

For `gsm` objects, requesting `method = "resid"` uses a variant of the one-step technique described in Moulton and Zeger (1991), which forms the bootstrap estimates of the coefficients without refitting the model.

As a result, when bootstrapping `gsm` objects with `method = "resid"`:

(1) it is necessary to set `fix.lambda = TRUE` and `fix.thetas = TRUE`

(2) any sensible `statistic` must depend on the model `coefficients`, e.g., through the model predictions

Author(s)

Nathaniel E. Helwig <helwig@umn.edu>

References

Davison, A. C., & Hinkley, D. V. (1997). Bootstrap Methods and Their Application. Cambridge University Press. doi: 10.1017/CBO9780511802843

Efron, B., & Tibshirani, R. J. (1994). An Introduction to the Bootstrap. Chapman & Hall/CRC. doi: 10.1201/9780429246593

Moulton, L. H., & Zeger, S. L. (1991). Bootstrapping generalized linear models. Computational Statistics & Data Analysis, 11(1), 53-63. doi: 10.1016/0167-9473(91)90052-4

See Also

`ss` for fitting "ss" (smoothing spline) objects

`sm` for fitting "sm" (smooth model) objects

`gsm` for fitting "gsm" (generalized smooth model) objects

Examples

```
## Not run:

library(npreg)

##########   EXAMPLE 1   ##########
### smoothing spline

# generate data
set.seed(1)
n <- 100
x <- seq(0, 1, length.out = n)
fx <- 2 + 3 * x + sin(2 * pi * x)
y <- fx + rnorm(n, sd = 0.5)

# fit smoothing spline
ssfit <- ss(x, y, nknots = 10)

# nonparametric bootstrap cases
set.seed(0)
boot.cases <- boot(ssfit)

# nonparametric bootstrap residuals
set.seed(0)
boot.resid <- boot(ssfit, method = "resid")

# parametric bootstrap residuals
set.seed(0)
boot.param <- boot(ssfit, method = "param")

# plot results
par(mfrow = c(1, 3))
plot(boot.cases, main = "Cases")
plot(boot.resid, main = "Residuals")
plot(boot.param, main = "Parametric")

##########   EXAMPLE 2   ##########
### smooth model

# generate data
set.seed(1)
n <- 100
x <- seq(0, 1, length.out = n)
fx <- 2 + 3 * x + sin(2 * pi * x)
y <- fx + rnorm(n, sd = 0.5)

# fit smooth model
smfit <- sm(y ~ x, knots = 10)

# define statistic (to be equivalent to boot.ss default)
newdata <- data.frame(x = seq(0, 1, length.out = 201))
statfun <- function(object, newdata) predict(object, newdata)

# nonparametric bootstrap cases
set.seed(0)
boot.cases <- boot(smfit, statfun, newdata = newdata)

# nonparametric bootstrap residuals
set.seed(0)
boot.resid <- boot(smfit, statfun, newdata = newdata, method = "resid")

# parametric bootstrap residuals
set.seed(0)
boot.param <- boot(smfit, statfun, newdata = newdata, method = "param")

# plot results
par(mfrow = c(1, 3))
plotci(newdata$x, boot.cases$t0, ci = boot.cases$ci, main = "Cases")
plotci(newdata$x, boot.resid$t0, ci = boot.resid$ci, main = "Residuals")
plotci(newdata$x, boot.param$t0, ci = boot.param$ci, main = "Parametric")

##########   EXAMPLE 3   ##########
### generalized smooth model

# generate data
set.seed(1)
n <- 100
x <- seq(0, 1, length.out = n)
fx <- 2 + 3 * x + sin(2 * pi * x)
y <- fx + rnorm(n, sd = 0.5)

# fit generalized smooth model
gsmfit <- gsm(y ~ x, knots = 10)

# define statistic (to be equivalent to boot.ss default)
newdata <- data.frame(x = seq(0, 1, length.out = 201))
statfun <- function(object, newdata) predict(object, newdata)

# nonparametric bootstrap cases
set.seed(0)
boot.cases <- boot(gsmfit, statfun, newdata = newdata)

# nonparametric bootstrap residuals
set.seed(0)
boot.resid <- boot(gsmfit, statfun, newdata = newdata, method = "resid")

# parametric bootstrap residuals
set.seed(0)
boot.param <- boot(gsmfit, statfun, newdata = newdata, method = "param")

# plot results
par(mfrow = c(1, 3))
plotci(newdata$x, boot.cases$t0, ci = boot.cases$ci, main = "Cases")
plotci(newdata$x, boot.resid$t0, ci = boot.resid$ci, main = "Residuals")
plotci(newdata$x, boot.param$t0, ci = boot.param$ci, main = "Parametric")

## End(Not run)

```

npreg documentation built on July 21, 2022, 1:06 a.m.