# smooth.influence: Nonparametric Regression Diagnostics In npreg: Nonparametric Regression via Smoothing Splines


## Nonparametric Regression Diagnostics

### Description

These functions provide the basic quantities used to form a variety of diagnostics for checking the quality of a fitted smoothing spline (fit by `ss`), smooth model (fit by `sm`), or generalized smooth model (fit by `gsm`).

### Usage

```
## S3 method for class 'ss'
influence(model, do.coef = TRUE, ...)
## S3 method for class 'sm'
influence(model, do.coef = TRUE, ...)
## S3 method for class 'gsm'
influence(model, do.coef = TRUE, ...)

smooth.influence(model, do.coef = TRUE)
```

### Arguments

| Argument | Description |
|---|---|
| `model` | an object of class "gsm" output by the `gsm` function, "sm" output by the `sm` function, or "ss" output by the `ss` function |
| `do.coef` | logical indicating if the changed `coefficients` are desired (see Details). |
| `...` | additional arguments (currently ignored) |
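As a minimal sketch of how the arguments fit together, the call below skips the per-case coefficient changes by setting `do.coef = FALSE`, and then calls `smooth.influence` directly. The simulated data mirror the Examples section; which components appear in the result when `do.coef = FALSE` is not stated above, so the sketch only prints their names rather than assuming them.

```r
library(npreg)

# simulated data (same setup as the Examples section)
set.seed(1)
n <- 100
x <- seq(0, 1, length.out = n)
y <- 2 + 3 * x + sin(2 * pi * x) + rnorm(n, sd = 0.5)
mod.ss <- ss(x, y, nknots = 10)

# S3 method, without the per-case coefficient changes
infl.quick <- influence(mod.ss, do.coef = FALSE)
names(infl.quick)

# direct call, with the default do.coef = TRUE
infl.full <- smooth.influence(mod.ss)
length(infl.full$hat)
```

Setting `do.coef = FALSE` is mainly useful when only the leverages or leave-one-out summaries are needed.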

### Details

These functions are inspired by the `influence` and `lm.influence` functions in R's stats package.

The functions documented in `smooth.influence.measures` provide a more user-friendly way of computing a variety of regression diagnostics.

For non-Gaussian `gsm` objects, these regression diagnostics are based on one-step approximations, which may be inadequate if a case has high influence.

For all models, the diagnostics are computed assuming that the smoothing parameters are fixed at the given values.

### Value

A list with the components

| Component | Description |
|---|---|
| `hat` | a vector containing the leverages, i.e., the diagonals of the smoothing matrix |
| `coefficients` | if `do.coef` is true, a matrix whose i-th row contains the change in the estimated coefficients which results when the i-th case is excluded from the fitting. |
| `deviance` | a vector whose i-th entry contains the deviance which results when the i-th case is excluded from the fitting. |
| `df` | a vector whose i-th entry contains the effective degrees-of-freedom which results when the i-th case is excluded from the fitting. |
| `sigma` | a vector whose i-th element contains the estimate of the residual standard deviation obtained when the i-th case is excluded from the fitting. |
| `wt.res` | a vector of weighted (or for class `gsm` rather deviance) residuals. |
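To make the `hat` and `sigma` components concrete, the following sketch uses `lm.influence` from the stats package, which the Details name as the inspiration for these functions. It is an illustration on an ordinary linear model, not on an npreg fit (an assumption made so the example needs only base R). For `lm`, the matrix that maps the responses to the fitted values plays the role of the smoothing matrix.

```r
# Illustration with stats::lm (not npreg): leverages and leave-one-out sigma
set.seed(1)
n <- 30
dat <- data.frame(x = seq(0, 1, length.out = n))
dat$y <- 2 + 3 * dat$x + rnorm(n, sd = 0.5)
fit <- lm(y ~ x, data = dat)
infl <- lm.influence(fit)

# leverages are the diagonal of the matrix H that maps y to the fitted values
X <- model.matrix(fit)
H <- X %*% solve(crossprod(X), t(X))
hat.diff <- max(abs(diag(H) - infl$hat))
hat.diff

# sigma[i] is the residual standard deviation after refitting without case i
i <- 5
fit.i <- lm(y ~ x, data = dat[-i, ])
sigma.diff <- abs(summary(fit.i)$sigma - infl$sigma[i])
sigma.diff
```

Both differences should be zero up to floating-point error for a linear model. For smoothing splines the analogous quantities are computed with the smoothing parameters held fixed, so a brute-force refit that re-tunes them need not agree.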

### Warning

The approximations used for `gsm` objects can result in `sigma` estimates being `NaN`.

### Note

The `coefficients` returned by `smooth.influence` (and the corresponding S3 `influence` methods) are the changes in the coefficients that result from dropping each case, i.e., θ - θ_i, where θ denotes the original coefficients obtained from the full sample of n observations and θ_i denotes the coefficients that result from dropping the i-th case.
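The sign convention θ - θ_i can be checked by brute force. The sketch below does so for `lm.influence` from the stats package, which follows the same convention; it is an illustration on an ordinary linear model and does not verify the npreg methods themselves.

```r
# Illustration with stats::lm (not npreg): sign convention for 'coefficients'
set.seed(1)
dat <- data.frame(x = seq(0, 1, length.out = 30))
dat$y <- 2 + 3 * dat$x + rnorm(30, sd = 0.5)
fit <- lm(y ~ x, data = dat)
infl <- lm.influence(fit)

# refit without case i and form theta - theta_i directly
i <- 5
theta <- coef(fit)
theta.i <- coef(lm(y ~ x, data = dat[-i, ]))
delta <- theta - theta.i

# the i-th row of 'coefficients' holds the same change
coef.diff <- max(abs(delta - infl$coefficients[i, ]))
coef.diff
```

To recover the leave-one-out coefficients themselves, subtract the i-th row of `coefficients` from the full-sample coefficients.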

### Author(s)

Nathaniel E. Helwig <helwig@umn.edu>

### References

See the list in the documentation for `influence.measures`

Chambers, J. M. (1992) Linear models. Chapter 4 of Statistical Models in S eds J. M. Chambers and T. J. Hastie, Wadsworth & Brooks/Cole.

### See Also

`ss`, `sm`, `gsm` for modeling functions

`smooth.influence.measures` for convenient summary

`diagnostic.plots` for regression diagnostic plots

### Examples

```
library(npreg)

# generate data
set.seed(1)
n <- 100
x <- seq(0, 1, length.out = n)
fx <- 2 + 3 * x + sin(2 * pi * x)
y <- fx + rnorm(n, sd = 0.5)

# fit models
mod.ss <- ss(x, y, nknots = 10)
mod.sm <- sm(y ~ x, knots = 10)
mod.gsm <- gsm(y ~ x, knots = 10)

# calculate influence
infl.ss <- influence(mod.ss)
infl.sm <- influence(mod.sm)
infl.gsm <- influence(mod.gsm)

# compare hat
mean((infl.ss$hat - infl.sm$hat)^2)
mean((infl.ss$hat - infl.gsm$hat)^2)
mean((infl.sm$hat - infl.gsm$hat)^2)

# compare deviance
mean((infl.ss$deviance - infl.sm$deviance)^2)
mean((infl.ss$deviance - infl.gsm$deviance)^2)
mean((infl.sm$deviance - infl.gsm$deviance)^2)

# compare df
mean((infl.ss$df - infl.sm$df)^2)
mean((infl.ss$df - infl.gsm$df)^2)
mean((infl.sm$df - infl.gsm$df)^2)

# compare sigma
mean((infl.ss$sigma - infl.sm$sigma)^2)
mean((infl.ss$sigma - infl.gsm$sigma)^2)
mean((infl.sm$sigma - infl.gsm$sigma)^2)

# compare residuals
mean((infl.ss$wt.res - infl.sm$wt.res)^2)
mean((infl.ss$wt.res - infl.gsm$dev.res)^2)
mean((infl.sm$wt.res - infl.gsm$dev.res)^2)

# NOTE: ss() coef only comparable to sm() and gsm() after rescaling
scale.sm <- rep(c(1, mod.sm$specs$thetas), times = c(2, 10))
scale.gsm <- rep(c(1, mod.gsm$specs$thetas), times = c(2, 10))
mean((coef(mod.ss) / scale.sm - coef(mod.sm))^2)
mean((coef(mod.ss) / scale.gsm - coef(mod.gsm))^2)
mean((coef(mod.sm) - coef(mod.gsm))^2)

# infl.ss$coefficients are *not* comparable to others
mean((infl.ss$coefficients - infl.sm$coefficients)^2)
mean((infl.ss$coefficients - infl.gsm$coefficients)^2)
mean((infl.sm$coefficients - infl.gsm$coefficients)^2)

```

npreg documentation built on July 21, 2022, 1:06 a.m.