# smooth.influence: Nonparametric Regression Diagnostics In npreg: Nonparametric Regression via Smoothing Splines

 smooth.influence R Documentation

## Nonparametric Regression Diagnostics

### Description

These functions provide the basic quantities that are used to form a variety of diagnostics for checking the quality of a fit smoothing spline (fit by ss), smooth model (fit by sm), or generalized smooth model (fit by gsm).

### Usage

## S3 method for class 'ss'
influence(model, do.coef = TRUE, ...)
## S3 method for class 'sm'
influence(model, do.coef = TRUE, ...)
## S3 method for class 'gsm'
influence(model, do.coef = TRUE, ...)

smooth.influence(model, do.coef = TRUE)


### Arguments

| Argument | Description |
|---|---|
| model | an object of class "gsm" output by the gsm function, "sm" output by the sm function, or "ss" output by the ss function |
| do.coef | logical indicating if the changed coefficients are desired (see Details). |
| ... | additional arguments (currently ignored) |

### Details

Inspired by the influence and lm.influence functions in R's stats package.

The functions documented in smooth.influence.measures provide a more user-friendly way of computing a variety of regression diagnostics.
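As a rough sketch of how the two layers relate, the snippet below computes the raw quantities with influence and then calls the convenience functions. It assumes that smooth.influence.measures accepts the fitted model as its first argument and that npreg provides a hatvalues method for ss objects; both assumptions come from the smooth.influence.measures help page rather than this one, so consult that page for the exact signatures.

```r
library(npreg)

# simulated data, same design as the Examples section
set.seed(1)
n <- 100
x <- seq(0, 1, length.out = n)
y <- 2 + 3 * x + sin(2 * pi * x) + rnorm(n, sd = 0.5)
mod <- ss(x, y, nknots = 10)

# raw building blocks documented on this page
infl <- influence(mod)

# convenience wrappers (assumed signatures; see ?smooth.influence.measures)
meas <- smooth.influence.measures(mod)
lev <- hatvalues(mod)
```

The raw list is useful when a custom diagnostic is needed; the wrappers are the shorter route for standard summaries.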

For non-Gaussian gsm objects, these regression diagnostics are based on one-step approximations, which may be inadequate if a case has high influence.

For all models, the diagnostics are computed assuming that the smoothing parameters are fixed at the given values.

### Value

A list with the components

| Component | Description |
|---|---|
| hat | a vector containing the leverages, i.e., the diagonals of the smoothing matrix |
| coefficients | if do.coef is true, a matrix whose i-th row contains the change in the estimated coefficients which results when the i-th case is excluded from the fitting. |
| deviance | a vector whose i-th entry contains the deviance which results when the i-th case is excluded from the fitting. |
| df | a vector whose i-th entry contains the effective degrees-of-freedom which results when the i-th case is excluded from the fitting. |
| sigma | a vector whose i-th element contains the estimate of the residual standard deviation obtained when the i-th case is excluded from the fitting. |
| wt.res | a vector of weighted residuals (for class gsm, deviance residuals instead, which the Examples access as dev.res). |
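The following sketch shows one way these components might be combined into simple diagnostics for a Gaussian ss fit. The leverage cutoff of twice the average is a common rule of thumb rather than an npreg default, and the studentized residual formula mirrors what stats::rstudent does for lm objects; npreg's own diagnostic functions may differ in detail.

```r
library(npreg)

# simulated data, same design as the Examples section
set.seed(1)
n <- 100
x <- seq(0, 1, length.out = n)
y <- 2 + 3 * x + sin(2 * pi * x) + rnorm(n, sd = 0.5)
mod <- ss(x, y, nknots = 10)
infl <- influence(mod)

# flag cases whose leverage exceeds twice the average leverage
# (rule of thumb chosen for illustration only)
high.lev <- which(infl$hat > 2 * mean(infl$hat))

# externally studentized residuals, built as in stats::rstudent for lm:
# residual / (leave-one-out sigma * sqrt(1 - leverage))
rstud <- infl$wt.res / (infl$sigma * sqrt(1 - infl$hat))

# cases whose removal shrinks the residual standard deviation the most
head(order(infl$sigma), 5)
```

Because the smoothing parameters are held fixed, these values describe the sensitivity of the given fit, not of a full refit with re-tuned smoothing parameters.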

### Warning

The approximations used for gsm objects can result in sigma estimates being NaN.

### Note

The coefficients returned by smooth.influence (and the corresponding S3 influence methods) are the changes in the coefficients that result from dropping each case, i.e., $\theta - \theta_i$, where $\theta$ denotes the original coefficients obtained from the full sample of n observations and $\theta_i$ denotes the coefficients that result from dropping the i-th case.
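Given that sign convention, the leave-one-out coefficients themselves can be recovered by subtracting each row of the returned matrix from the full-sample coefficients. The sketch below does this for an sm fit; the object names theta and theta.loo are illustrative only.

```r
library(npreg)

# simulated data, same design as the Examples section
set.seed(1)
n <- 100
x <- seq(0, 1, length.out = n)
y <- 2 + 3 * x + sin(2 * pi * x) + rnorm(n, sd = 0.5)
mod <- sm(y ~ x, knots = 10)
infl <- influence(mod)

# full-sample coefficients
theta <- coef(mod)

# row i of infl$coefficients holds theta - theta_i, so
# theta_i = theta - (theta - theta_i); smoothing parameters are held fixed
theta.full <- matrix(theta, nrow = n, ncol = length(theta), byrow = TRUE)
theta.loo <- theta.full - infl$coefficients

# coefficients with the first case dropped
theta.loo[1, ]
```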

### Author(s)

Nathaniel E. Helwig <helwig@umn.edu>

### References

See the list in the documentation for influence.measures

Chambers, J. M. (1992) Linear models. Chapter 4 of Statistical Models in S eds J. M. Chambers and T. J. Hastie, Wadsworth & Brooks/Cole.

### See Also

ss, sm, gsm for modeling functions

smooth.influence.measures for convenient summary

diagnostic.plots for regression diagnostic plots

### Examples

# load package
library(npreg)

# generate data
set.seed(1)
n <- 100
x <- seq(0, 1, length.out = n)
fx <- 2 + 3 * x + sin(2 * pi * x)
y <- fx + rnorm(n, sd = 0.5)

# fit models
mod.ss <- ss(x, y, nknots = 10)
mod.sm <- sm(y ~ x, knots = 10)
mod.gsm <- gsm(y ~ x, knots = 10)

# calculate influence
infl.ss <- influence(mod.ss)
infl.sm <- influence(mod.sm)
infl.gsm <- influence(mod.gsm)

# compare hat
mean((infl.ss$hat - infl.sm$hat)^2)
mean((infl.ss$hat - infl.gsm$hat)^2)
mean((infl.sm$hat - infl.gsm$hat)^2)

# compare deviance
mean((infl.ss$deviance - infl.sm$deviance)^2)
mean((infl.ss$deviance - infl.gsm$deviance)^2)
mean((infl.sm$deviance - infl.gsm$deviance)^2)

# compare df
mean((infl.ss$df - infl.sm$df)^2)
mean((infl.ss$df - infl.gsm$df)^2)
mean((infl.sm$df - infl.gsm$df)^2)

# compare sigma
mean((infl.ss$sigma - infl.sm$sigma)^2)
mean((infl.ss$sigma - infl.gsm$sigma)^2)
mean((infl.sm$sigma - infl.gsm$sigma)^2)

# compare residuals
mean((infl.ss$wt.res - infl.sm$wt.res)^2)
mean((infl.ss$wt.res - infl.gsm$dev.res)^2)
mean((infl.sm$wt.res - infl.gsm$dev.res)^2)

# NOTE: ss() coef only comparable to sm() and gsm() after rescaling
scale.sm <- rep(c(1, mod.sm$specs$thetas), times = c(2, 10))
scale.gsm <- rep(c(1, mod.gsm$specs$thetas), times = c(2, 10))
mean((coef(mod.ss) / scale.sm - coef(mod.sm))^2)
mean((coef(mod.ss) / scale.gsm - coef(mod.gsm))^2)
mean((coef(mod.sm) - coef(mod.gsm))^2)

# infl.ss$coefficients are *not* comparable to others
mean((infl.ss$coefficients - infl.sm$coefficients)^2)
mean((infl.ss$coefficients - infl.gsm$coefficients)^2)
mean((infl.sm$coefficients - infl.gsm$coefficients)^2)



npreg documentation built on May 29, 2024, 4:17 a.m.