gsm: Fit a Generalized Smooth Model
In npreg: Nonparametric Regression via Smoothing Splines

View source: R/gsm.R

gsm	R Documentation

Fit a Generalized Smooth Model

Description

Fits a generalized semi- or nonparametric regression model with the smoothing parameter selected via one of seven methods: GCV, OCV, GACV, ACV, PQL, AIC, or BIC.

Usage

gsm(formula, family = gaussian, data, weights, types = NULL, tprk = TRUE, 
    knots = NULL, skip.iter = TRUE, spar = NULL, lambda = NULL, control = list(),
    method = c("GCV", "OCV", "GACV", "ACV", "PQL", "AIC", "BIC"),
    xrange = NULL, thetas = NULL, mf = NULL)
    
## S3 method for class 'gsm'
family(object, ...)

Arguments

Arguments for gsm:

`formula`	Object of class "formula" (or one that can be coerced to that class): a symbolic description of the model to be fitted. Uses the same syntax as `lm` and `glm`.
`family`	Description of the error distribution and link function to be used in the model. This can be a character string naming a family function, a family function, or the result of a call to a family function. See the "Family Objects" section for details.
`data`	Optional data frame, list or environment (or object coercible by `as.data.frame` to a data frame) containing the variables in the model. If not found in data, the variables are taken from `environment(formula)`, typically the environment from which `sm` is called.
`weights`	Optional vector of weights to be used in the fitting process. If provided, weighted (penalized) likelihood estimation is used. Defaults to all 1.
`types`	Named list giving the type of smooth to use for each predictor. If `NULL`, the type is inferred from the data. See "Types of Smooths" section for details.
`tprk`	Logical specifying how to parameterize smooth models with multiple predictors. If `TRUE` (default), a tensor product reproducing kernel function is used to represent the function. If `FALSE`, a tensor product of marginal kernel functions is used to represent the function. See the "Multiple Smooths" section for details.
`knots`	Spline knots for the estimation of the nonparametric effects. For models with multiple predictors, the knot specification will depend on the `tprk` input. See the "Choosing Knots" section for details
`skip.iter`	Set to `FALSE` for deep tuning of the hyperparameters. Only applicable when multiple smooth terms are included. See the "Parameter Tuning" section for details.
`spar`	Smoothing parameter. Typically (but not always) in the range `(0,1]`. If specified `lambda = 256^(3*(spar-1))`.
`lambda`	Computational smoothing parameter. This value is weighted by `n` to form the penalty coefficient (see Details). Ignored if `spar` is provided.
`control`	Optional list with named components that control the optimization specs for the smoothing parameter selection routine. Note that spar is only searched for in the interval `[lower, upper]`. lower: lower bound for spar; defaults to 0. upper: upper bound for spar; defaults to 1. tol: the absolute precision (tolerance) used by `optimize`; defaults to 1e-8. iterlim: the iteration limit used by `nlm`; defaults to 5000. print.level: the print level used by `nlm`; defaults to 0 (no printing). epsilon: relative convergence tolerance for IRPLS algorithm; defaults to 1e-8 maxit: maximum number of iterations for IRPLS algorithm; defaults to 25 epsilon.out: relative convergence tolerance for iterative NegBin update; defaults to 1e-6 maxit.out: maximum number of iterations for iterative NegBin update; defaults to 10
`method`	Method for selecting the smoothing parameter. Ignored if `lambda` is provided.
`xrange`	Optional named list containing the range of each predictor. If `NULL`, the ranges are calculated from the input `data`.
`thetas`	Optional vector of hyperparameters to use for smoothing. If `NULL`, these are tuned using the requested `method`.
`mf`	Optional model frame constructed from `formula` and `data` (and potentially `weights`).

Note: the last two arguments are not intended to be called by the typical user of this function. These arguments are included primarily for internal usage by the boot.gsm function.

Arguments for family.gsm:

`object`	an object of class "gsm"
`...`	additional arguments (currently ignored)

Details

Letting \eta_i = \eta(x_i) with x_i = (x_{i1}, \ldots, x_{ip}), the function is represented as

\eta = X \beta + Z \alpha

where the basis functions in X span the null space (i.e., parametric effects), and Z contains the kernel function(s) of the contrast space (i.e., nonparametric effects) evaluated at all combinations of observed data points and knots. The vectors \beta and \alpha contain unknown basis function coefficients.

Let \mu_i = E(y_i) denote the mean of the i-th response. The unknown function is related to the mean \mu_i such as

g(\mu_i) = \eta_i

where g() is a known link function. Note that this implies that \mu_i = g^{-1}(\eta_i) given that the link function is assumed to be invertible.

The penalized likelihood estimation problem has the form

- \sum_{i = 1}^n [y_i \xi_i - b(\xi_i)] + n \lambda \alpha' Q \alpha

where \xi_i is the canonical parameter, b() is a known function that depends on the chosen family, and Q is the penalty matrix. Note that \xi_i = g_0(\mu_i) where g_0 is the canonical link function. This implies that \xi_i = \eta_i when the chosen link function is canonical, i.e., when g = g_0.

Value

An object of class "gsm" with components:

`linear.predictors`	the linear fit on link scale. Use `fitted.gsm` to obtain the fitted values on the response scale.
`se.lp`	the standard errors of the linear predictors.
`deviance`	up to a constant, minus twice the maximized log-likelihood. Where sensible, the constant is chosen so that a saturated model has deviance zero.
`cv.crit`	the cross-validation criterion.
`nsdf`	the degrees of freedom (Df) for the null space.
`df`	the estimated degrees of freedom (Df) for the fit model.
`df.residual`	the residual degrees of freedom = `nobs - df`
`r.squared`	the squared correlation between response and fitted values.
`dispersion`	the estimated dispersion parameter.
`logLik`	the log-likelihood.
`aic`	Akaike's Information Criterion.
`bic`	Bayesian Information Criterion.
`spar`	the value of `spar` computed or given, i.e., `s = 1 + \log_{256}(\lambda)/3`
`lambda`	the value of `\lambda` corresponding to `spar`, i.e., `\lambda = 256^{3*(s-1)}`.
`penalty`	the smoothness penalty `\alpha' Q \alpha`.
`coefficients`	the basis function coefficients used for the fit model.
`cov.sqrt`	the square-root of the covariance matrix of `coefficients`. Note: `tcrossprod(cov.sqrt)` reconstructs the covariance matrix.
`specs`	a list with information used for prediction purposes: knots the spline knots used for each predictor. thetas the "extra" tuning parameters used to weight the penalties. xrng the ranges of the predictor variables. xlev the factor levels of the predictor variables (if applicable). tprk logical controlling the formation of tensor product smooths.
`data`	the data used to fit the model.
`types`	the type of smooth used for each predictor.
`terms`	the terms included in the fit model.
`method`	the `method` used for smoothing parameter selection. Will be `NULL` if `lambda` was provided.
`formula`	the formula specifying the fit model.
`weights`	the weights used for fitting (if applicable)
`call`	the matched call.
`family`	the input family evaluated as a function using .
`iter`	the number of iterations of IRPLS used.
`residuals`	the working (IRPLS) residuals from the fitted model.
`null.deviance`	the deviance of the null model (i.e., intercept only).

Family Objects

Supported families and links include:

`family`		`link`
binomial		logit, probit, cauchit, log, cloglog
gaussian		identity, log, inverse
Gamma		inverse, identity, log
inverse.gaussian		1/mu^2, inverse, identity, log
poisson		log, identity, sqrt
NegBin		log, identity, sqrt

See NegBin for information about the Negative Binomial family.

Methods

The smoothing parameter can be selected using one of seven methods:
Generalized Cross-Validation (GCV)
Ordinary Cross-Validation (OCV)
Generalized Approximate Cross-Validation (GACV)
Approximate Cross-Validation (ACV)
Penalized Quasi-Likelihood (PQL)
Akaike's Information Criterion (AIC)
Bayesian Information Criterion (BIC)

Types of Smooths

The following codes specify the spline types:

par	Parametric effect (factor, integer, or numeric).
ran	Random effect/intercept (unordered factor).
nom	Nominal smoothing spline (unordered factor).
ord	Ordinal smoothing spline (ordered factor).
lin	Linear smoothing spline (integer or numeric).
cub	Cubic smoothing spline (integer or numeric).
qui	Quintic smoothing spline (integer or numeric).
per	Periodic smoothing spline (integer or numeric).
sph	Spherical spline (matrix with `d = 2` columns: lat, long).
tps	Thin plate spline (matrix with `d \ge 1` columns).

For finer control of some specialized spline types:

per.lin	Linear periodic spline (`m = 1`).
per.cub	Cubic periodic spline (`m = 2`).
per.qui	Quintic periodic spline (`m = 3`).
sph.2	2nd order spherical spline (`m = 2`).
sph.3	3rd order spherical spline (`m = 3`).
sph.4	4th order spherical spline (`m = 4`).
tps.lin	Linear thin plate spline (`m = 1`).
tps.cub	Cubic thin plate spline (`m = 2`).
tps.qui	Quintic thin plate spline (`m = 3`).

For details on the spline kernel functions, see basis.nom (nominal), basis.ord (ordinal), basis.poly (polynomial), basis.sph (spherical), and basis.tps (thin plate).

Note: "ran" is default for unordered factors when the number of levels is 20 or more, whereas "nom" is the default for unordered factors otherwise.

Choosing Knots

If tprk = TRUE, the four options for the knots input include:

1.	a scalar giving the total number of knots to sample
2.	a vector of integers indexing which rows of data are the knots
3.	a list with named elements giving the marginal knot values for each predictor (to be combined via `expand.grid`)
4.	a list with named elements giving the knot values for each predictor (requires the same number of knots for each predictor)

If tprk = FALSE, the three options for the knots input include:

1.	a scalar giving the common number of knots for each continuous predictor
2.	a list with named elements giving the number of marginal knots for each predictor
3.	a list with named elements giving the marginal knot values for each predictor

Multiple Smooths

Suppose formula = y ~ x1 + x2 so that the model contains additive effects of two predictor variables.

The k-th predictor's marginal effect can be denoted as

f_k = X_k \beta_k + Z_k \alpha_k

where X_k is the n by m_k null space basis function matrix, and Z_k is the n by r_k contrast space basis function matrix.

If tprk = TRUE, the null space basis function matrix has the form X = [1, X_1, X_2] and the contrast space basis function matrix has the form

Z = \theta_1 Z_1 + \theta_2 Z_2

where the \theta_k are the "extra" smoothing parameters. Note that Z is of dimension n by r = r_1 = r_2.

If tprk = FALSE, the null space basis function matrix has the form X = [1, X_1, X_2], and the contrast space basis function matrix has the form

Z = [\theta_1 Z_1, \theta_2 Z_2]

where the \theta_k are the "extra" smoothing parameters. Note that Z is of dimension n by r = r_1 + r_2.

Parameter Tuning

When multiple smooth terms are included in the model, there are smoothing (hyper)parameters that weight the contribution of each combination of smooth terms. These hyperparameters are distinct from the overall smoothing parameter lambda that weights the contribution of the penalty.

skip.iter = TRUE (default) estimates the smoothing hyperparameters using Algorithm 3.2 of Gu and Wahba (1991), which typically provides adequate results when the model form is correctly specified. The lambda parameter is tuned via the specified smoothing parameter selection method.

skip.iter = FALSE uses Algorithm 3.2 as an initialization, and then the nlm function is used to tune the hyperparameters via the specified smoothing parameter selection method. Setting skip.iter = FALSE can (substantially) increase the model fitting time, but should produce better results—particularly if the model formula is misspecified.

Author(s)

Nathaniel E. Helwig <helwig@umn.edu>

References

Berry, L. N., & Helwig, N. E. (2021). Cross-validation, information theory, or maximum likelihood? A comparison of tuning methods for penalized splines. Stats, 4(3), 701-724. \Sexpr[results=rd]{tools:::Rd_expr_doi("10.3390/stats4030042")}

Craven, P. and Wahba, G. (1979). Smoothing noisy data with spline functions: Estimating the correct degree of smoothing by the method of generalized cross-validation. Numerische Mathematik, 31, 377-403. \Sexpr[results=rd]{tools:::Rd_expr_doi("10.1007/BF01404567")}

Gu, C. (2013). Smoothing spline ANOVA models, 2nd edition. New York: Springer. \Sexpr[results=rd]{tools:::Rd_expr_doi("10.1007/978-1-4614-5369-7")}

Gu, C. and Wahba, G. (1991). Minimizing GCV/GML scores with multiple smoothing parameters via the Newton method. SIAM Journal on Scientific and Statistical Computing, 12(2), 383-398. \Sexpr[results=rd]{tools:::Rd_expr_doi("10.1137/0912021")}

Helwig, N. E. (2020). Multiple and Generalized Nonparametric Regression. In P. Atkinson, S. Delamont, A. Cernat, J. W. Sakshaug, & R. A. Williams (Eds.), SAGE Research Methods Foundations. \Sexpr[results=rd]{tools:::Rd_expr_doi("10.4135/9781526421036885885")}

Helwig, N. E. (2021). Spectrally sparse nonparametric regression via elastic net regularized smoothers. Journal of Computational and Graphical Statistics, 30(1), 182-191. \Sexpr[results=rd]{tools:::Rd_expr_doi("10.1080/10618600.2020.1806855")}

Helwig, N. E. (2024). Precise tensor product smoothing via spectral splines. Stats,

Examples

##########   EXAMPLE 1   ##########
### 1 continuous predictor

# generate data
n <- 1000
x <- seq(0, 1, length.out = n)
fx <- 3 * x + sin(2 * pi * x) - 1.5

# gaussian (default)
set.seed(1)
y <- fx + rnorm(n, sd = 1/sqrt(2))
mod <- gsm(y ~ x, knots = 10)
plot(mod)
mean((mod$linear.predictors - fx)^2)

# compare to result from sm (they are identical)
mod.sm <- sm(y ~ x, knots = 10)
plot(mod.sm)
mean((mod$linear.predictors - mod.sm$fitted.values)^2)

# binomial (no weights)
set.seed(1)
y <- rbinom(n = n, size = 1, p = 1 / (1 + exp(-fx)))
mod <- gsm(y ~ x, family = binomial, knots = 10)
plot(mod)
mean((mod$linear.predictors - fx)^2)

# binomial (w/ weights)
set.seed(1)
w <- as.integer(rep(c(10,20,30,40,50), length.out = n))
y <- rbinom(n = n, size = w, p = 1 / (1 + exp(-fx))) / w
mod <- gsm(y ~ x, family = binomial, weights = w, knots = 10)
plot(mod)
mean((mod$linear.predictors - fx)^2)

# poisson
set.seed(1)
y <- rpois(n = n, lambda = exp(fx))
mod <- gsm(y ~ x, family = poisson, knots = 10)
plot(mod)
mean((mod$linear.predictors - fx)^2)

# negative binomial (known theta)
set.seed(1)
y <- rnbinom(n = n, size = 1/2, mu = exp(fx))
mod <- gsm(y ~ x, family = NegBin(theta = 1/2), knots = 10)
plot(mod)
mean((mod$linear.predictors - fx)^2)
mod$family$theta    # fixed theta

# negative binomial (unknown theta)
set.seed(1)
y <- rnbinom(n = n, size = 1/2, mu = exp(fx))
mod <- gsm(y ~ x, family = NegBin, knots = 10)
plot(mod)
mean((mod$linear.predictors - fx)^2)
mod$family$theta    # estimated theta

# gamma
set.seed(1)
y <- rgamma(n = n, shape = 2, scale = (1 / (2 + fx)) / 2)
mod <- gsm(y ~ x, family = Gamma, knots = 10)
plot(mod)
mean((mod$linear.predictors - fx - 2)^2)

# inverse.gaussian (not run; requires statmod)
##set.seed(1)
##y <- statmod::rinvgauss(n = n, mean = sqrt(1 / (2 + fx)), shape = 2)
##mod <- gsm(y ~ x, family = inverse.gaussian, knots = 10)
##plot(mod)
##mean((mod$linear.predictors - fx - 2)^2)



##########   EXAMPLE 2   ##########
### 1 continuous and 1 nominal predictor
### additive model

# generate data
n <- 1000
x <- seq(0, 1, length.out = n)
z <- factor(sample(letters[1:3], size = n, replace = TRUE))
fun <- function(x, z){
  mu <- c(-2, 0, 2)
  zi <- as.integer(z)
  fx <- mu[zi] + 3 * x + sin(2 * pi * x) - 1.5
}
fx <- fun(x, z)

# define marginal knots
probs <- seq(0, 0.9, by = 0.1)
knots <- list(x = quantile(x, probs = probs),
              z = letters[1:3])
              
# gaussian (default)
set.seed(1)
y <- fx + rnorm(n, sd = 1/sqrt(2))
mod <- gsm(y ~ x + z, knots = knots)
plot(mod)
mean((mod$linear.predictors - fx)^2)

# compare to result from sm (they are identical)
mod.sm <- sm(y ~ x + z, knots = knots)
plot(mod.sm)
mean((mod$linear.predictors - mod.sm$fitted.values)^2)

# binomial (no weights)
set.seed(1)
y <- rbinom(n = n, size = 1, p = 1 / (1 + exp(-fx)))
mod <- gsm(y ~ x + z, family = binomial, knots = knots)
plot(mod)
mean((mod$linear.predictors - fx)^2)

# binomial (w/ weights)
set.seed(1)
w <- as.integer(rep(c(10,20,30,40,50), length.out = n))
y <- rbinom(n = n, size = w, p = 1 / (1 + exp(-fx))) / w
mod <- gsm(y ~ x + z, family = binomial, weights = w, knots = knots)
plot(mod)
mean((mod$linear.predictors - fx)^2)

# poisson
set.seed(1)
y <- rpois(n = n, lambda = exp(fx))
mod <- gsm(y ~ x + z, family = poisson, knots = knots)
plot(mod)
mean((mod$linear.predictors - fx)^2)

# negative binomial (known theta)
set.seed(1)
y <- rnbinom(n = n, size = 1/2, mu = exp(fx))
mod <- gsm(y ~ x + z, family = NegBin(theta = 1/2), knots = knots)
plot(mod)
mean((mod$linear.predictors - fx)^2)
mod$family$theta    # fixed theta

# negative binomial (unknown theta)
set.seed(1)
y <- rnbinom(n = n, size = 1/2, mu = exp(fx))
mod <- gsm(y ~ x + z, family = NegBin, knots = knots)
plot(mod)
mean((mod$linear.predictors - fx)^2)
mod$family$theta    # estimated theta

# gamma
set.seed(1)
y <- rgamma(n = n, shape = 2, scale = (1 / (4 + fx)) / 2)
mod <- gsm(y ~ x + z, family = Gamma, knots = knots)
plot(mod)
mean((mod$linear.predictors - fx - 4)^2)

# inverse.gaussian (not run; requires statmod)
##set.seed(1)
##y <- statmod::rinvgauss(n = n, mean = sqrt(1 / (4 + fx)), shape = 2)
##mod <- gsm(y ~ x + z, family = inverse.gaussian, knots = knots)
##plot(mod)
##mean((mod$linear.predictors - fx - 4)^2)

npreg documentation built on May 29, 2024, 4:17 a.m.

npreg index

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

npreg
Nonparametric Regression via Smoothing Splines

gsm: Fit a Generalized Smooth Model
In npreg: Nonparametric Regression via Smoothing Splines

Fit a Generalized Smooth Model

Description

Usage

Arguments

Details

Value

Family Objects

Methods

Types of Smooths

Choosing Knots

Multiple Smooths

Parameter Tuning

Author(s)

References

See Also

Examples

Related to gsm in npreg...

R Package Documentation

Browse R Packages

We want your feedback!

npreg Nonparametric Regression via Smoothing Splines

gsm: Fit a Generalized Smooth Model In npreg: Nonparametric Regression via Smoothing Splines

Fit a Generalized Smooth Model

Description

Usage

Arguments

Details

Value

Family Objects

Methods

Types of Smooths

Choosing Knots

Multiple Smooths

Parameter Tuning

Author(s)

References

See Also

Examples

Related to gsm in npreg...

R Package Documentation

Browse R Packages

We want your feedback!

npreg
Nonparametric Regression via Smoothing Splines

gsm: Fit a Generalized Smooth Model
In npreg: Nonparametric Regression via Smoothing Splines