gsm: Fit a Generalized Smooth Model

Description Usage Arguments Details Value Family Objects Methods Types of Smooths Choosing Knots Multiple Smooths Note Author(s) References See Also Examples

View source: R/gsm.R

Description

Fits a generalized semi- or nonparametric regression model with the smoothing parameter selected via one of seven methods: GCV, OCV, GACV, ACV, PQL, AIC, or BIC.

Usage

1
2
3
gsm(formula, family = gaussian, data, weights, types = NULL, tprk = TRUE, 
    knots = NULL, update = TRUE, spar = NULL, lambda = NULL, control = list(),
    method = c("GCV", "OCV", "GACV", "ACV", "PQL", "AIC", "BIC"))

Arguments

formula

Object of class "formula" (or one that can be coerced to that class): a symbolic description of the model to be fitted. Uses the same syntax as lm and glm.

family

Description of the error distribution and link function to be used in the model. This can be a character string naming a family function, a family function, or the result of a call to a family function. See the "Family Objects" section for details.

data

Optional data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables in the model. If not found in data, the variables are taken from environment(formula), typically the environment from which sm is called.

weights

Optional vector of weights to be used in the fitting process. If provided, weighted (penalized) likelihood estimation is used. Defaults to all 1.

types

Named list giving the type of smooth to use for each predictor. If NULL, the type is inferred from the data. See "Types of Smooths" section for details.

tprk

Logical specifying how to parameterize smooth models with multiple predictors. If TRUE (default), a tensor product reproducing kernel function is used to represent the function. If FALSE, a tensor product of marginal kernel functions is used to represent the function. See the "Multiple Smooths" section for details.

knots

Spline knots for the estimation of the nonparametric effects. For models with multiple predictors, the knot specification will depend on the tprk input. See the "Choosing Knots" section for details

update

If TRUE, steps 1-2 of Gu and Wahba's (1991) algorithm 3.2 are used to update the "extra" smoothing parameters. If FALSE, only step 1 of algorithm 3.2 is used, so each effect is given equal influence on the penalty. Only applicable when multiple smooth terms are included.

spar

Smoothing parameter. Typically (but not always) in the range (0,1]. If specified lambda = 256^(3*(spar-1)).

lambda

Computational smoothing parameter. This value is weighted by n to form the penalty coefficient (see Details). Ignored if spar is provided.

control

Optional list with named components controlling the root finding when the smoothing parameter spar is computed, i.e., missing or NULL, see below.

Note that spar is only searched for in the interval [lower, upper].

lower:

lower bound for spar; defaults to 0.

upper:

upper bound for spar; defaults to 1.

tol:

the absolute precision (tolerance) used by optimize; defaults to 1e-8.

method

Method for selecting the smoothing parameter. Ignored if lambda is provided.

Details

Letting η_i = η(x_i) with x_i = (x_{i1}, …, x_{ip}), the function is represented as

η = X β + Z γ

where the basis functions in X span the null space (i.e., parametric effects), and Z contains the kernel function(s) of the contrast space (i.e., nonparametric effects) evaluated at all combinations of observed data points and knots. The vectors β and γ contain unknown basis function coefficients.

Let μ_i = E(y_i) denote the mean of the i-th response. The unknown function is related to the mean μ_i such as

g(μ_i) = η_i

where g() is a known link function. Note that this implies that μ_i = g^{-1}(η_i) given that the link function is assumed to be invertible.

The penalized likelihood estimation problem has the form

- ∑_{i = 1}^n [y_i θ_i - b(θ_i)] + n λ γ' Q γ

where θ_i is the canonical parameter, b() is a known function that depends on the chosen family, and Q is the penalty matrix. Note that θ_i = g_0(μ_i) where g_0 is the canonical link function. This implies that θ_i = η_i when the chosen link function is canonical, i.e., when g = g_0.

Value

An object of class "gsm" with components:

linear.predictors

the linear fit on link scale. See the Note for obtaining the fitted values on the response scale.

se.lp

the standard errors of the linear predictors.

deviance

up to a constant, minus twice the maximized log-likelihood. Where sensible, the constant is chosen so that a saturated model has deviance zero.

cv.crit

the cross-validation criterion.

df

the estimated degrees of freedom (Df) for the fit model.

nsdf

the degrees of freedom (Df) for the null space.

r.squared

the squared correlation between response and fitted values.

dispersion

the estimated dispersion parameter.

logLik

the log-likelihood.

aic

Akaike's Information Criterion.

bic

Bayesian Information Criterion.

spar

the value of spar computed or given, i.e., s = 1 + \log_{256}(λ)/3

lambda

the value of λ corresponding to spar, i.e., λ = 256^{3*(s-1)}.

penalty

the smoothness penalty γ' Q γ.

coefficients

the basis function coefficients used for the fit model.

cov.sqrt

the square-root of the covariance matrix of coefficients. Note: tcrossprod(cov.sqrt) reconstructs the covariance matrix.

specs

a list with information used for prediction purposes:

knots

the spline knots used for each predictor.

thetas

the "extra" tuning parameters used to weight the penalties.

xrng

the ranges of the predictor variables.

xlev

the factor levels of the predictor variables (if applicable).

tprk

logical controlling the formation of tensor product smooths.

data

the data used to fit the model.

types

the type of smooth used for each predictor.

terms

the terms included in the fit model.

method

the method used for smoothing parameter selection. Will be NULL if lambda was provided.

formula

the formula specifying the fit model.

call

the matched call.

family

the input family evaluated as a function using .

iter

the number of iterations of IRPLS used.

residuals

the working (IRPLS) residuals from the fitted model.

null.deviance

the deviance of the null model (i.e., intercept only).

Family Objects

Supported families and links include:

family link
binomial logit, probit, cauchit, log, cloglog
gaussian identity, log, inverse
Gamma inverse, identity, log
inverse.gaussian 1/mu^2, inverse, identity, log
poisson log, identity, sqrt
NegBin log, identity, sqrt

See NegBin for information about the Negative Binomial family.

Methods

The smoothing parameter can be selected using one of seven methods:
Generalized Cross-Validation (GCV)
Ordinary Cross-Validation (OCV)
Generalized Approximate Cross-Validation (GACV)
Approximate Cross-Validation (ACV)
Penalized Quasi-Likelihood (PQL)
Akaike's Information Criterion (AIC)
Bayesian Information Criterion (BIC)

Types of Smooths

The following codes specify the spline types:

par Parametric effect (factor, integer, or numeric).
nom Nominal smoothing spline (unordered factor).
ord Ordinal smoothing spline (ordered factor).
lin Linear smoothing spline (integer or numeric).
cub Cubic smoothing spline (integer or numeric).
qui Quintic smoothing spline (integer or numeric).
per Periodic smoothing spline (integer or numeric).
sph Spherical spline (matrix with d = 3 columns).
tps Thin-plate spline (matrix with d ≥ 1 columns).

For finer control of some specialized spline types:

per.lin Linear periodic spline (m = 1).
per.cub Cubic periodic spline (m = 2).
per.qui Quintic periodic spline (m = 3).
sph.lin Linear spherical spline (m = 1).
sph.cub Cubic spherical spline (m = 2).
sph.qui Quintic spherical spline (m = 3).
tps.lin Linear thin-plate spline (m = 1).
tps.cub Cubic thin-plate spline (m = 2).
tps.qui Quintic thin-plate spline (m = 3).

For details on the spline kernel functions, see basis.nom (nominal), basis.ord (ordinal), basis.poly (polynomial), basis.sph (spherical), and basis.tps (thin-plate).

Choosing Knots

If tprk = TRUE, the four options for the knots input include:

1. a scalar giving the total number of knots to sample
2. a vector of integers indexing which rows of data are the knots
3. a list with named elements giving the marginal knot values for each predictor (to be combined via expand.grid)
4. a list with named elements giving the knot values for each predictor (requires the same number of knots for each predictor)

If tprk = FALSE, the three options for the knots input include:

1. a scalar giving the common number of knots for each continuous predictor
2. a list with named elements giving the number of marginal knots for each predictor
3. a list with named elements giving the marginal knot values for each predictor

Multiple Smooths

Suppose formula = y ~ x1 + x2 so that the model contains additive effects of two predictor variables.

The k-th predictor's marginal effect can be denoted as

f_k = X_k β_k + Z_k γ_k

where X_k is the n by m_k null space basis function matrix, and Z_k is the n by r_k contrast space basis function matrix.

If tprk = TRUE, the null space basis function matrix has the form X = [1, X_1, X_2] and the contrast space basis function matrix has the form

Z = θ_1 Z_1 + θ_2 Z_2

where the θ_k are the "extra" smoothing parameters. Note that Z is of dimension n by r = r_1 = r_2.

If tprk = FALSE, the null space basis function matrix has the form X = [1, X_1, X_2], and the contrast space basis function matrix has the form

Z = [θ_1 Z_1, θ_2 Z_2]

where the θ_k are the "extra" smoothing parameters. Note that Z is of dimension n by r = r_1 + r_2.

Note

The fitted values on the response scale can be obtained using
ginv <- object$family$linkinv
fit <- ginv(object$linear.predictors)
where object is the fit "gsm" object.

For models with multiple predictors, the predict.gsm function may be more useful.

Author(s)

Nathaniel E. Helwig <helwig@umn.edu>

References

Craven, P. and Wahba, G. (1979). Smoothing noisy data with spline functions: Estimating the correct degree of smoothing by the method of generalized cross-validation. Numerische Mathematik, 31, 377-403. doi: 10.1007/BF01404567

Gu, C. (2013). Smoothing spline ANOVA models, 2nd edition. New York: Springer. doi: 10.1007/978-1-4614-5369-7

Gu, C. and Wahba, G. (1991). Minimizing GCV/GML scores with multiple smoothing parameters via the Newton method. SIAM Journal on Scientific and Statistical Computing, 12(2), 383-398. doi: 10.1137/0912021

Helwig, N. E. (2020). Multiple and Generalized Nonparametric Regression. In P. Atkinson, S. Delamont, A. Cernat, J. W. Sakshaug, & R. A. Williams (Eds.), SAGE Research Methods Foundations. doi: 10.4135/9781526421036885885

Helwig, N. E. (2020+). Spectrally sparse nonparametric regression via elastic net regularized smoothers. Journal of Computational and Graphical Statistics. doi: 10.1080/10618600.2020.1806855

See Also

summary.gsm for summarizing gsm objects.

predict.gsm for predicting from gsm objects.

sm for fitting smooth models to Gaussian data.

Examples

  1
  2
  3
  4
  5
  6
  7
  8
  9
 10
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20
 21
 22
 23
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33
 34
 35
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
##########   EXAMPLE 1   ##########
### 1 continuous predictor

# generate data
n <- 1000
x <- seq(0, 1, length.out = n)
fx <- 3 * x + sin(2 * pi * x) - 1.5

# gaussian (default)
set.seed(1)
y <- fx + rnorm(n, sd = 1/sqrt(2))
mod <- gsm(y ~ x, knots = 10)
mean((mod$linear.predictors - fx)^2)

# compare to result from sm (they are identical)
mod.sm <- sm(y ~ x, knots = 10)
mean((mod$linear.predictors - mod.sm$fitted.values)^2)

# binomial (no weights)
set.seed(1)
y <- rbinom(n = n, size = 1, p = 1 / (1 + exp(-fx)))
mod <- gsm(y ~ x, family = binomial, knots = 10)
mean((mod$linear.predictors - fx)^2)

# binomial (w/ weights)
set.seed(1)
w <- as.integer(rep(c(10,20,30,40,50), length.out = n))
y <- rbinom(n = n, size = w, p = 1 / (1 + exp(-fx))) / w
mod <- gsm(y ~ x, family = binomial, weights = w, knots = 10)
mean((mod$linear.predictors - fx)^2)

# poisson
set.seed(1)
y <- rpois(n = n, lambda = exp(fx))
mod <- gsm(y ~ x, family = poisson, knots = 10)
mean((mod$linear.predictors - fx)^2)

# negative binomial (known theta)
set.seed(1)
y <- rnbinom(n = n, size = 1/2, mu = exp(fx))
mod <- gsm(y ~ x, family = NegBin(theta = 1/2), knots = 10)
mean((mod$linear.predictors - fx)^2)
mod$family$theta    # fixed theta

# negative binomial (unknown theta)
set.seed(1)
y <- rnbinom(n = n, size = 1/2, mu = exp(fx))
mod <- gsm(y ~ x, family = NegBin, knots = 10)
mean((mod$linear.predictors - fx)^2)
mod$family$theta    # estimated theta

# gamma
set.seed(1)
y <- rgamma(n = n, shape = 2, scale = (1 / (2 + fx)) / 2)
mod <- gsm(y ~ x, family = Gamma, knots = 10)
mean((mod$linear.predictors - fx - 2)^2)

# inverse.gaussian (not run; requires statmod)
##set.seed(1)
##y <- statmod::rinvgauss(n = n, mean = sqrt(1 / (2 + fx)), shape = 2)
##mod <- gsm(y ~ x, family = inverse.gaussian, knots = 10)
##mean((mod$linear.predictors - fx - 2)^2)



##########   EXAMPLE 2   ##########
### 1 continuous and 1 nominal predictor
### additive model

# generate data
n <- 1000
x <- seq(0, 1, length.out = n)
z <- factor(sample(letters[1:3], size = n, replace = TRUE))
fun <- function(x, z){
  mu <- c(-2, 0, 2)
  zi <- as.integer(z)
  fx <- mu[zi] + 3 * x + sin(2 * pi * x) - 1.5
}
fx <- fun(x, z)

# define marginal knots
probs <- seq(0, 0.9, by = 0.1)
knots <- list(x = quantile(x, probs = probs),
              z = letters[1:3])
              
# gaussian (default)
set.seed(1)
y <- fx + rnorm(n, sd = 1/sqrt(2))
mod <- gsm(y ~ x + z, knots = knots)
mean((mod$linear.predictors - fx)^2)

# compare to result from sm (they are identical)
mod.sm <- sm(y ~ x + z, knots = knots)
mean((mod$linear.predictors - mod.sm$fitted.values)^2)

# binomial (no weights)
set.seed(1)
y <- rbinom(n = n, size = 1, p = 1 / (1 + exp(-fx)))
mod <- gsm(y ~ x + z, family = binomial, knots = knots)
mean((mod$linear.predictors - fx)^2)

# binomial (w/ weights)
set.seed(1)
w <- as.integer(rep(c(10,20,30,40,50), length.out = n))
y <- rbinom(n = n, size = w, p = 1 / (1 + exp(-fx))) / w
mod <- gsm(y ~ x + z, family = binomial, weights = w, knots = knots)
mean((mod$linear.predictors - fx)^2)

# poisson
set.seed(1)
y <- rpois(n = n, lambda = exp(fx))
mod <- gsm(y ~ x + z, family = poisson, knots = knots)
mean((mod$linear.predictors - fx)^2)

# negative binomial (known theta)
set.seed(1)
y <- rnbinom(n = n, size = 1/2, mu = exp(fx))
mod <- gsm(y ~ x + z, family = NegBin(theta = 1/2), knots = knots)
mean((mod$linear.predictors - fx)^2)
mod$family$theta    # fixed theta

# negative binomial (unknown theta)
set.seed(1)
y <- rnbinom(n = n, size = 1/2, mu = exp(fx))
mod <- gsm(y ~ x + z, family = NegBin, knots = knots)
mean((mod$linear.predictors - fx)^2)
mod$family$theta    # estimated theta

# gamma
set.seed(1)
y <- rgamma(n = n, shape = 2, scale = (1 / (4 + fx)) / 2)
mod <- gsm(y ~ x + z, family = Gamma, knots = knots)
mean((mod$linear.predictors - fx - 4)^2)

# inverse.gaussian (not run; requires statmod)
##set.seed(1)
##y <- statmod::rinvgauss(n = n, mean = sqrt(1 / (4 + fx)), shape = 2)
##mod <- gsm(y ~ x + z, family = inverse.gaussian, knots = knots)
##mean((mod$linear.predictors - fx - 4)^2)

npreg documentation built on April 23, 2021, 9:07 a.m.

Related to gsm in npreg...