gsm | R Documentation |
Fits a generalized semi- or nonparametric regression model with the smoothing parameter selected via one of seven methods: GCV, OCV, GACV, ACV, PQL, AIC, or BIC.
gsm(formula, family = gaussian, data, weights, types = NULL, tprk = TRUE, knots = NULL, skip.iter = TRUE, spar = NULL, lambda = NULL, control = list(), method = c("GCV", "OCV", "GACV", "ACV", "PQL", "AIC", "BIC"), xrange = NULL, thetas = NULL, mf = NULL) ## S3 method for class 'gsm' family(object, ...)
Arguments for gsm
:
formula |
Object of class "formula" (or one that can be coerced to that class): a symbolic description of the model to be fitted. Uses the same syntax as |
family |
Description of the error distribution and link function to be used in the model. This can be a character string naming a family function, a family function, or the result of a call to a family function. See the "Family Objects" section for details. |
data |
Optional data frame, list or environment (or object coercible by |
weights |
Optional vector of weights to be used in the fitting process. If provided, weighted (penalized) likelihood estimation is used. Defaults to all 1. |
types |
Named list giving the type of smooth to use for each predictor. If |
tprk |
Logical specifying how to parameterize smooth models with multiple predictors. If |
knots |
Spline knots for the estimation of the nonparametric effects. For models with multiple predictors, the knot specification will depend on the |
skip.iter |
Set to |
spar |
Smoothing parameter. Typically (but not always) in the range (0,1]. If specified |
lambda |
Computational smoothing parameter. This value is weighted by n to form the penalty coefficient (see Details). Ignored if |
control |
Optional list with named components that control the optimization specs for the smoothing parameter selection routine. Note that spar is only searched for in the interval [lower, upper].
|
method |
Method for selecting the smoothing parameter. Ignored if |
xrange |
Optional named list containing the range of each predictor. If |
thetas |
Optional vector of hyperparameters to use for smoothing. If |
mf |
Optional model frame constructed from |
Note: the last two arguments are not intended to be called by the typical user of this function. These arguments are included primarily for internal usage by the boot.gsm
function.
Arguments for family.gsm
:
object |
an object of class "gsm" |
... |
additional arguments (currently ignored) |
Letting η_i = η(x_i) with x_i = (x_{i1}, …, x_{ip}), the function is represented as
η = X β + Z α
where the basis functions in X span the null space (i.e., parametric effects), and Z contains the kernel function(s) of the contrast space (i.e., nonparametric effects) evaluated at all combinations of observed data points and knots. The vectors β and α contain unknown basis function coefficients.
Let μ_i = E(y_i) denote the mean of the i-th response. The unknown function is related to the mean μ_i such as
g(μ_i) = η_i
where g() is a known link function. Note that this implies that μ_i = g^{-1}(η_i) given that the link function is assumed to be invertible.
The penalized likelihood estimation problem has the form
- ∑_{i = 1}^n [y_i ξ_i - b(ξ_i)] + n λ α' Q α
where ξ_i is the canonical parameter, b() is a known function that depends on the chosen family, and Q is the penalty matrix. Note that ξ_i = g_0(μ_i) where g_0 is the canonical link function. This implies that ξ_i = η_i when the chosen link function is canonical, i.e., when g = g_0.
An object of class "gsm" with components:
linear.predictors |
the linear fit on link scale. Use |
se.lp |
the standard errors of the linear predictors. |
deviance |
up to a constant, minus twice the maximized log-likelihood. Where sensible, the constant is chosen so that a saturated model has deviance zero. |
cv.crit |
the cross-validation criterion. |
nsdf |
the degrees of freedom (Df) for the null space. |
df |
the estimated degrees of freedom (Df) for the fit model. |
df.residual |
the residual degrees of freedom = |
r.squared |
the squared correlation between response and fitted values. |
dispersion |
the estimated dispersion parameter. |
logLik |
the log-likelihood. |
aic |
Akaike's Information Criterion. |
bic |
Bayesian Information Criterion. |
spar |
the value of |
lambda |
the value of λ corresponding to |
penalty |
the smoothness penalty α' Q α. |
coefficients |
the basis function coefficients used for the fit model. |
cov.sqrt |
the square-root of the covariance matrix of |
specs |
a list with information used for prediction purposes:
|
data |
the data used to fit the model. |
types |
the type of smooth used for each predictor. |
terms |
the terms included in the fit model. |
method |
the |
formula |
the formula specifying the fit model. |
weights |
the weights used for fitting (if applicable) |
call |
the matched call. |
family |
the input family evaluated as a function using . |
iter |
the number of iterations of IRPLS used. |
residuals |
the working (IRPLS) residuals from the fitted model. |
null.deviance |
the deviance of the null model (i.e., intercept only). |
Supported families and links include:
family | link |
|
binomial | logit, probit, cauchit, log, cloglog | |
gaussian | identity, log, inverse | |
Gamma | inverse, identity, log | |
inverse.gaussian | 1/mu^2, inverse, identity, log | |
poisson | log, identity, sqrt | |
NegBin | log, identity, sqrt | |
See NegBin
for information about the Negative Binomial family.
The smoothing parameter can be selected using one of seven methods:
Generalized Cross-Validation (GCV)
Ordinary Cross-Validation (OCV)
Generalized Approximate Cross-Validation (GACV)
Approximate Cross-Validation (ACV)
Penalized Quasi-Likelihood (PQL)
Akaike's Information Criterion (AIC)
Bayesian Information Criterion (BIC)
The following codes specify the spline types:
par | Parametric effect (factor, integer, or numeric). |
nom | Nominal smoothing spline (unordered factor). |
ord | Ordinal smoothing spline (ordered factor). |
lin | Linear smoothing spline (integer or numeric). |
cub | Cubic smoothing spline (integer or numeric). |
qui | Quintic smoothing spline (integer or numeric). |
per | Periodic smoothing spline (integer or numeric). |
sph | Spherical spline (matrix with d = 2 columns: lat, long). |
tps | Thin plate spline (matrix with d ≥ 1 columns). |
For finer control of some specialized spline types:
per.lin | Linear periodic spline (m = 1). |
per.cub | Cubic periodic spline (m = 2). |
per.qui | Quintic periodic spline (m = 3). |
sph.2 | 2nd order spherical spline (m = 2). |
sph.3 | 3rd order spherical spline (m = 3). |
sph.4 | 4th order spherical spline (m = 4). |
tps.lin | Linear thin plate spline (m = 1). |
tps.cub | Cubic thin plate spline (m = 2). |
tps.qui | Quintic thin plate spline (m = 3). |
For details on the spline kernel functions, see basis.nom
(nominal), basis.ord
(ordinal), basis.poly
(polynomial), basis.sph
(spherical), and basis.tps
(thin plate).
If tprk = TRUE
, the four options for the knots
input include:
1. | a scalar giving the total number of knots to sample |
2. | a vector of integers indexing which rows of data are the knots |
3. | a list with named elements giving the marginal knot values for each predictor (to be combined via expand.grid ) |
4. | a list with named elements giving the knot values for each predictor (requires the same number of knots for each predictor) |
If tprk = FALSE
, the three options for the knots
input include:
1. | a scalar giving the common number of knots for each continuous predictor |
2. | a list with named elements giving the number of marginal knots for each predictor |
3. | a list with named elements giving the marginal knot values for each predictor |
Suppose formula = y ~ x1 + x2
so that the model contains additive effects of two predictor variables.
The k-th predictor's marginal effect can be denoted as
f_k = X_k β_k + Z_k α_k
where X_k is the n by m_k null space basis function matrix, and Z_k is the n by r_k contrast space basis function matrix.
If tprk = TRUE
, the null space basis function matrix has the form X = [1, X_1, X_2] and the contrast space basis function matrix has the form
Z = θ_1 Z_1 + θ_2 Z_2
where the θ_k are the "extra" smoothing parameters. Note that Z is of dimension n by r = r_1 = r_2.
If tprk = FALSE
, the null space basis function matrix has the form X = [1, X_1, X_2], and the contrast space basis function matrix has the form
Z = [θ_1 Z_1, θ_2 Z_2]
where the θ_k are the "extra" smoothing parameters. Note that Z is of dimension n by r = r_1 + r_2.
When multiple smooth terms are included in the model, there are smoothing (hyper)parameters that weight the contribution of each combination of smooth terms. These hyperparameters are distinct from the overall smoothing parameter lambda
that weights the contribution of the penalty.
skip.iter = TRUE
(default) estimates the smoothing hyperparameters using Algorithm 3.2 of Gu and Wahba (1991), which typically provides adequate results when the model form is correctly specified. The lambda
parameter is tuned via the specified smoothing parameter selection method
.
skip.iter = FALSE
uses Algorithm 3.2 as an initialization, and then the nlm
function is used to tune the hyperparameters via the specified smoothing parameter selection method
. Setting skip.iter = FALSE
can (substantially) increase the model fitting time, but should produce better results—particularly if the model formula
is misspecified.
Nathaniel E. Helwig <helwig@umn.edu>
Berry, L. N., & Helwig, N. E. (2021). Cross-validation, information theory, or maximum likelihood? A comparison of tuning methods for penalized splines. Stats, 4(3), 701-724. doi: 10.3390/stats4030042
Craven, P. and Wahba, G. (1979). Smoothing noisy data with spline functions: Estimating the correct degree of smoothing by the method of generalized cross-validation. Numerische Mathematik, 31, 377-403. doi: 10.1007/BF01404567
Gu, C. (2013). Smoothing spline ANOVA models, 2nd edition. New York: Springer. doi: 10.1007/978-1-4614-5369-7
Gu, C. and Wahba, G. (1991). Minimizing GCV/GML scores with multiple smoothing parameters via the Newton method. SIAM Journal on Scientific and Statistical Computing, 12(2), 383-398. doi: 10.1137/0912021
Helwig, N. E. (2020). Multiple and Generalized Nonparametric Regression. In P. Atkinson, S. Delamont, A. Cernat, J. W. Sakshaug, & R. A. Williams (Eds.), SAGE Research Methods Foundations. doi: 10.4135/9781526421036885885
Helwig, N. E. (2021). Spectrally sparse nonparametric regression via elastic net regularized smoothers. Journal of Computational and Graphical Statistics, 30(1), 182-191. doi: 10.1080/10618600.2020.1806855
Related Modeling Functions:
ss
for fitting a smoothing spline with a single predictor (Gaussian response).
sm
for fitting smooth models with multiple predictors of mixed types (Gaussian response).
S3 Methods and Related Functions for "gsm" Objects:
boot.gsm
for bootstrapping gsm
objects.
coef.gsm
for extracting coefficients from gsm
objects.
cooks.distance.gsm
for calculating Cook's distances from gsm
objects.
cov.ratio
for computing covariance ratio from gsm
objects.
deviance.gsm
for extracting deviance from gsm
objects.
dfbeta.gsm
for calculating DFBETA from gsm
objects.
dfbetas.gsm
for calculating DFBETAS from gsm
objects.
diagnostic.plots
for plotting regression diagnostics from gsm
objects.
family.gsm
for extracting family
from gsm
objects.
fitted.gsm
for extracting fitted values from gsm
objects.
hatvalues.gsm
for extracting leverages from gsm
objects.
model.matrix.gsm
for constructing model matrix from gsm
objects.
predict.gsm
for predicting from gsm
objects.
residuals.gsm
for extracting residuals from gsm
objects.
rstandard.gsm
for computing standardized residuals from gsm
objects.
rstudent.gsm
for computing studentized residuals from gsm
objects.
smooth.influence
for calculating basic influence information from gsm
objects.
smooth.influence.measures
for convenient display of influential observations from gsm
objects.
summary.gsm
for summarizing gsm
objects.
vcov.gsm
for extracting coefficient covariance matrix from gsm
objects.
weights.gsm
for extracting prior weights from gsm
objects.
########## EXAMPLE 1 ########## ### 1 continuous predictor # generate data n <- 1000 x <- seq(0, 1, length.out = n) fx <- 3 * x + sin(2 * pi * x) - 1.5 # gaussian (default) set.seed(1) y <- fx + rnorm(n, sd = 1/sqrt(2)) mod <- gsm(y ~ x, knots = 10) mean((mod$linear.predictors - fx)^2) # compare to result from sm (they are identical) mod.sm <- sm(y ~ x, knots = 10) mean((mod$linear.predictors - mod.sm$fitted.values)^2) # binomial (no weights) set.seed(1) y <- rbinom(n = n, size = 1, p = 1 / (1 + exp(-fx))) mod <- gsm(y ~ x, family = binomial, knots = 10) mean((mod$linear.predictors - fx)^2) # binomial (w/ weights) set.seed(1) w <- as.integer(rep(c(10,20,30,40,50), length.out = n)) y <- rbinom(n = n, size = w, p = 1 / (1 + exp(-fx))) / w mod <- gsm(y ~ x, family = binomial, weights = w, knots = 10) mean((mod$linear.predictors - fx)^2) # poisson set.seed(1) y <- rpois(n = n, lambda = exp(fx)) mod <- gsm(y ~ x, family = poisson, knots = 10) mean((mod$linear.predictors - fx)^2) # negative binomial (known theta) set.seed(1) y <- rnbinom(n = n, size = 1/2, mu = exp(fx)) mod <- gsm(y ~ x, family = NegBin(theta = 1/2), knots = 10) mean((mod$linear.predictors - fx)^2) mod$family$theta # fixed theta # negative binomial (unknown theta) set.seed(1) y <- rnbinom(n = n, size = 1/2, mu = exp(fx)) mod <- gsm(y ~ x, family = NegBin, knots = 10) mean((mod$linear.predictors - fx)^2) mod$family$theta # estimated theta # gamma set.seed(1) y <- rgamma(n = n, shape = 2, scale = (1 / (2 + fx)) / 2) mod <- gsm(y ~ x, family = Gamma, knots = 10) mean((mod$linear.predictors - fx - 2)^2) # inverse.gaussian (not run; requires statmod) ##set.seed(1) ##y <- statmod::rinvgauss(n = n, mean = sqrt(1 / (2 + fx)), shape = 2) ##mod <- gsm(y ~ x, family = inverse.gaussian, knots = 10) ##mean((mod$linear.predictors - fx - 2)^2) ########## EXAMPLE 2 ########## ### 1 continuous and 1 nominal predictor ### additive model # generate data n <- 1000 x <- seq(0, 1, length.out = n) z <- factor(sample(letters[1:3], size = n, replace = TRUE)) fun <- function(x, z){ mu <- c(-2, 0, 2) zi <- as.integer(z) fx <- mu[zi] + 3 * x + sin(2 * pi * x) - 1.5 } fx <- fun(x, z) # define marginal knots probs <- seq(0, 0.9, by = 0.1) knots <- list(x = quantile(x, probs = probs), z = letters[1:3]) # gaussian (default) set.seed(1) y <- fx + rnorm(n, sd = 1/sqrt(2)) mod <- gsm(y ~ x + z, knots = knots) mean((mod$linear.predictors - fx)^2) # compare to result from sm (they are identical) mod.sm <- sm(y ~ x + z, knots = knots) mean((mod$linear.predictors - mod.sm$fitted.values)^2) # binomial (no weights) set.seed(1) y <- rbinom(n = n, size = 1, p = 1 / (1 + exp(-fx))) mod <- gsm(y ~ x + z, family = binomial, knots = knots) mean((mod$linear.predictors - fx)^2) # binomial (w/ weights) set.seed(1) w <- as.integer(rep(c(10,20,30,40,50), length.out = n)) y <- rbinom(n = n, size = w, p = 1 / (1 + exp(-fx))) / w mod <- gsm(y ~ x + z, family = binomial, weights = w, knots = knots) mean((mod$linear.predictors - fx)^2) # poisson set.seed(1) y <- rpois(n = n, lambda = exp(fx)) mod <- gsm(y ~ x + z, family = poisson, knots = knots) mean((mod$linear.predictors - fx)^2) # negative binomial (known theta) set.seed(1) y <- rnbinom(n = n, size = 1/2, mu = exp(fx)) mod <- gsm(y ~ x + z, family = NegBin(theta = 1/2), knots = knots) mean((mod$linear.predictors - fx)^2) mod$family$theta # fixed theta # negative binomial (unknown theta) set.seed(1) y <- rnbinom(n = n, size = 1/2, mu = exp(fx)) mod <- gsm(y ~ x + z, family = NegBin, knots = knots) mean((mod$linear.predictors - fx)^2) mod$family$theta # estimated theta # gamma set.seed(1) y <- rgamma(n = n, shape = 2, scale = (1 / (4 + fx)) / 2) mod <- gsm(y ~ x + z, family = Gamma, knots = knots) mean((mod$linear.predictors - fx - 4)^2) # inverse.gaussian (not run; requires statmod) ##set.seed(1) ##y <- statmod::rinvgauss(n = n, mean = sqrt(1 / (4 + fx)), shape = 2) ##mod <- gsm(y ~ x + z, family = inverse.gaussian, knots = knots) ##mean((mod$linear.predictors - fx - 4)^2)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.