# s: Defining Smooths in VGAM Formulas (in VGAM: Vector Generalized Linear and Additive Models)


## Defining Smooths in VGAM Formulas

### Description

`s` is used in the definition of (vector) smooth terms within `vgam` formulas. This corresponds to 1st-generation VGAMs that use backfitting for their estimation. The effective degrees of freedom is prespecified.

### Usage

``````r
s(x, df = 4, spar = 0, ...)
``````

### Arguments

| Argument | Description |
|---|---|
| `x` | covariate (abscissae) to be smoothed. Note that `x` must be a single variable and not a function of a variable. For example, `s(x)` is fine but `s(log(x))` will fail. In this case, let `logx <- log(x)` (in the data frame), say, and then use `s(logx)`. At this stage bivariate smoothers (`x` would be a two-column matrix) are not implemented. |
| `df` | numerical vector of length `r`. Effective degrees of freedom: must lie between 1 (linear fit) and `n` (interpolation). Thus one could say that `df-1` is the effective nonlinear degrees of freedom (ENDF) of the smooth. Recycling of values will be used if `df` is not of length `r`. If `spar` is positive then this argument is ignored. Thus `s()` means that the effective degrees of freedom is prespecified. If it is known that the component function(s) are more wiggly than usual then try increasing the value of this argument. |
| `spar` | numerical vector of length `r`. Positive smoothing parameters (after scaling). Larger values mean more smoothing so that the solution approaches a linear fit for that component function. A zero value means that `df` is used. Recycling of values will be used if `spar` is not of length `r`. |
| `...` | Ignored for now. |
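The restriction on `x` can be illustrated with a short sketch. It assumes the `hunua` data shipped with VGAM and uses `log1p()` rather than `log()` purely as a precaution in case `altitude` contains zeros; the names `hdata`, `logalt`, `fit_ok` and `bad` are made up for this illustration.

```r
library(VGAM)

# Do the transformation in the data frame first, then smooth the new variable.
hdata <- transform(hunua, logalt = log1p(altitude))
fit_ok <- vgam(agaaus ~ s(logalt, df = 2), binomialff, data = hdata)

# Putting a function of a variable inside s() is expected to fail,
# so the error is caught here instead of stopping the script.
bad <- tryCatch(
  vgam(agaaus ~ s(log1p(altitude), df = 2), binomialff, data = hunua),
  error = function(e) e)
inherits(bad, "error")
```

Here `df = 2` corresponds to one effective nonlinear degree of freedom on top of the linear fit.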

### Details

In this help file `M` is the number of additive predictors and `r` is the number of component functions to be estimated (so that `r` is an element from the set {1,2,...,`M`}). Also, if `n` is the number of distinct abscissae, then `s` will fail if `n < 7`.

`s`, which is symbolic and does not perform any smoothing itself, only handles a single covariate. Note that `s` works in `vgam` only. It has no effect in `vglm` (actually, it is similar to the identity function `I` so that `s(x2)` is the same as `x2` in the LM model matrix). It differs from the `s()` of the gam package and the `s()` of the mgcv package; they should not be mixed together. Also, terms involving `s` should be simple additive terms and should not involve interactions, nesting, etc. For example, `myfactor:s(x2)` is not a good idea.
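As a rough check of the statement that `s` has no effect in `vglm`, the following sketch fits the same logistic regression with and without `s()` and compares the estimates. It assumes the `hunua` data from VGAM; `fit_s` and `fit_x` are illustrative names.

```r
library(VGAM)

# In vglm(), s(altitude) should enter the model matrix as plain altitude,
# so both fits are ordinary linear-predictor logistic regressions.
fit_s <- vglm(agaaus ~ s(altitude), binomialff, data = hunua)
fit_x <- vglm(agaaus ~ altitude,    binomialff, data = hunua)

# Coefficient names differ (one contains "s(altitude)"), so compare values only.
all.equal(unname(coef(fit_s)), unname(coef(fit_x)))
```

To obtain an actual smooth, the same formula has to be passed to `vgam` instead.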

### Value

A vector with attributes that are (only) used by `vgam`.
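The returned object can be inspected directly, which makes the "symbolic" nature of `s` visible. This is a minimal sketch; the attribute names `df` and `spar` reflect the implementation as far as is known here and should be treated as an assumption rather than a documented interface.

```r
library(VGAM)

x2 <- seq(0, 1, length.out = 20)
sx <- s(x2, df = 3)

# The values of x2 are passed through unchanged; s() only attaches
# attributes that vgam() reads later when it sets up the smoother.
names(attributes(sx))
attr(sx, "df")    # the requested effective degrees of freedom
attr(sx, "spar")  # 0 by default, meaning that df is used
```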

### Note

The vector cubic smoothing spline which `s()` represents is computationally demanding for large `M`. The cost is approximately `O(n M^3)` where `n` is the number of unique abscissae.

Currently a bug relating to the use of `s()` is that only constraint matrices whose columns are orthogonal are handled correctly. If any `s()` term has a constraint matrix that does not satisfy this condition then a warning is issued. See `is.buggy` for more information.
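A fitted object can be checked for this condition with `is.buggy`. The sketch below reuses the single-response logistic fit on the `hunua` data, where `M = 1` and the constraint matrix is trivially orthogonal, so no problem is expected.

```r
library(VGAM)

fit1 <- vgam(agaaus ~ s(altitude, df = 2), binomialff, data = hunua)

# Logical result: TRUE would indicate that some s() term has a
# constraint matrix whose columns are not orthogonal.
is.buggy(fit1)
```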

A more modern alternative to using `s` with `vgam` is to use `sm.os` or `sm.ps`. These do not require backfitting and allow automatic smoothing parameter selection; models fitted this way are called Generation-2 VGAMs. However, this alternative should only be used when the sample size is reasonably large (`> 500`, say).

Another alternative to using `s` with `vgam` is `bs` and/or `ns` with `vglm`. `vglm` implements half-stepping, which is helpful if convergence is difficult.
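The regression-spline alternative can be sketched as follows, using `ns` from the splines package on the `hunua` data. The choice `df = 2` is arbitrary and `fit_ns` is an illustrative name.

```r
library(VGAM)
library(splines)  # provides bs() and ns()

# A natural cubic spline basis is expanded in the model matrix, so the
# model is fitted as an ordinary VGLM with no backfitting involved.
fit_ns <- vglm(agaaus ~ ns(altitude, df = 2), binomialff, data = hunua)
coef(fit_ns)  # intercept plus one coefficient per basis function
```

Unlike `s`, the flexibility here is set by the number of basis functions rather than by an effective degrees of freedom.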

### Author(s)

Thomas W. Yee

### References

Yee, T. W. and Wild, C. J. (1996). Vector generalized additive models. Journal of the Royal Statistical Society, Series B, Methodological, 58, 481–493.

### See Also

`vgam`, `is.buggy`, `sm.os`, `sm.ps`, `vsmooth.spline`.

### Examples

``````r
library(VGAM)  # provides vgam(), s(), the family functions and the hunua data

# Nonparametric logistic regression
fit1 <- vgam(agaaus ~ s(altitude, df = 2), binomialff, data = hunua)
## Not run: plot(fit1, se = TRUE)

# Bivariate logistic model with artificial data
nn <- 300
bdata <- data.frame(x1 = runif(nn), x2 = runif(nn))
bdata <- transform(bdata,
  y1 = rbinom(nn, size = 1, prob = logitlink(sin(2 * x2), inverse = TRUE)),
  y2 = rbinom(nn, size = 1, prob = logitlink(sin(2 * x2), inverse = TRUE)))
# x1 enters linearly; x2 is smoothed with 3 effective degrees of freedom
fit2 <- vgam(cbind(y1, y2) ~ x1 + s(x2, df = 3), trace = TRUE,
             binom2.or(exchangeable = TRUE), data = bdata)
coef(fit2, matrix = TRUE)  # Hard to interpret
## Not run: plot(fit2, se = TRUE, which.term = 2, scol = "blue")
``````

VGAM documentation built on Sept. 19, 2023, 9:06 a.m.