Fits a semi or nonparametric regression model with the smoothing parameter selected via one of eight methods: GCV, OCV, GACV, ACV, REML, ML, AIC, or BIC.
formula 
Object of class "formula" (or one that can be coerced to that class): a symbolic description of the model to be fitted. Uses the same syntax as lm and glm. 
data 
Optional data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables in the model. If not found in data, the variables are taken from environment(formula), typically the environment from which sm is called. 
weights 
Optional vector of weights to be used in the fitting process. If provided, weighted least squares is used. Defaults to all 1s. 
types 
Named list giving the type of smooth to use for each predictor. If NULL, the types are inferred from the data (see the Types of Smooths section). 
tprk 
Logical specifying how to parameterize smooth models with multiple predictors. If TRUE (default), a tensor product reproducing kernel parameterization is used, with a single set of knots shared across predictors; if FALSE, each predictor's smooth is given its own marginal knots and penalty (see the Multiple Smooths section). 
knots 
Spline knots for the estimation of the nonparametric effects. For models with multiple predictors, the knot specification will depend on the tprk input (see the Choosing Knots section). 
update 
If 
df 
Equivalent degrees of freedom (trace of the smoother matrix). Must be in [m,n] where m is the number of columns of the null space basis function matrix X, and n is the number of observations. 
spar 
Smoothing parameter. Typically (but not always) in the range (0,1]. If specified, lambda = 256^(3*(spar-1)). 
lambda 
Computational smoothing parameter. This value is weighted by n to form the penalty coefficient (see Details). Ignored if spar is provided. 
control 
Optional list with named components controlling the root finding when the smoothing parameter spar is computed (i.e., when spar is missing or NULL); see below. Note that spar is only searched for in the interval [lower, upper].

method 
Method for selecting the smoothing parameter. Ignored if spar or lambda is provided. 
Letting f_i = f(x_i) with x_i = (x_{i1}, …, x_{ip}), the function is represented as
f = X β + Z γ
where the basis functions in X span the null space (i.e., parametric effects), and Z contains the kernel function(s) of the contrast space (i.e., nonparametric effects) evaluated at all combinations of observed data points and knots. The vectors β and γ contain unknown basis function coefficients.
Letting M = (X, Z) and θ = (β', γ')', the penalized least squares problem has the form
(y − M θ)' W (y − M θ) + n λ γ' Q γ
where W is a diagonal matrix containing the weights, and Q is the penalty matrix. The optimal coefficients are the solution to
(M' W M + n λ P) θ = M' W y
where P is the penalty matrix Q augmented with zeros corresponding to the β in θ.
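The normal equations above can be solved directly for a toy basis. The sketch below is purely illustrative: the truncated power basis and identity penalty Q are stand-ins for the reproducing kernels that npreg actually uses.

```r
# Minimal sketch of solving (M'WM + n*lambda*P) theta = M'Wy.
# NOTE: the truncated power basis and identity penalty are illustrative
# stand-ins, not the kernel functions npreg actually uses.
set.seed(1)
n <- 50
x <- seq(0, 1, length.out = n)
y <- sin(2 * pi * x) + rnorm(n, sd = 0.3)
knots <- seq(0.1, 0.9, length.out = 8)
X <- cbind(1, x)                                        # null space: intercept + linear
Z <- outer(x, knots, function(a, b) pmax(a - b, 0)^3)   # contrast space basis
M <- cbind(X, Z)
W <- diag(n)                                            # unit weights
Q <- diag(length(knots))                                # stand-in penalty matrix
P <- matrix(0, ncol(M), ncol(M))                        # Q padded with zeros for beta
P[-(1:2), -(1:2)] <- Q
lambda <- 1e-4
theta <- solve(t(M) %*% W %*% M + n * lambda * P, t(M) %*% W %*% y)
fhat <- M %*% theta                                     # fitted values
```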
An object of class "sm" with components:
fitted.values 
the fitted values, i.e., predictions. 
se.fit 
the standard errors of the fitted values. 
sse 
the sum-of-squared errors. 
cv.crit 
the cross-validation criterion. 
df 
the estimated degrees of freedom (Df) for the fit model. 
nsdf 
the degrees of freedom (Df) for the null space. 
r.squared 
the observed coefficient of multiple determination. 
sigma 
the estimate of the error standard deviation. 
logLik 
the log-likelihood (if applicable). 
aic 
Akaike's Information Criterion (if applicable). 
bic 
Bayesian Information Criterion (if applicable). 
spar 
the value of spar computed or given. 
lambda 
the value of λ corresponding to spar. 
penalty 
the smoothness penalty γ' Q γ. 
coefficients 
the basis function coefficients used for the fit model. 
cov.sqrt 
the square-root of the covariance matrix of the coefficients. 
specs 
a list with information used for prediction purposes:

data 
the data used to fit the model. 
types 
the type of smooth used for each predictor. 
terms 
the terms included in the fit model. 
method 
the method used for smoothing parameter selection. 
formula 
the formula specifying the fit model. 
call 
the matched call. 
The smoothing parameter can be selected using one of eight methods:
Generalized Cross-Validation (GCV)
Ordinary Cross-Validation (OCV)
Generalized Approximate Cross-Validation (GACV)
Approximate Cross-Validation (ACV)
Restricted Maximum Likelihood (REML)
Maximum Likelihood (ML)
Akaike's Information Criterion (AIC)
Bayesian Information Criterion (BIC)
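Assuming the npreg package is installed, the same model can be refit under several of these criteria and the results compared; the spar and df components referenced below are those listed in the Value section.

```r
# Sketch: compare smoothing parameter selection methods (requires npreg).
library(npreg)
set.seed(1)
n <- 100
x <- seq(0, 1, length.out = n)
y <- 2 + 3 * x + sin(2 * pi * x) + rnorm(n, sd = 0.5)
# refit under four of the eight selection methods
methods <- c("GCV", "REML", "AIC", "BIC")
fits <- lapply(methods, function(m) sm(y ~ x, method = m))
# compare the selected smoothing parameters and effective df
sapply(fits, function(f) c(spar = f$spar, df = f$df))
```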
The following codes specify the spline types:
par: Parametric effect (factor, integer, or numeric). 
nom: Nominal smoothing spline (unordered factor). 
ord: Ordinal smoothing spline (ordered factor). 
lin: Linear smoothing spline (integer or numeric). 
cub: Cubic smoothing spline (integer or numeric). 
qui: Quintic smoothing spline (integer or numeric). 
per: Periodic smoothing spline (integer or numeric). 
sph: Spherical spline (matrix with d = 3 columns). 
tps: Thin plate spline (matrix with d ≥ 1 columns). 
For finer control of some specialized spline types:
per.lin: Linear periodic spline (m = 1). 
per.cub: Cubic periodic spline (m = 2). 
per.qui: Quintic periodic spline (m = 3). 
sph.lin: Linear spherical spline (m = 1). 
sph.cub: Cubic spherical spline (m = 2). 
sph.qui: Quintic spherical spline (m = 3). 
tps.lin: Linear thin plate spline (m = 1). 
tps.cub: Cubic thin plate spline (m = 2). 
tps.qui: Quintic thin plate spline (m = 3). 
For details on the spline kernel functions, see basis.nom (nominal), basis.ord (ordinal), basis.poly (polynomial), basis.sph (spherical), and basis.tps (thin plate).
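As a sketch (assuming npreg is installed), the type codes above can be supplied via the types argument; the types actually used are stored in the specs component of the fit.

```r
# Sketch: explicitly requesting smooth types via 'types' (requires npreg).
library(npreg)
set.seed(1)
n <- 100
x <- runif(n)
z <- factor(sample(letters[1:3], n, replace = TRUE))
y <- sin(2 * pi * x) + c(-2, 0, 2)[as.integer(z)] + rnorm(n, sd = 0.5)
# request a linear smoothing spline for x and a nominal spline for z
fit <- sm(y ~ x + z, types = list(x = "lin", z = "nom"))
fit$specs$types   # types used for each predictor
```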
If tprk = TRUE, the four options for the knots input include:
1.  a scalar giving the total number of knots to sample 
2.  a vector of integers indexing which rows of data are the knots 
3.  a list with named elements giving the marginal knot values for each predictor (to be combined via expand.grid ) 
4.  a list with named elements giving the knot values for each predictor (requires the same number of knots for each predictor) 
If tprk = FALSE, the three options for the knots input include:
1.  a scalar giving the common number of knots for each continuous predictor 
2.  a list with named elements giving the number of marginal knots for each predictor 
3.  a list with named elements giving the marginal knot values for each predictor 
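A sketch (assuming npreg is installed) of two of the knot specifications listed above: a scalar number of knots with tprk = TRUE, and a named list of marginal knot values with tprk = FALSE.

```r
# Sketch: two ways of specifying knots (requires npreg).
library(npreg)
set.seed(1)
n <- 100
x1 <- runif(n)
x2 <- runif(n)
y <- sin(2 * pi * x1) + cos(2 * pi * x2) + rnorm(n, sd = 0.5)
# tprk = TRUE, option 1: scalar giving the total number of knots to sample
fit1 <- sm(y ~ x1 + x2, knots = 20)
# tprk = FALSE, option 3: named list of marginal knot values
kn <- list(x1 = quantile(x1, probs = seq(0, 1, length.out = 8)),
           x2 = quantile(x2, probs = seq(0, 1, length.out = 8)))
fit2 <- sm(y ~ x1 + x2, knots = kn, tprk = FALSE)
```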
Suppose formula = y ~ x1 + x2, so that the model contains additive effects of two predictor variables.
The kth predictor's marginal effect can be denoted as
f_k = X_k β_k + Z_k γ_k
where X_k is the n by m_k null space basis function matrix, and Z_k is the n by r_k contrast space basis function matrix.
If tprk = TRUE, the null space basis function matrix has the form X = [1, X_1, X_2] and the contrast space basis function matrix has the form
Z = θ_1 Z_1 + θ_2 Z_2
where the θ_k are the "extra" smoothing parameters. Note that Z is of dimension n by r = r_1 = r_2.
If tprk = FALSE, the null space basis function matrix has the form X = [1, X_1, X_2], and the contrast space basis function matrix has the form
Z = [θ_1 Z_1, θ_2 Z_2]
where the θ_k are the "extra" smoothing parameters. Note that Z is of dimension n by r = r_1 + r_2.
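One consequence of the two parameterizations, sketched here assuming npreg is installed: with a common number of knots per predictor, Z has r columns when tprk = TRUE but r_1 + r_2 columns when tprk = FALSE, so the coefficient vectors differ in length.

```r
# Sketch: the two parameterizations yield different numbers of
# basis function coefficients (requires npreg).
library(npreg)
set.seed(1)
n <- 100
x1 <- runif(n)
x2 <- runif(n)
y <- sin(2 * pi * x1) + cos(2 * pi * x2) + rnorm(n, sd = 0.5)
fit.tprk <- sm(y ~ x1 + x2, knots = 10, tprk = TRUE)
fit.gam  <- sm(y ~ x1 + x2, knots = 10, tprk = FALSE)
# compare the number of estimated basis function coefficients
length(fit.tprk$coefficients)
length(fit.gam$coefficients)
```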
Nathaniel E. Helwig <helwig@umn.edu>
Craven, P. and Wahba, G. (1979). Smoothing noisy data with spline functions: Estimating the correct degree of smoothing by the method of generalized cross-validation. Numerische Mathematik, 31, 377-403. doi: 10.1007/BF01404567
Gu, C. (2013). Smoothing Spline ANOVA Models, 2nd edition. New York: Springer. doi: 10.1007/978-1-4614-5369-7
Gu, C. and Wahba, G. (1991). Minimizing GCV/GML scores with multiple smoothing parameters via the Newton method. SIAM Journal on Scientific and Statistical Computing, 12(2), 383-398. doi: 10.1137/0912021
Helwig, N. E. (2020). Multiple and Generalized Nonparametric Regression. In P. Atkinson, S. Delamont, A. Cernat, J. W. Sakshaug, & R. A. Williams (Eds.), SAGE Research Methods Foundations. doi: 10.4135/9781526421036885885
Helwig, N. E. (2020+). Spectrally sparse nonparametric regression via elastic net regularized smoothers. Journal of Computational and Graphical Statistics. doi: 10.1080/10618600.2020.1806855
summary.sm for summarizing sm objects.
predict.sm for predicting from sm objects.
ss for fitting a smoothing spline with a single predictor (Gaussian response).
gsm for fitting generalized smooth models with multiple predictors of mixed types (non-Gaussian response).
########## EXAMPLE 1 ##########
### 1 continuous predictor
# generate data
set.seed(1)
n <- 100
x <- seq(0, 1, length.out = n)
fx <- 2 + 3 * x + sin(2 * pi * x)
y <- fx + rnorm(n, sd = 0.5)
# fit sm with 10 knots (tprk = TRUE)
sm.ssa <- sm(y ~ x, knots = 10)
# fit sm with 10 knots (tprk = FALSE)
sm.gam <- sm(y ~ x, knots = 10, tprk = FALSE)
# print both results (note: they are identical)
sm.ssa
sm.gam
# summarize both results (note: they are identical)
summary(sm.ssa)
summary(sm.gam)
# compare true MSE values (note: they are identical)
mean( ( fx - sm.ssa$fit )^2 )
mean( ( fx - sm.gam$fit )^2 )
########## EXAMPLE 2 ##########
### 1 continuous and 1 nominal predictor
### additive model
# generate data
set.seed(1)
n <- 100
x <- seq(0, 1, length.out = n)
z <- factor(sample(letters[1:3], size = n, replace = TRUE))
fun <- function(x, z){
  mu <- c(-2, 0, 2)
  zi <- as.integer(z)
  fx <- mu[zi] + 3 * x + sin(2 * pi * x)
}
fx <- fun(x, z)
y <- fx + rnorm(n, sd = 0.5)
# define marginal knots
probs <- seq(0, 0.9, by = 0.1)
knots <- list(x = quantile(x, probs = probs),
              z = letters[1:3])
# fit sm with specified knots (tprk = TRUE)
sm.ssa <- sm(y ~ x + z, knots = knots)
# fit sm with specified knots (tprk = FALSE)
sm.gam <- sm(y ~ x + z, knots = knots, tprk = FALSE)
# print both results (note: they are identical)
sm.ssa
sm.gam
# summarize both results (note: they are almost identical)
summary(sm.ssa)
summary(sm.gam)
# compare true MSE values (note: they are identical)
mean( ( fx - sm.ssa$fit )^2 )
mean( ( fx - sm.gam$fit )^2 )
########## EXAMPLE 3 ##########
### 1 continuous and 1 nominal predictor
### interaction model
# generate data
set.seed(1)
n <- 100
x <- seq(0, 1, length.out = n)
z <- factor(sample(letters[1:3], size = n, replace = TRUE))
fun <- function(x, z){
  mu <- c(-2, 0, 2)
  zi <- as.integer(z)
  fx <- mu[zi] + 3 * x + sin(2 * pi * x + mu[zi]*pi/4)
}
fx <- fun(x, z)
y <- fx + rnorm(n, sd = 0.5)
# define marginal knots
probs <- seq(0, 0.9, by = 0.1)
knots <- list(x = quantile(x, probs = probs),
              z = letters[1:3])
# fit sm with specified knots (tprk = TRUE)
sm.ssa <- sm(y ~ x * z, knots = knots)
# fit sm with specified knots (tprk = FALSE)
sm.gam <- sm(y ~ x * z, knots = knots, tprk = FALSE)
# print both results (note: they are slightly different)
sm.ssa
sm.gam
# summarize both results (note: they are slightly different)
summary(sm.ssa)
summary(sm.gam)
# compare true MSE values (note: they are slightly different)
mean( ( fx - sm.ssa$fit )^2 )
mean( ( fx - sm.gam$fit )^2 )
########## EXAMPLE 4 ##########
### 4 continuous predictors
### additive model
# generate data
set.seed(1)
n <- 100
fun <- function(x){
  sin(pi*x[,1]) + sin(2*pi*x[,2]) + sin(3*pi*x[,3]) + sin(4*pi*x[,4])
}
data <- as.data.frame(replicate(4, runif(n)))
colnames(data) <- c("x1v", "x2v", "x3v", "x4v")
fx <- fun(data)
y <- fx + rnorm(n)
# define marginal knots
knots <- list(x1v = quantile(data$x1v, probs = seq(0, 1, length.out = 10)),
              x2v = quantile(data$x2v, probs = seq(0, 1, length.out = 10)),
              x3v = quantile(data$x3v, probs = seq(0, 1, length.out = 10)),
              x4v = quantile(data$x4v, probs = seq(0, 1, length.out = 10)))
# define ssa knot indices
knots.indx <- c(bin.sample(data$x1v, nbin = 10, index.return = TRUE)$ix,
                bin.sample(data$x2v, nbin = 10, index.return = TRUE)$ix,
                bin.sample(data$x3v, nbin = 10, index.return = TRUE)$ix,
                bin.sample(data$x4v, nbin = 10, index.return = TRUE)$ix)
# fit sm with specified knots (tprk = TRUE)
sm.ssa <- sm(y ~ x1v + x2v + x3v + x4v, data = data, knots = knots.indx)
# fit sm with specified knots (tprk = FALSE)
sm.gam <- sm(y ~ x1v + x2v + x3v + x4v, data = data, knots = knots, tprk = FALSE)
# print both results (note: they are slightly different)
sm.ssa
sm.gam
# summarize both results (note: they are slightly different)
summary(sm.ssa)
summary(sm.gam)
# compare true MSE values (note: they are slightly different)
mean( ( fx - sm.ssa$fit )^2 )
mean( ( fx - sm.gam$fit )^2 )
