Penalized B-splines in GAMs
gam can use smoothing splines based on univariate B-spline bases
with derivative based penalties, specified via terms like
m controls the spline order, with
m=3 being a cubic spline,
m=2 being quadratic, and so on. The integrated square of the
mth derivative is used as the penalty. So
m=c(3,2) is a convetional cubic spline. If
m is supplied as a single number, then it is taken to be
m=m-1, which is only a conventional smoothing spline in the
m=3, cubic spline case. Notice that the definition of the spline order in terms of
m is intuitive, but differs to that used with the
1 2 3 4
a smooth specification object, usually generated by a term
a list containing just the data (including any
a list containing any knots supplied for basis setup — in same order and with same names as
The basis and penalty are sparse (although sparse matrices are not used to represent them).
m>m will generate an error, since in that case the penalty would be based on an undefined derivative of the basis, which makes no sense.
The default basis dimension,
k, is the larger of 10 and
m is the lower limit on basis dimension. If knots are supplied, then the number of supplied knots should be
k + m + 1, and the range of the middle
k-m+1 knots should include all the covariate values. Alternatively, 2 knots can be supplied, denoting the
lower and upper limits between which the spline can be evaluated (Don't make this range too wide, however, or you
can end up with no information about some basis coefficients, because the corresponding basis functions have a
span that includes no data!). Unlike P-splines, splines with derivative based penalties can have uneven knot spacing, without a problem.
Linear extrapolation is used for prediction that requires extrapolation
(i.e. prediction outside the range of the interior
k-m+1 knots). Such extrapolation is not
allowed in basis construction, but is when predicting.
It is possible to set a
deriv flag in a smooth specification or smooth object, so that a model or prediction matrix produces the requested derivative of the spline, rather than evaluating it.
An object of class
for the elements that this object will contain.
m directly controls the spline order here, which is intuitively sensible, but different to other bases.
Simon N. Wood email@example.com
Wood, S.N. (2016) P-splines with derivative based penalties and tensor product smoothing of unevenly distributed data. Statistics and Computing. http://arxiv.org/abs/1605.02446
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34
require(mgcv) set.seed(5) dat <- gamSim(1,n=400,dist="normal",scale=2) bs <- "bs" b <- gam(y~s(x0,bs=bs,m=c(4,2))+s(x1,bs=bs)+s(x2,k=15,bs=bs,m=c(4,3))+ s(x3,bs=bs,m=c(1,0)),data=dat,method="REML") plot(b,pages=1) ## construct smooth of x. Model matrix sm$X and penalty ## matrix sm$S[] will have many zero entries... x <- seq(0,1,length=100) sm <- smoothCon(s(x,bs="bs"),data.frame(x))[] ## another example checking penalty numerically... m <- c(4,2); k <- 15; b <- runif(k) sm <- smoothCon(s(x,bs="bs",m=m,k=k),data.frame(x),scale.penalty=FALSE)[] sm$deriv <- m h0 <- 1e-3; xk <- sm$knots[(m+1):(k+1)] Xp <- PredictMat(sm,data.frame(x=seq(xk+h0/2,max(xk)-h0/2,h0))) sum((Xp%*%b)^2*h0) ## numerical approximation to penalty b%*%sm$S[]%*%b ## `exact' version ## ...repeated with uneven knot spacing... m <- c(4,2); k <- 15; b <- runif(k) ## produce the required 20 unevenly spaced knots... knots <- data.frame(x=c(-.4,-.3,-.2,-.1,-.001,.05,.15, .21,.3,.32,.4,.6,.65,.75,.9,1.001,1.1,1.2,1.3,1.4)) sm <- smoothCon(s(x,bs="bs",m=m,k=k),data.frame(x), knots=knots,scale.penalty=FALSE)[] sm$deriv <- m h0 <- 1e-3; xk <- sm$knots[(m+1):(k+1)] Xp <- PredictMat(sm,data.frame(x=seq(xk+h0/2,max(xk)-h0/2,h0))) sum((Xp%*%b)^2*h0) ## numerical approximation to penalty b%*%sm$S[]%*%b ## `exact' version