Penalized B-splines in GAMs

Share:

Description

gam can use smoothing splines based on univariate B-spline bases with derivative based penalties, specified via terms like s(x,bs="bs",m=c(3,2)). m[1] controls the spline order, with m[1]=3 being a cubic spline, m[1]=2 being quadratic, and so on. The integrated square of the m[2]th derivative is used as the penalty. So m=c(3,2) is a convetional cubic spline. If m is supplied as a single number, then it is taken to be m[1] and m[2]=m[1]-1, which is only a conventional smoothing spline in the m=3, cubic spline case. Notice that the definition of the spline order in terms of m[1] is intuitive, but differs to that used with the tprs and p.spline bases.

Usage

1
2
3
4
## S3 method for class 'bs.smooth.spec'
smooth.construct(object, data, knots)
## S3 method for class 'Bspline.smooth'
Predict.matrix(object, data)

Arguments

object

a smooth specification object, usually generated by a term s(x,bs="bs",...)

data

a list containing just the data (including any by variable) required by this term, with names corresponding to object$term (and object$by). The by variable is the last element.

knots

a list containing any knots supplied for basis setup — in same order and with same names as data. Can be NULL. See details for further information.

Details

The basis and penalty are sparse (although sparse matrices are not used to represent them). m[2]>m[1] will generate an error, since in that case the penalty would be based on an undefined derivative of the basis, which makes no sense.

The default basis dimension, k, is the larger of 10 and m. m[1] is the lower limit on basis dimension. If knots are supplied, then the number of supplied knots should be k + m[1] + 1, and the range of the middle k-m[1]+1 knots should include all the covariate values. Alternatively, 2 knots can be supplied, denoting the lower and upper limits between which the spline can be evaluated (Don't make this range too wide, however, or you can end up with no information about some basis coefficients, because the corresponding basis functions have a span that includes no data!). Unlike P-splines, splines with derivative based penalties can have uneven knot spacing, without a problem.

Linear extrapolation is used for prediction that requires extrapolation (i.e. prediction outside the range of the interior k-m[1]+1 knots). Such extrapolation is not allowed in basis construction, but is when predicting.

It is possible to set a deriv flag in a smooth specification or smooth object, so that a model or prediction matrix produces the requested derivative of the spline, rather than evaluating it.

Value

An object of class "Bspline.smooth". See smooth.construct, for the elements that this object will contain.

WARNING

m directly controls the spline order here, which is intuitively sensible, but different to other bases.

Author(s)

Simon N. Wood simon.wood@r-project.org

References

Wood, S.N. (2016) P-splines with derivative based penalties and tensor product smoothing of unevenly distributed data. Statistics and Computing. http://arxiv.org/abs/1605.02446

See Also

p.spline

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
  require(mgcv)
  set.seed(5)
  dat <- gamSim(1,n=400,dist="normal",scale=2)
  bs <- "bs"
  b <- gam(y~s(x0,bs=bs,m=c(4,2))+s(x1,bs=bs)+s(x2,k=15,bs=bs,m=c(4,3))+
           s(x3,bs=bs,m=c(1,0)),data=dat,method="REML")
  plot(b,pages=1)

  ## construct smooth of x. Model matrix sm$X and penalty 
  ## matrix sm$S[[1]] will have many zero entries...
  x <- seq(0,1,length=100)
  sm <- smoothCon(s(x,bs="bs"),data.frame(x))[[1]]

  ## another example checking penalty numerically...
  m <- c(4,2); k <- 15; b <- runif(k)
  sm <- smoothCon(s(x,bs="bs",m=m,k=k),data.frame(x),scale.penalty=FALSE)[[1]]
  sm$deriv <- m[2]
  h0 <- 1e-3; xk <- sm$knots[(m[1]+1):(k+1)]
  Xp <- PredictMat(sm,data.frame(x=seq(xk[1]+h0/2,max(xk)-h0/2,h0)))
  sum((Xp%*%b)^2*h0) ## numerical approximation to penalty
  b%*%sm$S[[1]]%*%b  ## `exact' version

  ## ...repeated with uneven knot spacing...
  m <- c(4,2); k <- 15; b <- runif(k)
  ## produce the required 20 unevenly spaced knots...
  knots <- data.frame(x=c(-.4,-.3,-.2,-.1,-.001,.05,.15,
        .21,.3,.32,.4,.6,.65,.75,.9,1.001,1.1,1.2,1.3,1.4))
  sm <- smoothCon(s(x,bs="bs",m=m,k=k),data.frame(x),
        knots=knots,scale.penalty=FALSE)[[1]]
  sm$deriv <- m[2]
  h0 <- 1e-3; xk <- sm$knots[(m[1]+1):(k+1)]
  Xp <- PredictMat(sm,data.frame(x=seq(xk[1]+h0/2,max(xk)-h0/2,h0)))
  sum((Xp%*%b)^2*h0) ## numerical approximation to penalty
  b%*%sm$S[[1]]%*%b  ## `exact' version

Want to suggest features or report bugs for rdrr.io? Use the GitHub issue tracker.