penalty: Wiggliness penalties for penalized B-splines

penaltyR Documentation

Wiggliness penalties for penalized B-splines

Description

For penalized B-splines (including standard or general P-splines and O-splines), (1) construct matrix \bm{D} in the wiggliness penalty \|\bm{D\beta}\|^2; (2) sample B-spline coefficients from their prior distribution \textrm{N}(\bm{0},\ (\bm{D'D})^-); (3) compute the Moore-Penrose generalized inverse matrix (\bm{D'D})^-.

Usage

SparseD(xt, d, m = NULL, gps = TRUE)

PriorCoef(n, D)

MPinv(D, only.diag = FALSE)

Arguments

xt

full knot sequence for ordinary B-splines (length(xt) >= 2 * d).

d

B-spline order (d \ge 2).

m

penalty order (1 \le m \le d - 1). Can be a vector of multiple values for SparseD.

gps

if TRUE, return \bm{D}_{\textrm{gps}}; if FALSE, return \bm{D}_{\textrm{os}}.

n

number of samples to draw from the prior distribution.

D

matrix \bm{D}_{\textrm{gps}} or \bm{D}_{\textrm{os}}.

only.diag

if TURE, only diagonal elements are computed.

Details

General Difference Penalty for General P-Splines

A general P-spline is characterized by an order-m general difference matrix \bm{D}_{\textrm{gps}}, which can be computed by SparseD(..., gps = TRUE). For interpretation, the differenced coefficients \bm{D}_{\textrm{gps}}\bm{\beta} are in fact f^{(m)}(x)'s B-spline coefficients, so the penalty is their squared L_2 norm.

Derivative Penalty for O-Splines

An O-spline is characterized by \bm{D}_{\textrm{os}} such that \|\bm{D}_{\textrm{os}}\bm{\beta}\|^2 = \int_a^b f^{(m)}(x)^2\textrm{d}x. Since f^{(m)}(x) has B-spline coefficients \bm{D}_{\textrm{gps}}\bm{\beta}, the integral can be shown to be \bm{\beta'}\bm{D}_{\textrm{gps}}'\bm{\bar{S}}\bm{D}_{\textrm{gps}}\bm{\beta}, where \bm{\bar{S}} is the Gram matrix of those B-splines representing f^{(m)}(x). Following the Cholesky factorization \bm{\bar{S}} = \bm{U'U}, the quadratic form becomes \|\bm{U}\bm{D}_{\textrm{gps}}\bm{\beta}\|^2, so that \bm{D}_{\textrm{os}} = \bm{U}\bm{D}_{\textrm{gps}}. This matrix can be computed by SparseD(..., gps = FALSE), with \bm{\bar{S}} and \bm{D}_{\textrm{gps}} also returned in a "sandwich" attribute.

Penalty Matrix

We can express the L_2 penalty \|\bm{D\beta}\|^2 as quadratic form \bm{\beta'S\beta}, where \bm{S} = \bm{D'D} is called a penalty matrix. It is trivial to compute \bm{S} (using function crossprod) once \bm{D} is available, so we don't feel the need to provide a function for this. Note that the link between \bm{D}_{\textrm{os}} and \bm{D}_{\textrm{gps}} implies a sandwich formula \bm{S}_{\textrm{os}} = \bm{D}_{\textrm{gps}}'\bm{\bar{S}}\bm{D}_{\textrm{gps}}, wherease \bm{S}_{\textrm{gps}} = \bm{D}_{\textrm{gps}}'\bm{D}_{\textrm{gps}}.

The Bayesian View

In the Bayesian view, the penalty \bm{\beta'S\beta} is a Gaussian prior for B-spline coefficients \bm{\beta}. But it is an improper one because \bm{S} has a null space where an unpenalized order-m polynomial lies. Let's decompose \bm{\beta} = \bm{\xi} + \bm{\theta}, where \bm{\xi} (the projection of \bm{\beta} on this null space) is the coefficients of this order-m polynomial, and \bm{\theta} (orthogonal to \bm{\xi}) is the component that can be shrunk to zero by the penalty. As a result, \bm{\xi} \propto \bm{1} is not proper, but \bm{\theta} \sim \textrm{N}(\bm{0},\ \bm{S}^-) is. Function PriorCoef samples this distribution, and the resulting B-spline coefficients can be used to create random spline curves. The algorithm behind PriorCoef bypasses the Moore-Penrose generalized inverse and is very efficient. We don't recommend forming this inverse matrix because it, being completely dense, is expensive to compute and store. But if we need it anyway, it can be computed using function MPinv.

Value

SparseD returns a list of sparse matrices (of "dgCMatrix" class), giving \bm{D}_{\textrm{gps}} or \bm{D}_{\textrm{os}} of order m[1], m[2], ..., m[length(m)]. In the latter case, \bm{\bar{S}} (sparse matrices of "dsCMatrix" or "ddiMatrix" class) and \bm{D}_{\textrm{gps}} for computing \bm{D}_{\textrm{os}} are also returned in a "sandwich" attribute.

PriorCoef returns a list of two components:

  • coef gives a vector of B-spline coefficients when n = 1, or a matrix of n columns when n > 1, where each column is an independent sample;

  • sigma is a vector, giving the marginal standard deviation for each B-spline coefficient.

MPinv returns the dense Moore-Penrose generalized inverse matrix \bm{(D'D})^- if only.diag = FALSE, and the diagonal entries of this matrix if only.diag = TRUE.

Author(s)

Zheyuan Li zheyuan.li@bath.edu

References

Zheyuan Li and Jiguo Cao (2022). General P-splines for non-uniform splines, \Sexpr[results=rd]{tools:::Rd_expr_doi("10.48550/arXiv.2201.06808")}

Examples

require(Matrix)
require(gps)

## 11 domain knots at equal quantiles of Beta(3, 3) distribution
xd <- qbeta(seq.int(0, 1, by = 0.1), 3, 3)
## full knots (with clamped boundary knots) for constructing cubic B-splines
xt <- c(0, 0, 0, xd, 1, 1, 1)

## compute D matrices of order 1 to 3 for O-splines
D.os <- SparseD(xt, d = 4, gps = FALSE)
D1.os <- D.os[[1]]; D2.os <- D.os[[2]]; D3.os <- D.os[[3]]

## get D matrices of order 1 to 3 for general P-splines
## we can of course compute them with D.gps <- SparseD(xt, d = 4, gps = TRUE)
## but they are readily stored in the "sandwich" attribute of 'D.os'
D.gps <- attr(D.os, "sandwich")$D
D1.gps <- D.gps[[1]]; D2.gps <- D.gps[[2]]; D3.gps <- D.gps[[3]]

## we can compute the penalty matrix S = D'D
S.gps <- lapply(D.gps, crossprod)
S1.gps <- S.gps[[1]]; S2.gps <- S.gps[[2]]; S3.gps <- S.gps[[3]]
S.os <- lapply(D.os, crossprod)
S1.os <- S.os[[1]]; S2.os <- S.os[[2]]; S3.os <- S.os[[3]]

## if we want to verify the sandwich formula for O-splines
## extract 'Sbar' matrices stored in the "sandwich" attribute
## and compute the relative error between S and t(D) %*% Sbar %*% D
Sbar <- attr(D.os, "sandwich")$Sbar
Sbar1 <- Sbar[[1]]; Sbar2 <- Sbar[[2]]; Sbar3 <- Sbar[[3]]
range(S1.os - t(D1.gps) %*% Sbar1 %*% D1.gps) / max(abs(S1.os))
range(S2.os - t(D2.gps) %*% Sbar2 %*% D2.gps) / max(abs(S2.os))
range(S3.os - t(D3.gps) %*% Sbar3 %*% D3.gps) / max(abs(S3.os))

## sample B-spline coefficients from their prior distribution
b.gps <- PriorCoef(n = 5, D2.gps)$coef
b.os <- PriorCoef(n = 5, D2.os)$coef
op <- par(mfrow = c(1, 2), mar = c(2, 2, 1.5, 0.5), oma = c(0, 0, 1, 0))
## prior B-spline coefficients with a general difference penalty
matplot(b.gps, type = "l", lty = 1, ann = FALSE)
title("general difference penalty")
## prior B-spline coefficients with a derivative penalty
matplot(b.os, type = "l", lty = 1, ann = FALSE)
title("derivative penalty")
title("random B-spline coefficients from their prior", outer = TRUE)
par(op)

## plot the corresponding cubic splines with these B-spline coefficients
x <- MakeGrid(xd, n = 11)
B <- splines::splineDesign(xt, x, ord = 4, sparse = TRUE)
y.gps <- B %*% b.gps
y.os <- B %*% b.os
op <- par(mfrow = c(1, 2), mar = c(2, 2, 1.5, 0.5), oma = c(0, 0, 1, 0))
matplot(x, y.gps, type = "l", lty = 1, ann = FALSE)
title("general difference penalty")
matplot(x, y.os, type = "l", lty = 1, ann = FALSE)
title("derivative penalty")
title("random cubic splines with prior B-spline coefficients", outer = TRUE)
par(op)

gps documentation built on Nov. 2, 2023, 6:08 p.m.

Related to penalty in gps...