In gmonette/spida2: Collection of tools developed for the Summer Programme in Data Analysis 2000-2012

library(spida2)
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

# library(carEx)
library(spida2)
library(magrittr)
library(latticeExtra)

sp <- gspline(0)
Dmat <- environment(sp)$Dmat
Emat <- environment(sp)$Emat
basis <- environment(sp)$basis
Xmat <- environment(sp)$Xmat
Cmat <- environment(sp)$Cmat
Pcon <- environment(sp)$Pcon
Xf <- environment(sp)$Xf

Introduction

The parametric polynomial splines implemented in the 'carEx' package are piecewise polynomial functions on $k+1$ intervals formed by $k$ knots partitioning the real line: $$(-\infty, t_1], (t_1, t_2], ..., (t_{i-1}, t_i], ..., (t_k, \infty)$$ with degree $d_i$ on the $i$th interval $(t_{i-1}, t_i]$, $i = 1,...,k+1$, and order of continuity $c_i$ at the $i$th knot, $i = 1,...,k$.

The order of continuity refers to the highest order for which the derivatives of the polynomial on the interval to the left and to the right of a knot, $t_i$, have the same limits at $t_i$. For all orders above $c_i$, derivatives, if any, are not constrained to have the same limit.

Such a spline is parametrized by three vectors: a vector of knots, $t_1 < t_2 < ... < t_k$, of length $k > 0$, a vector of polynomial degrees, $d_1, d_2, ... , d_{k+1}$, of length $k+1$, and a vector of orders of continuity or 'smoothness', $c_1, c_2, ... , c_{k}$, of length $k$.

Theory

We first describe the general principles that underly the implemention of splines in this package.

Let $X_f$ be a $n \times q$ matrix for a model whose coefficients are subject to $c$ linearly independent constraints given by a $c \times q$ matrix $C$. That is, the linear space for the model is: $$\mathcal{M} = { \eta = X_f \phi : \phi \in \mathbb{R}^q, C \phi = 0 }$$ We wish to construct a $n \times p$ design matrix $X$ with $p=q-c$ so that $$\mathcal{M} = { \eta = X \beta : \beta \in \mathbb{R}^p}$$ Suppose further that we want the parameters $\beta$ to provide $p$ specified linearly independent functions of $\phi$ represented by the rows of the $p \times q$ matrix $E$ whose rows are linearly independent of the rows of $C$ to ensure that they are not equal to 0 on $\mathcal{M}$.

Then the $q \times q$ partitioned matrix $\left[\begin{array}{cc} C \ E \end{array} \right]$ has linearly independent rows and is invertible with a conformably partitioned inverse: $$\left[\begin{array}{cc} F & G \end{array} \right] = \left[\begin{array}{cc} C \ E \end{array} \right]^{-1}$$ Thus $FC + GE = I$, $CF = I$, etc.

Consider the model matrix $X = X_f G$. We show that $\mathcal{M} = {X\beta : \beta \in \mathbb{R}^p}$ and that for any $\phi \in \mathbb{R}^q$, such that $C\phi =0$, $\beta = E \phi$.

Suppose $C \phi = 0$. Then

$$\phi = \left[\begin{array}{cc} F & G \end{array} \right] \left[\begin{array}{cc} C \ E \end{array} \right] \phi = \left[\begin{array}{cc} F & G \end{array} \right] \left[\begin{array}{cc} 0 \ E \phi \end{array} \right] = GE \phi$$

Thus, with $\beta = E \phi$, we have

$$ X_f \phi = X_f GE \phi = X \beta$$

If $X$ is of full rank, this defines a 1-1 correspondence between $\beta \in \mathbb{R}^p$ and ${\phi \in \mathbb{R}^q : C \phi = 0}$ given by $\beta = E \phi$ and $\phi = G\beta$.

We can obtain the least-squares estimator $\hat{\beta} = (X'X)^{-1}X'Y$. We can then estimate any linear function $\psi = L\phi$ of $\phi$ under the constraint $C \phi = 0$ with the estimator $\hat{\psi} = A\hat{\beta}$ with $$A = LG$$ Thus, the matrix $G$ serves as a post-multiplier to transform $X_f$ into a model matrix $X=X_f G$ that can be used in a linear model.

The same matrix $G$ also serves as a post-multiplier to transform any general linear hypothesis matrix expressed in terms of $\phi$ into a general linear hypothesis matrix in terms of of $\beta$.

Application to Splines

Our goal is to generate model matrices for splines in a way that produces interpretable coefficients and lends itself to easily estimating and testing properties of the spline that are linear functions of parameters: slope, curvature, discontinuities, etc.

Given $k$ knots, $-\infty = t_0 < t_1 < ... < t_k < t_{k+1} = \infty$, the spline in the $i$th interval, $(t_{i-1},t_i]$, is a polynomial of degree $d_i$, a non-negative integer with the value 0 signifying a constant over the corresponding interval.

The order of smoothness $c_i$ at $t_i$ is either a non-negative integer or -1 to allow a discontinuity.

Generating a model matrix for piecewise polynomial functions is sometimes simple. For example, if the degrees, $d_i$, are non-decreasing and the order of continuity is a constant $c$ less than $\min(d_i)$, one can add terms using 'plus' functions at each knot. For example, a quadratic spline (degree 2, continuity 1) with one knot at 1 can be generated with a model matrix with three columns, in addition to the intercept term: $$x, x^2, (x -1)+^2$$ where $$(y)+ = \left{ \begin{array}{ll} 0 & \textrm{if } y < 0 \ y & \textrm{otherwise } \end{array}\right.$$ A spline that is quadratic on the interval $(-\infty,1]$ and cubic on $(1,\infty)$ with continuity of order 1, $c_1 = 1$, at $t_1 = 1$, can be generated by the columns: $$x, x^2, (x -1)+^2, (x -1)+^3$$ However, if one allows the degree of the polynomial or the order of smoothness to vary in different parts of the spline, the approach above works only in special cases.

Generating model matrices in more general situations, for example with degrees that are not monotone, nor monotone increasing as the index radiates from a central value, is more challenging. The approach described here works for any pattern of degrees, $d_i$ and smoothness constraints, $c_i$.

We start by constucting a matrix, $X_f$, for a spline in which the polynomial degree in each interval is the maximal value, $\max(d_i)$. We then construct constraints for the coefficients of this model to produce the desired spline.

As an example, consider a spline, $\mathcal{S}$, with knots at 3 and 7, polynomial degrees, $(2, 3, 2)$, and smoothness, $(1, 2)$, meaning that $\mathcal{S}$ is smooth of order 1 at $x=3$, and smooth of order 2 at $x=7$. Columns of the full matrix $X_f$ contain the intercept, linear and quadratic and cubic terms in each interval of the spline.

To create an instance of $X_f$ we need to specify the values over which the matrix is evaluated. Evaluating $X_f$ at $x = 0, 1, ... 9$, we obtain the following matrix, which happens here to be block diagonal because of the ordering of the $x$ values:

Xf(0:9, knots = c(3,7), degree = 3)

The model for the unconstrained maximal polynomial is $X_f \phi: \phi \in \mathbb{R}^{12}$.

We impose three types of constraints on $\phi$.

$X_f \phi$ should evaluate to 0 at $x=0$ so an intercept term in the model will have the correct interpretation,
the limits of the value and of the first derivative of the spline must be the same when approaching the first knot from the right or from the left, and the limits of the value, the first and second derivatives should be the same when approaching the second knot from the right or from the left, and
the degree of the polynomial in the first and third intervals must be reduced to 2.

The constraint marix, $C$ is created by the 'Cmat' function:

Cmat(knots = c(3, 7), degree = c(2, 3, 2), smooth = c(1, 2))

The row labels of the constraint matrix show the role of each row. For example, "f(0)" is the value of the spline when $x=0$ which is constrained to 0 so that an intercept term in a linear model can have its usual interpretation, "C0|3" ensures continuity at $x=3$, "C2|7" forces continuity of the second derivative at $x = 7$, "I.1.3" constrains the cubic term to be 0 in the first interval, etc.

The 'd' attribute contains the vector of singular values of the constraint matrix.

The following is the matrix $E$ of estimable functions created by the 'Emat' function:

Emat(knots = c(3, 7), degree = c(2, 3, 2), smooth = c(1, 2))

The row labels signify the first derivative at $x=0$, 'D1|0', the second derivative at $x=0$, 'D2|0', the saltus in the second derivative at $x = 3$, "C2|3" and the saltus in the third derivative at $x=3$, "C3|3".

The full rank model for the spline is generated by a matrix $X= X_f G$ as described in the previous section.

The spline modelling function is a closure generated by the gspline function.

sp <- gspline(knots = c(3, 7), degree = c(2, 3, 2), smoothness = c(1, 2))
sp(0:9)

produce a matrix $X = X_fG$ that will generate the desired spline parametrized by linear estimable coefficients.

The closure created by the gspline function can be used in a linear model formulas. We illustrate its use with a small example. Note that the spline function can be used in any linear model formula. It can, for example, be modelled as interacting with other predictors.

df <- data.frame(x = 0:10)
set.seed(123)
df <- within(df, y <- -2* (x-5) + .1 * (x-5)^3 + rnorm(x))
df <- rbind(df, data.frame(x = seq(0,10,.1), y = NA))
df <- sortdf(df, ~ x)
plot(y~x, df, pch = 16)
fit <- lm(y ~ sp(x), data = df)
summary(fit)
lines(df$x , predict(fit, df))

Linear hypotheses

Linear hypotheses about a spline may be easy to formulate in terms of its 'full' parameter vector $\phi$ but challenging in terms of the 'working' parameters, $\beta$. For example, the derivative or curvature of the spline over a range of values is easily expressed in terms of $\phi$. To do this We use the relationship between linear hypotheses in terms of $\phi$ with those in terms of $\beta$ to generate linear hypotheses based on $\hat{\beta}$. Namely the least-squares estimator of $\psi = L \phi$ under the contraint $C \phi =0$ is $\hat{\psi} = A \hat{\beta}$ where $A = LG$.

Given a spline function sp created by the gspline function:

sp <- gspline(knots = c(3,7), degree = c(2,3,2), smoothness = c(1,2))
sp(0:9)

The sp function will generate a hypothesis matrix to query values and derivatives of the spline.

sp(c(2, 3, 7), D = 1)

Denoting the matrix above by $A$, $A \hat{\beta}$ will estimate the first derivative of the spline at $x=2$ and its limit from the right at the knots $x = 3, 7$. The limit parameter to the spline function is used to select whether the value estimated is a limit from the right, from the left, or the saltus (jump) in value if discontinuous. For example, at $x=3$ where the spline has a discontinuous second derivatives:

sp(c(3, 3, 3), D = 2, limit = c(-1,0,1))

Using the 'wald' function it is possible to graph these estimates as a function of of $x$.

xpred <- seq(0,10, .05)

A.1 <- cbind(0, sp(xpred, D = 1))
ww.1 <- as.data.frame(wald(fit, A.1))

A.2 <- cbind(0, sp(xpred, D = 2))
ww.2 <- as.data.frame(wald(fit, A.2))

plot(xpred, ww.1$coef, type = 'l')
plot(xpred, ww.2$coef, type = 'l')
library(latticeExtra)
ww.1$x <- xpred
xyplot(coef ~ x, ww.1, type = 'l',
       lower = ww.1$L2, upper = ww.1$U2,
       ylab = 'first derivative',
       subscripts = TRUE) +
    layer(gpanel.fit(...))
head(ww.1)

Discontinuity

We use the monthly U.S. unemployment rates from January 1995 to February 2019 to illustrate a model with a discontinuity and, subsequently, we will show a periodic spline component can be added to model periodic patterns such as annual seasonal patterns.

The 'crash' in November 2008 creates a discontinuity in the series which we will treat as an 'a priori' discontinuity.

unemp <- read.csv('http://blackwell.math.yorku.ca/data/USUnemployment.csv')
unemp$date <- as.Date(unemp$date)
head(unemp)
library(lattice)
library(latticeExtra)
xyplot(unemployment ~ date, unemp) + layer(panel.abline(v = as.Date('2008-12-15', col = 'blue')))

toyear <- function(x) {
  # number of years from January 1, 2000
  (as.numeric(x) - as.numeric(as.Date('2000-01-01')))/365.25
}
unemp <- within(
  unemp,
  {
    year <- toyear(date)
    month <- as.numeric(format(date, '%m'))
  })
summary(unemp)

The following code creates a quadratic spline and a cubic spline with knots at quintiles.

quintiles <- quantile(unemp$year, 1:4/5)
sp2 <- gspline(quintiles, 2, 1) # quadratic spline
sp3 <- gspline(quintiles, 3, 2) # cubic spline

We can also add a knot at the point of discontinuity coincident with the 2008 crash.

quintiles_with_crash <- sort(c(quintiles, toyear(as.Date('2008-12-15'))))

sp2d <- gspline(quintiles_with_crash, 2, c(1,1,-1,1,1))
sp3d <- gspline(quintiles_with_crash, 3,  c(2,2,-1,2,2))

The following code fits four models using a quadratic or cubic spline with or without a discontinuity.

fit2 <- lm(unemployment ~ sp2(year), unemp)
summary(fit2)
unemp$fit2 <- predict(fit2)
fit3 <- lm(unemployment ~ sp3(year), unemp)
summary(fit3)
unemp$fit3 <- predict(fit3)

fit2d <- lm(unemployment ~ sp2d(year), unemp)
summary(fit2d)
unemp$fit2d <- predict(fit2d)
fit3d <- lm(unemployment ~ sp3d(year), unemp)
summary(fit3d)
unemp$fit3d <- predict(fit3d)

pp <- xyplot(unemployment ~ date, unemp, type = 'b',
             col = 'gray',
             key = list(
               corner = c(1,1),
               text = list(lab = c('quadratic','cubic','quadratic discontinuous','cubic discontinuous')),
               lines = list(col= c('red','blue','red','blue'),
                            lty = c(3,3,2,1))
               )) +
  layer(panel.lines(x, unemp$fit3, col = 'blue', lty = 3)) +
  layer(panel.lines(x, unemp$fit2, col = 'red', lty = 3)) + 
  layer(panel.lines(x, unemp$fit3d, col = 'blue', lty = 1)) +
  layer(panel.lines(x, unemp$fit2d, col = 'red', lty = 2)) 
pp

The cubic model follows the data better but overestimates in the vicinity of 2000 and 2004. It also misses an upturn in 2016. The following is a table of AICs for the four models.

AIC(fit2, fit3, fit2d, fit3d)

Periodic spline

We add a periodic spline component as a function of months using a cubic spline with period 12 and four internal knot at months $12 \times (1/5\quad 2/5\quad 3/5\quad 4/5)$. Observe that the derivatives parametrizing the periodic spline are derivatives from the left at the maximum knot, which are identified with the same derivatives from the left at 0.

per3 <- gspline(12 * 1:5/5, 3, 2, periodic = TRUE)
fitper3 <-  lm(unemployment ~ sp3d(year) + per3(month), unemp)
summary(fitper3)
unemp$fitper3 <- predict(fitper3)
pp <- xyplot(unemployment ~ date, unemp, type = 'b',
             col = 'gray') +
  layer(panel.lines(x, unemp$fitper3, col = 'blue'))  
pp

We can examine the monthly periodic spline fit:

pred <- data.frame(month = seq(0,24,.01), year = 0)
pred$fitper3 <- predict(fitper3, newdata = pred)
xyplot(fitper3 ~ month, pred, type = 'l',
       xlim = c(0,24),
       scales = list( x = list( at =3 * 0:8)),
       ylab = 'seasonal unemployment') +
  layer(panel.abline(v = 12))

This cubic periodic spline has four free parameters determined by the first three derivatives from the left at 0, and the jumps in the third derivative at any one of the knots. The third derivative is not continuous at any know as the following plot of derivatives shows.

derivs <- expand.grid(month = seq(0, 24, .01), D = 1:3)
Ld <- with(derivs, per3(month, D = D, limit = -1))
Ld <- cbind(0*Ld[,rep(1,12)], Ld)
derivs <- cbind(derivs, walddf(fitper3, Ld))
derivs $ order <- 
  factor(c('first','second','third')[derivs$D])
derivs $ order <- with(derivs, reorder(order, D))
inds <- which(derivs$month %% (12/5) < .0001 & derivs$D == 3)
derivs$coef[inds] <- NA
xyplot(coef ~ month | order , derivs, type = 'l', layout = c(1,3),
       xlim = c(0,24),
       ylab = 'derivative',
       strip.left = T, strip = F) 


Ldi <- per3(seq(0,24,12/10), D = 3)

Now, using Lfx:

summary(fitper3)
Lfx(fitper3)
derivs$year <- 0
ww <- walddf(
  fitper3, 
  Lfx(fitper3,
      list( 0,
0 * M(sp3d(year)),
1 * M(per3(month, D = D))), 
data = derivs))
head(ww)
xyplot(coef ~ month|order, ww) ####################  WORKED !!!!

Does the seasonal pattern change?

We can use an interaction between the seasonal periodic model and the secular model to address whether the seasonal pattern changes over time. To maintain parsimony the interaction can be constructed with a spline with fewer degree of freedom than

sp1d <- gspline(quintiles_with_crash, 1, 0)  
fit_int <- lm(
  unemployment ~ sp3d(year) + per3(month) + year:per3(month), 
  unemp)
summary(fit_int)
car::Anova(fit_int)
wald(fit_int, ':')

There is weak evidence of a change in the seasonal pattern, however, if we wished to visualize the seasonal pattern at different years we can proceed as follows.

Lfx(fit_int)
quintiles_with_crash  
range(unemp$year)
pred <- expand.grid(
  month = seq(0,24,.1), 
  date = as.Date(c('1995-01-01','2002-01-01','2009-01-01','2016-01-01')),
  D = 1)
pred$year <- toyear(pred$date)
ww <- walddf(
  fit_int,
  Lfx(fit_int,
      list( 0,
            0 * M(sp3d(year)),
            1 * M(per3(month)),
            1 * M(per3(month)) * year 
      ), pred)
)

xyplot(coef ~ month | sub("-01-01","",date), ww,
       type = 'l',
       xlim = c(0,24),
       scales = list(
         x = list(
           at = c(0,4,8,12,16,20,24),
           labels = c('Jan','May','Sep','Jan','May','Sep','Jan'))),
       main = 'Seasonal component of U.S. unemployment')

unemp$fit_int <- predict(fit_int, unemp)
###################################################################### REDO
pp <- xyplot(unemployment ~ date, unemp, type = 'b',
             col = 'gray',
             key = list(
               corner = c(1,1),
               text = list(lab = c('quadratic','cubic','quadratic disc.','cubic disc.')),
               lines = list(col= c('red','blue','red','blue'),
                            lty = c(1,1,3,3))
               )) +
  layer(panel.lines(x, unemp$fit3d, col = 'blue', lty = 2)) +
  layer(panel.lines(x, unemp$fit_int, col = 'red', lty = 2)) 
pp

# diagnostics
plot(fit_int)
unemp[169,]  # as expected
library(nlme)
fit_int_ar <- 
  gls(
    unemployment ~ sp3d(year) + per3(month) + year:per3(month), 
    unemp, correlation = corAR1(form = ~ 1))
summary(fit_int_ar)
anova(fit_int_ar, fit_int)

car::Anova(fit_int_ar)
wald(fit_int, ':')
fit_int_ar2 <- update(fit_int_ar, correlation = corARMA(form = ~ 1, p = 2, q = 0))
anova( fit_int_ar2, fit_int_ar, fit_int)

Comparison with a Fourier model

Sin <- function(x) cbind(sin=sin(2*pi*x),cos=cos(2*pi*x))
fit_fourier_int <- gls(
  unemployment ~ sp3d(year) + Sin(month/12) + Sin(2*month/12) + Sin(3*month/12)+
    Sin(4*month/12) +
    year:(Sin(month/12) + Sin(2*month/12)), 
  unemp, correlation = corAR1(form = ~ 1))
car::Anova(fit_fourier_int)
AIC(fit_fourier_int, fit_int_ar)


fit_factor_int <- lm(
  unemployment ~ sp3d(year) + factor(month) + year:factor(month),
  unemp)

car::Anova(fit_factor_int)
anova(fit_factor_int, fit_int)