knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
library(OSplines)
library(tidyverse)
library(Matrix)
library(TMB)
library(aghq)
Let $f(t)$ be the unknown function that we want to infer, and assume it is already known that $f$ is at least $p-1$ times continuously differentiable.
Consider the smoothing model: $$\frac{\partial^p{f}(t)}{\partial t^p} = \sigma_s W(t),$$ with the boundary (initial) conditions $\frac{\partial^q{f}(0)}{\partial t^q} = 0$ for all $0\leq q <p$. Here $W(t)$ is the standard Gaussian white noise process, which can be viewed as the distributional derivative of the standard Brownian motion.
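Because of the zero boundary conditions, the model can be solved explicitly by integrating the white noise $p$ times: $$f(t) = \sigma_s \int_0^t \frac{(t-u)^{p-1}}{(p-1)!} W(u)\, du,$$ so the smoothing parameter $\sigma_s$ directly controls how much the $p$th derivative of $f$ fluctuates.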
This kind of smoothing model is referred to as the Integrated Wiener Process (IWP). Both the O-spline method and the commonly used continuous RW2 method consider a finite-dimensional approximation of the form: $$\tilde{f}(t) = \sum_{i=1}^{k} w_i\varphi_i(t),$$ where $\boldsymbol{w} = (w_1, ..., w_k)^T$ is the vector of Gaussian coefficients, and $\varphi_i$ is the basis function defined at the $i$th knot.
The main target of inference is therefore the vector of Gaussian coefficients $\boldsymbol{w}$, which has a diagonal precision matrix under the O-spline construction.
We will illustrate the use of O-splines with the Munich rent data from the INLA R package:
data <- INLA::Munich %>% select(rent, year)
head(data, n = 5)
Let's study the yearly change of rent in Munich, with the model $y_i = f(t_i) + \epsilon_i$ where $\epsilon_i \sim N(0,\sigma^2)$. The unknown function is assumed to be at least once continuously differentiable, so the smoothness order is set to $p=2$.
We will use PC priors for both the smoothing parameter $\sigma_s$ and the variance parameter $\sigma$.
Note that the range of $t$ is from $1918$ to $2001$. To make it consistent with the definition of $t$ in the O-spline construction, we should center $t$ so that it starts at $0$; this is equivalent to setting the year $1918$ as the reference year. If you don't want the reference point at the starting point, but instead somewhere in the middle such as $1990$, you can center all the years at $1990$ and treat the negative values of $t$ as another IWP extending in the opposite direction.
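As an aside, a minimal sketch of that alternative centering might look like the following; the object names (t_forward, t_backward, etc.) are hypothetical, and the two components would still need to be combined in a single model, which we do not pursue here:

# Hypothetical sketch: center at 1990 instead of at the first observed year.
# Years at or after 1990 are handled by one IWP; years before 1990 are
# reflected onto a positive axis and handled by a second IWP.
t_centered <- data$year - 1990
t_forward  <- pmax(t_centered, 0)
t_backward <- pmax(-t_centered, 0)
B_forward  <- local_poly(knots = sort(unique(t_forward)),  refined_x = t_forward,  p = 2)
B_backward <- local_poly(knots = sort(unique(t_backward)), refined_x = t_backward, p = 2)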
### Initialization
data$t <- data$year - min(data$year)
length(unique(data$t))
In this case, the number of unique values of $t$ is only 42, so you can simply use the observed unique locations as the knots. In practice, if this number is too large, you should instead choose an equally spaced knot sequence over the region of interest with a much smaller number of knots. Let's try both ways!
### Use the observed locations as knots
knots_observed <- sort(unique(data$t))
### Use equally spaced knots, with 4 years as spacing
knots_equal <- sort(seq(from = min(data$t), to = max(data$t), by = 4))
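It can be helpful to compare how many knots (and hence how many Gaussian coefficients) each choice produces:

# Number of knots under each choice.
length(knots_observed)
length(knots_equal)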
Note that since the O-spline construction assumes the boundary conditions are exactly zero, it is important to make sure the first value in your knot sequence is actually the "0" that corresponds to the reference point.
knots_observed[1]
knots_equal[1]
Given the set of knots, we can easily construct the basis function $\varphi_i$ and the precision matrix of $\boldsymbol{w}$.
The function local_poly takes a sequence of knots and a sequence of points at which to evaluate the basis functions, and returns the corresponding design matrix:
B1 <- local_poly(knots = knots_observed, refined_x = data$t, p = 2)
B2 <- local_poly(knots = knots_equal, refined_x = data$t, p = 2)
The precision matrix of the basis coefficient vector $\boldsymbol{w}$ can be obtained using the function compute_weights_precision:
P1 <- compute_weights_precision(x = knots_observed)
P2 <- compute_weights_precision(x = knots_equal)
Note that the number of rows/columns of the precision matrix should match the number of columns of the design matrix:
dim(B1)
dim(P1)
dim(B2)
dim(P2)
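The same correspondence can also be checked programmatically; this is just an optional sanity check, not something the package requires:

# Each precision matrix should be square, with dimension equal to the number
# of O-spline basis functions (columns of the corresponding design matrix).
stopifnot(ncol(B1) == nrow(P1), nrow(P1) == ncol(P1))
stopifnot(ncol(B2) == nrow(P2), nrow(P2) == ncol(P2))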
Before we move on to inference, there is one more step: handling the boundary conditions. For identifiability purposes, the $p$ boundary conditions at $t=0$ are set to exactly zero for the O-spline basis. However, in many applications these boundary conditions are unknown and of inferential interest. They are therefore modeled as global polynomials starting at the reference point $t=0$, with polynomial coefficients assigned Gaussian priors (or fixed to constants if such knowledge is available). The corresponding design matrix can be created using the global_poly function.
X <- global_poly(x = data$t, p = 2)
dim(X)
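Since $p=2$ there are two boundary conditions (the value and the first derivative of $f$ at $t=0$), so the global polynomial part consists of a constant and a linear term in $t$; you can inspect a few rows of the design to confirm its structure:

# First few rows of the global polynomial design (one column per boundary condition).
head(X)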
For the inference, let's use the TMB template and fit it with AGHQ:
tmbdat1 <- list(
  # Design matrix
  X = as(X, 'dgTMatrix'),
  B = as(B1, 'dgTMatrix'),
  P = as(P1, 'dgTMatrix'),
  logPdet = as.numeric(determinant(P1, logarithm = TRUE)$modulus),
  # Response
  y = data$rent,
  # PC Prior params
  # (u1, alpha1) for sigma_s
  # (u2, alpha2) for sigma
  u1 = 1,
  alpha1 = 0.5,
  u2 = 1,
  alpha2 = 0.5,
  betaprec = 0.01
)
tmbparams <- list(
  W = c(rep(0, (ncol(X) + ncol(B1)))), # W = c(U,beta); U = Spline coefficients
  theta1 = 0, # -2log(sigma)
  theta2 = 0
)
ff <- TMB::MakeADFun(
  data = tmbdat1,
  parameters = tmbparams,
  random = "W",
  DLL = "OSplines",
  silent = TRUE
)
# Hessian not implemented for RE models
ff$he <- function(w) numDeriv::jacobian(ff$gr, w)
mod1 <- aghq::marginal_laplace_tmb(ff, 4, c(0, 0))
Let's also fit the model with equally spaced knots for comparison:
tmbdat2 <- list(
  # Design matrix
  X = as(X, 'dgTMatrix'),
  B = as(B2, 'dgTMatrix'),
  P = as(P2, 'dgTMatrix'),
  logPdet = as.numeric(determinant(P2, logarithm = TRUE)$modulus),
  # Response
  y = data$rent,
  # PC Prior params
  # (u1, alpha1) for sigma_s
  # (u2, alpha2) for sigma
  u1 = 1,
  alpha1 = 0.5,
  u2 = 1,
  alpha2 = 0.5,
  betaprec = 0.01
)
tmbparams <- list(
  W = c(rep(0, (ncol(X) + ncol(B2)))), # W = c(U,beta); U = Spline coefficients
  theta1 = 0, # -2log(sigma)
  theta2 = 0
)
ff <- TMB::MakeADFun(
  data = tmbdat2,
  parameters = tmbparams,
  random = "W",
  DLL = "OSplines",
  silent = TRUE
)
# Hessian not implemented for RE models
ff$he <- function(w) numDeriv::jacobian(ff$gr, w)
mod2 <- aghq::marginal_laplace_tmb(ff, 4, c(0, 0))
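Before drawing samples, it can be useful to glance at the approximate posterior of the hyperparameters; calling summary() on a fitted aghq object prints moments and quantiles for theta1 and theta2, which live on the internal log scale used in the template:

# Optional: inspect the approximate hyperparameter posteriors of both fits.
summary(mod1)
summary(mod2)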
Now, let's obtain the posterior samples of $\boldsymbol{w}$ (the O-spline basis coefficients) and $\boldsymbol{\beta}$ (the global polynomial coefficients):
samps1 <- sample_marginal(mod1, M = 3000)
global_samps1 <- samps1$samps[(ncol(B1) + 1):nrow(samps1$samps), ]
coefsamps1 <- samps1$samps[1:ncol(B1), ]
samps2 <- sample_marginal(mod2, M = 3000)
global_samps2 <- samps2$samps[(ncol(B2) + 1):nrow(samps2$samps), ]
coefsamps2 <- samps2$samps[1:ncol(B2), ]
These basis coefficients completely determine the values of $\tilde{f}$ and its derivatives. We can convert the posterior samples of the coefficients into posterior samples of the function using compute_post_fun, and extract_mean_interval_given_samps then computes the posterior mean and pointwise intervals from those samples:
f1 <- compute_post_fun(samps = coefsamps1, global_samps = global_samps1,
                       knots = knots_observed,
                       refined_x = seq(from = min(data$t), to = max(data$t), by = 1),
                       p = 2, degree = 0)
f2 <- compute_post_fun(samps = coefsamps2, global_samps = global_samps2,
                       knots = knots_equal,
                       refined_x = seq(from = min(data$t), to = max(data$t), by = 1),
                       p = 2, degree = 0)
f1pos <- extract_mean_interval_given_samps(f1)
f2pos <- extract_mean_interval_given_samps(f2)
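Conceptually, this conversion is just the basis expansion $\tilde{f}(t) = \boldsymbol{B}(t)\boldsymbol{w} + \boldsymbol{X}(t)\boldsymbol{\beta}$ evaluated on the refined grid. The following rough sketch reproduces that idea by hand; it assumes this is essentially what compute_post_fun does internally (up to implementation details such as coefficient ordering) and is shown only for intuition:

# Rough sketch (assumed equivalent, for intuition only): evaluate both bases
# on the refined grid and combine them with the sampled coefficients.
refined_t <- seq(from = min(data$t), to = max(data$t), by = 1)
B_refined <- local_poly(knots = knots_observed, refined_x = refined_t, p = 2)
X_refined <- global_poly(x = refined_t, p = 2)
f1_manual <- B_refined %*% coefsamps1 + X_refined %*% global_samps1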
Let's check out the result:
f1pos %>% ggplot(aes(x = x)) +
  geom_line(aes(y = mean), color = "blue") +
  geom_ribbon(aes(ymin = plower, ymax = pupper), fill = "orange", alpha = 0.3) +
  geom_point(aes(x = t, y = rent), alpha = 0.1, data = data) +
  theme_classic() +
  xlab("years since 1918") +
  ylab("rent") +
  ggtitle("Using observed knots") +
  ylim(c(5, 12))
f2pos %>% ggplot(aes(x = x)) +
  geom_line(aes(y = mean), color = "blue") +
  geom_ribbon(aes(ymin = plower, ymax = pupper), fill = "orange", alpha = 0.3) +
  geom_point(aes(x = t, y = rent), alpha = 0.1, data = data) +
  theme_classic() +
  xlab("years since 1918") +
  ylab("rent") +
  ggtitle("Using equally spaced knots") +
  ylim(c(5, 12))
Another advantage of O-splines is that we can also recover model-based inference for the derivatives of $f$, up to order $p-1$, using the posterior samples of $\boldsymbol{w}$. This is done with compute_post_fun by specifying the degree argument:
f1deriv <- compute_post_fun(samps = coefsamps1, global_samps = global_samps1,
                            knots = knots_observed,
                            refined_x = seq(from = min(data$t), to = max(data$t), by = 1),
                            p = 2, degree = 1)
f2deriv <- compute_post_fun(samps = coefsamps2, global_samps = global_samps2,
                            knots = knots_equal,
                            refined_x = seq(from = min(data$t), to = max(data$t), by = 1),
                            p = 2, degree = 1)
f1derivpos <- extract_mean_interval_given_samps(f1deriv)
f2derivpos <- extract_mean_interval_given_samps(f2deriv)
We can see the result:
f1derivpos %>% ggplot(aes(x = x)) +
  geom_line(aes(y = mean), color = "blue") +
  geom_ribbon(aes(ymin = plower, ymax = pupper), fill = "orange", alpha = 0.3) +
  theme_classic() +
  xlab("years since 1918") +
  ylab("rent: rate of change") +
  ggtitle("Using observed knots")
f2derivpos %>% ggplot(aes(x = x)) +
  geom_line(aes(y = mean), color = "blue") +
  geom_ribbon(aes(ymin = plower, ymax = pupper), fill = "orange", alpha = 0.3) +
  theme_classic() +
  xlab("years since 1918") +
  ylab("rent: rate of change") +
  ggtitle("Using equally spaced knots")