README.md

splintr

check-standard DOI Licence Lifecycle codecov

Natural cubic splines with interpretable intercepts: ‘centres’ a basis generated using splines::ns() on a specified x-value. When used in a model formula, this allows the model intercept to be interpreted with respect to that central x-value, rather than with respect to the x-value of the first splines::ns() knot.

Installation

You can install splintr from github with:

# install.packages("devtools")
devtools::install_github("simisc/splintr")

Example

library(tidyverse)
library(splines)
library(splintr)
library(broom)
library(knitr)
library(ggthemes)
library(ggrepel)
women <- women %>%
  mutate(height_centred = height - mean(height))
women %>%
  head() %>%
  kable()

| height | weight | height_centred | | -----: | -----: | --------------: | | 58 | 115 | -7 | | 59 | 117 | -6 | | 60 | 120 | -5 | | 61 | 123 | -4 | | 62 | 126 | -3 | | 63 | 129 | -2 |

Different parametrisations of the same model:

model_formulae <- list(
  fit0 = weight ~ ns(height, df = 3),
  fit1 = weight ~ ns(height_centred, df = 3),
  fit2 = weight ~ splintr(height_centred, df = 3),
  fit3 = weight ~ splintr(height, df = 3, centre = explicit_centre)
)

explicit_centre = 68.45
model_fits <- map(model_formulae, lm, data = women)

The models are identical:

model_fits %>%
  map_dfr(glance, .id = "model") %>%
  select(model, sigma, logLik, deviance) %>%
  kable()

| model | sigma | logLik | deviance | | :---- | -------: | ---------: | -------: | | fit0 | 0.321018 | -1.914046 | 1.133578 | | fit1 | 0.321018 | -1.914046 | 1.133578 | | fit2 | 0.321018 | -1.914046 | 1.133578 | | fit3 | 0.321018 | -1.914046 | 1.133578 |

But they have different intercepts:

model_fits %>%
  map_dfr(~tidy(.x, quick = TRUE), .id = "model") %>%
  filter(term == "(Intercept)") %>%
  select(model, estimate, std.error, statistic) %>%
  kable()

| model | estimate | std.error | statistic | | :---- | -------: | --------: | --------: | | fit0 | 114.5595 | 0.2455372 | 466.5666 | | fit1 | 114.5595 | 0.2455372 | 466.5666 | | fit2 | 135.1067 | 0.1288238 | 1048.7711 | | fit3 | 147.8057 | 0.1513002 | 976.9034 |

Using splines::ns(), the model intercept represents the estimated value of weight at the first boundary knot, i.e. when height takes its minimum value of 58. This is unchanged by centring the predictor, so fit0 and fit1 have identical intercepts.

Using splintr() instead, the intercept representes the estimated value of weight when the predictor height_centred takes a value of zero. An arbitrary “centre” can be specified directly in the splintr() call.

int_pts <- tibble(
  model = paste0("fit", 0:3),
  x = c(
    min(women$height),
    min(women$height),
    mean(women$height),
    explicit_centre
  ),
  y = map_dbl(model_fits, ~coef(.x)[1])
)

augment(model_fits[[1]], data = women) %>%
  ggplot(aes(x = height)) +
  geom_point(aes(y = weight)) +
  geom_line(aes(y = .fitted), col = "blue") +
  geom_point(data = int_pts, aes(x = x, y = y), col = "red") +
  geom_label_repel(data = int_pts, aes(x = x, y = y, label = model),
                   col = "red", nudge_y = 7) +
  theme_few()



simisc/splintr documentation built on March 24, 2023, 5:13 p.m.