If the code in this vignette has not been evaluated, a rendered version is available on the documentation site under 'Articles'.

Residual checking with worked examples

dplyr_installed <- require("dplyr", quietly = TRUE)
ggplot_installed <- require("ggplot2", quietly = TRUE)
glmmTMB_installed <- require("glmmTMB", quietly = TRUE)
DHARMa_installed <- require("DHARMa", quietly = TRUE)
sdmTMBextra_installed <- require("sdmTMBextra", quietly = TRUE)
pkgs <- dplyr_installed && ggplot_installed && glmmTMB_installed && DHARMa_installed && sdmTMBextra_installed 
EVAL <- identical(Sys.getenv("NOT_CRAN"), "true") && pkgs
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.width = 7,
  fig.asp = 0.618,
  eval = EVAL,
  purl = EVAL
)
library(sdmTMB)

We will start with some data simulated from scratch. We will simulate from an NB2 negative binomial observation model, a spatial random field, an intercept, and one predictor named 'a1' that will have a linear effect on the observed data.

set.seed(1)
predictor_dat <- data.frame(X = runif(1000), Y = runif(1000), a1 = rnorm(1000))
mesh <- make_mesh(predictor_dat, xy_cols = c("X", "Y"), cutoff = 0.1)
dat <- sdmTMB_simulate(
  formula = ~ 1 + a1,
  data = predictor_dat,
  mesh = mesh,
  family = nbinom2(link = "log"),
  phi = 0.4,
  range = 0.4,
  sigma_O = 0.4,
  seed = 1,
  B = c(0.2, 0.8) # B0 = intercept, B1 = a1 slope
)

Next, we will fit model configurations with various families and predictors. The first model will use the Poisson instead of the NB2. The second model will match the simulated data. The third model is missing the 'a1' predictor. We'll use a penalized complexity (PC) prior on the Matérn parameters to aid in estimation.

pc <- pc_matern(range_gt = 0.1, sigma_lt = 1)

fit_pois <- sdmTMB(
  observed ~ 1 + a1,
  data = dat,
  family = poisson(),
  mesh = mesh,
  priors = sdmTMBpriors(matern_s = pc)
)
fit_pois

fit_nb2 <- update(fit_pois, family = nbinom2())
fit_nb2

fit_nb2_miss <- update(fit_nb2, formula. = observed ~ 1)
fit_nb2_miss

We can see just by looking at these fits that the Poisson model inflates the spatial random field standard deviation (SD) compared to the truth. The model missing the 'a1' predictor does so to a lesser degree.
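
If we want to see this numerically, we can extract the estimated random field parameters with tidy() (sdmTMB's method for pulling out parameter estimates) and compare the spatial SD against the true value of 0.4:

tidy(fit_pois, effects = "ran_pars")
tidy(fit_nb2, effects = "ran_pars")
tidy(fit_nb2_miss, effects = "ran_pars")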

Analytical randomized-quantile residuals

Here are randomized quantile residuals with fixed effects at their MLEs (maximum likelihood estimates) and random effects taken from a single sample of their approximate distribution (more details below):

set.seed(123)
rq_res <- residuals(fit_pois, type = "mle-mvn")
rq_res <- rq_res[is.finite(rq_res)] # some Inf
qqnorm(rq_res);abline(0, 1)

set.seed(123)
rq_res <- residuals(fit_nb2, type = "mle-mvn")
qqnorm(rq_res);abline(0, 1)

These use the randomized quantile approach from Dunn and Smyth (1996). They are also known as PIT (probability-integral-transform) residuals. The approach applies randomization to integer response values, transforms the observations with the distribution function of the assumed observation model (e.g., pgamma() for a gamma model) so that they would be uniform(0, 1) if consistent with the model, and then transforms those values with qnorm() so that they would be normal(0, 1) if consistent with the model. You can see the source code at https://github.com/pbs-assess/sdmTMB/blob/master/R/residuals.R
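
As a minimal sketch of the idea for a continuous distribution (a toy example unrelated to our fitted models; a full worked example appears near the end of this vignette), the two steps are just the distribution function followed by qnorm():

set.seed(42)
y_toy <- rgamma(500, shape = 2, scale = 1) # toy gamma-distributed data
u_toy <- pgamma(y_toy, shape = 2, scale = 1) # uniform(0, 1) if the model is right
r_toy <- qnorm(u_toy) # normal(0, 1) if the model is right
hist(r_toy)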

We can see here that there are likely issues with the Poisson model in the tails.

MCMC-based randomized-quantile residuals

The above approach assumes an observation of the random effects can be approximated by a multivariate normal distribution. If we want to relax that assumption, we can sample the random effects with MCMC with the fixed effects held at their MLEs. We do this with the sdmTMBextra::predict_mle_mcmc() function from the sdmTMBextra package.

set.seed(123)
samps <- sdmTMBextra::predict_mle_mcmc(fit_nb2, mcmc_iter = 800, mcmc_warmup = 400)
mcmc_res <- residuals(fit_nb2, type = "mle-mcmc", mcmc_samples = samps)
qqnorm(mcmc_res)
abline(0, 1)

Simulation-based randomized-quantile residuals

We can also take simulations from the fitted model to use with simulation-based randomized quantile residuals:

s_pois <- simulate(fit_pois, nsim = 500, type = "mle-mvn")
s_nb2_miss <- simulate(fit_nb2_miss, nsim = 500, type = "mle-mvn")
s_nb2 <- simulate(fit_nb2, nsim = 500, type = "mle-mvn")

Each call returns a matrix where each row represents a row of data and each column is a simulation draw:

dim(s_pois)
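
As a quick sanity check, the number of rows should match the number of rows in the original data and the number of columns should match nsim:

nrow(s_pois) == nrow(dat)
ncol(s_pois)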

We can look at whether fitted models are consistent with the observed number of zeros:

sum(dat$observed == 0) / length(dat$observed)
sum(s_pois == 0) / length(s_pois)
sum(s_nb2 == 0) / length(s_nb2)

There are obviously too few zeros in the data simulated from the Poisson model, but the NB2 model seems reasonable.
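
As an optional extra check, we could compare the observed proportion of zeros against the distribution of zero proportions across the individual simulated data sets (DHARMa's testZeroInflation(), used below, does something similar):

prop_zero_obs <- mean(dat$observed == 0)
prop_zero_sim <- apply(s_nb2, 2, function(x) mean(x == 0)) # one value per simulation
hist(prop_zero_sim)
abline(v = prop_zero_obs, col = "red", lwd = 2)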

Plot DHARMa residuals:

dharma_residuals(s_pois, fit_pois)

We could also return the DHARMa object, which lets us use other DHARMa tools:

r_pois <- dharma_residuals(s_pois, fit_pois, return_DHARMa = TRUE)
plot(r_pois)
DHARMa::testResiduals(r_pois)
DHARMa::testSpatialAutocorrelation(r_pois, x = dat$X, y = dat$Y)
DHARMa::testZeroInflation(r_pois)

In the QQ residual plots we clearly see evidence of overdispersion compared to the Poisson. Note the values clumping near 1.0 on the observed axis and deviating downwards towards 0.0 on the observed axis. This is indicative of too many zeros and of the variance scaling too rapidly with the mean (resulting in some large outlying values) for the Poisson distribution.
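
We could also test for dispersion issues more formally with DHARMa's testDispersion() function:

DHARMa::testDispersion(r_pois)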

Let's try with the correct model:

r_nb2 <- dharma_residuals(s_nb2, fit_nb2, return_DHARMa = TRUE)
plot(r_nb2)
DHARMa::testZeroInflation(r_nb2)

Everything looks fine.

What about the model where we were missing a predictor?

r_nb2_miss <- dharma_residuals(s_nb2_miss, fit_nb2_miss, return_DHARMa = TRUE)
plot(r_nb2_miss)

The plot on the right shows the simulated residuals against the prediction without the random effects, which here is just an intercept. Let's try plotting the residuals against the missing predictor:

DHARMa::plotResiduals(r_nb2_miss, form = dat$a1)

We can see a trend in the residuals against 'a1' since we have missed including it in the model.
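
We could check other variables the same way. For example, plotting these residuals against the spatial coordinates (any strong pattern here would instead point to a problem with the spatial component):

DHARMa::plotResiduals(r_nb2_miss, form = dat$X)
DHARMa::plotResiduals(r_nb2_miss, form = dat$Y)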

We can also see the difference in the log likelihood or the AIC:

AIC(fit_nb2_miss, fit_nb2)

AIC also supports including the 'a1' predictor.

For help interpreting the DHARMa residual plots, see vignette("DHARMa", package="DHARMa").

The need for one-sample residuals

The approaches above used a single sample of the random effects, drawn as if they had been observed once.

The approach is described in Waagepetersen (2006) and is summarized nicely in the unexported oneSamplePosterior function within TMB. Thygesen et al. (2017) also describe this kind of residual in the context of one-sample MCMC residuals.

Here we will show why this is necessary.

We'll start by simulating some data with Gaussian observation error and spatial and spatiotemporal random effects.

set.seed(123)
predictor_dat <- data.frame(
  X = runif(1000), Y = runif(1000),
  year = rep(1:5, each = 200)
)
mesh <- make_mesh(predictor_dat, xy_cols = c("X", "Y"), cutoff = 0.1)
sim_dat <- sdmTMB_simulate(
  formula = ~ 1,
  data = predictor_dat,
  time = "year",
  mesh = mesh,
  family = gaussian(),
  range = 0.3,
  sigma_E = 0.3,
  phi = 0.1,
  sigma_O = 0.4,
  seed = 1,
  B = 0.2 # intercept
)
fit <- sdmTMB(observed ~ 1, data = sim_dat, time = "year", mesh = mesh)

If we use the empirical Bayes (EB) random effect values (the values of the random effects that maximize the log likelihood conditional on the estimated fixed effects), our residuals look off even though our model is perfectly matched to our simulated data:

set.seed(1)
r1 <- residuals(fit, type = "mle-eb")
qqnorm(r1);abline(0, 1)
ks.test(r1, pnorm)

Indeed, our test (incorrectly) rejects the null hypothesis that $r_1 \sim \operatorname{N}(0, 1)$, even though we know the residuals would come from $\operatorname{N}(0, 1)$ if calculated correctly.

If instead we returned to our single sample from the assumed MVN random effect distribution, we get the 'correct' residuals:

set.seed(1)
r2 <- residuals(fit, type = "mle-mvn")
qqnorm(r2);abline(0, 1)
ks.test(r2, pnorm)

Here, we would (correctly) fail to reject the hypothesis that $r_2 \sim \operatorname{N}(0, 1)$.

We could also sample that observation of random effects using MCMC (with the fixed effects still held at their MLEs), which relaxes our assumptions, but is much more time intensive for large models.

samp <- sdmTMBextra::predict_mle_mcmc(fit, mcmc_iter = 400, mcmc_warmup = 200)
r3 <- residuals(fit, type = "mle-mcmc", mcmc_samples = samp)
qqnorm(r3);abline(0, 1)
ks.test(r3, pnorm)

Here that gets us something similar and we would (correctly) fail to reject the hypothesis that $r_3 \sim \operatorname{N}(0, 1)$.

A similar issue applies to simulation-based quantile residuals, as implemented in the DHARMa package.

set.seed(1)
simulate(fit, nsim = 500, type = "mle-eb") |>
  dharma_residuals(fit)

Instead we can use a draw from the random effects 'posterior' assuming an MVN distribution.

set.seed(1)
simulate(fit, nsim = 500, type = "mle-mvn") |>
  dharma_residuals(fit)

And now that looks correct.

However, what happens if we were to sample the random effects with each simulation?

set.seed(1)
s <- replicate(200, simulate(fit, nsim = 1, type = "mle-mvn"), simplify = "matrix")
attr(s, "type") <- "mle-mvn"
dharma_residuals(s, fit)

We are back to something with the incorrect distribution! So, we need a single random effects sample shared across the whole set of simulations.

How much this matters depends on the ratio of observation error variance vs. random effect variance

Let's simulate data again but with large observation error (phi here, which is the Gaussian error SD in this case) and with smaller levels of random field variance (sigma_E and sigma_O):

set.seed(123)
sim_dat2 <- sdmTMB_simulate(
  formula = ~ 1,
  data = predictor_dat,
  time = "year",
  mesh = mesh,
  family = gaussian(),
  range = 0.3,
  sigma_E = 0.1, # smaller than before
  sigma_O = 0.1, # smaller than before
  phi = 0.5, # bigger than before
  seed = 1,
  B = 0.2 
)
fit2 <- sdmTMB(observed ~ 1, data = sim_dat2, time = "year", mesh = mesh)
sanity(fit2)
set.seed(1)
r1 <- residuals(fit2, type = "mle-eb")
qqnorm(r1);abline(0, 1)
ks.test(r1, pnorm)
r2 <- residuals(fit2, type = "mle-mvn")
qqnorm(r2);abline(0, 1)
ks.test(r2, pnorm)

Now it doesn't really matter, since the effect of the 'incorrect' random effect distribution is swamped by the observation error. Technically, the first set is 'wrong' and the second set is 'right', but functionally we'd come to a similar conclusion in this case.

Notes on uniform vs. normal quantile residuals

The randomized quantile residuals in residuals.sdmTMB() are returned such that they will be normal(0, 1) if the model is consistent with the data. DHARMa residuals, however, are returned as uniform(0, 1) under those same circumstances. Both are valid, and which to use is preference, but it's important to appreciate how this changes the appearance of the expected residuals.

r2 <- residuals(fit2, type = "mle-mvn")

Analytical normal(0, 1):

hist(r2)
qqnorm(r2)
abline(0, 1)
ks.test(r2, pnorm)

Analytical uniform(0, 1):

u <- pnorm(r2)
hist(u)
n <- length(u)
m <- seq_len(n) / (n + 1)
qqplot(m, u)
ks.test(u, punif)
abline(0, 1)

Simulation-based uniform(0, 1):

set.seed(1)
s <- simulate(fit2, nsim = 500, type = "mle-mvn") |>
  dharma_residuals(fit2, return_DHARMa = TRUE)
hist(s$scaledResiduals)
u <- s$scaledResiduals
m <- seq_len(length(u)) / (length(u) + 1)
qqplot(m, u)
abline(0, 1)
ks.test(u, punif)

Simulation-based normal(0, 1):

set.seed(1)
s <- simulate(fit2, nsim = 1500, type = "mle-mvn") |>
  dharma_residuals(fit2, return_DHARMa = TRUE)
u <- s$scaledResiduals
r <- qnorm(u)
qqnorm(r)
abline(0, 1)
ks.test(u, punif)

Conclusions:

The normal(0, 1) residuals are probably more familiar to most people.

The normal(0, 1) residuals put more emphasis on the tails. This is good and bad: it's easier to examine tail behaviour, but these can often look 'off' even when the model is fine (as in this example) because observations in the tails of the distribution are by definition rarely observed.

Uniform(0, 1) residuals give all data points equal visual weight and emphasize consistency of the overall distribution rather than the tails.

Either is valid, but you do need to switch your mindset about what you expect to see accordingly. For example, poor tail behaviour may look like a minor issue with uniform residuals; conversely the tails of normal(0, 1) residuals are unlikely to ever look 'perfect' without large sample sizes.
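
As a small illustration of that tail emphasis, note how uniform quantiles approaching 1 map to increasingly extreme normal quantiles, so a handful of extreme observations dominate the ends of a normal QQ plot:

qnorm(c(0.5, 0.9, 0.99, 0.999))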

How do those randomized-quantile residuals work?

Let's work through a simple example with the gamma distribution:

set.seed(1)
mu <- rep(2, 500)
phi <- 0.5
y <- rgamma(length(mu), shape = phi, scale = mu / phi)

To get quantile residuals, we transform those to uniform(0, 1) using pgamma() and then (optionally) convert those to normal(0, 1) with qnorm():

u <- pgamma(q = y, shape = phi, scale = mu / phi)
hist(u)
r <- qnorm(u)
hist(r)

This works for any observation distribution for which we can evaluate the distribution function (the CDF).
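
For example, the same recipe for a lognormal observation model would use plnorm() (another toy example unrelated to our fitted models):

set.seed(1)
y_ln <- rlnorm(500, meanlog = log(2), sdlog = 0.5)
r_ln <- qnorm(plnorm(y_ln, meanlog = log(2), sdlog = 0.5))
hist(r_ln)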

If we have integer values, we need to add randomization in an additional step. Let's work with a Poisson sample:

set.seed(1)
mu <- rep(2, 500)
y <- rpois(length(mu), mu)

The gamma example above illustrated how we can use the distribution function (there pgamma()) to take a gamma-distributed variable and turn it into a uniform-distributed variable.

Now we need to do the same with the Poisson equivalent: ppois(). The ppois(y) function gives us the cumulative probability up to and including the value y. Say we have a Poisson variable with a mean (lambda) of 5 and we have an observation with a value of 3. We can calculate the cumulative probability up to and including 3 as:

lambda <- 5
ppois(3, lambda)

I.e., that is the same as:

dpois(0, lambda) + dpois(1, lambda) + dpois(2, lambda) + dpois(3, lambda)

But, if we naively apply ppois() to the observed values we'll end up with discrete cumulative probabilities that aren't very useful for comparing against the continuous uniform.

hist(ppois(y, mu))

Instead, we need to get the cumulative probability up to the value just below the one we observed and "fill in" the gap up to the observed value with uniformly distributed samples. First, get the cumulative probability up to the value below the observed value and up to the observed value itself:

a <- ppois(y - 1, mu)
hist(a)
b <- ppois(y, mu)
hist(b)

Then we can fill in the values between these two probabilities with runif(), since our expectation at this stage is a uniform distribution:

u <- runif(n = length(y), min = a, max = b)
hist(u)

Then we optionally apply qnorm():

r <- qnorm(u)
hist(r)
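
As a quick check, these hand-built residuals should look normal(0, 1) since the data really were generated from the assumed Poisson model:

qqnorm(r)
abline(0, 1)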

How do those simulation-based residuals work?

DHARMa uses simulation-based quantile residuals. As a result, we don't need an analytical form of the distribution function. So, instead of a line like this:

u <- pgamma(q = y, shape = phi, scale = mu / phi)

We simulate from our model repeatedly, see where our observation falls within the simulated values, and get our quantile that way. For example, instead of this:

pnorm(2.2, mean = 0.5, sd = 1)

we could do this:

set.seed(1)
s <- rnorm(1e6, 0.5, 1) # equivalent of simulate()ing from our model
mean(s < 2.2) # equivalent of pnorm()
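
The same idea extends to discrete observations if we also randomize across ties, roughly as follows (a sketch of the general approach, not DHARMa's exact implementation):

set.seed(1)
y_obs <- 3
sims <- rpois(1e6, lambda = 5) # stand-in for simulate()ing from our model
# proportion of simulations below the observation, plus a random share of the
# simulations tied with it, gives a randomized uniform quantile:
u <- mean(sims < y_obs) + runif(1) * mean(sims == y_obs)
u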

References

Dunn, P.K., and Smyth, G.K. 1996. Randomized Quantile Residuals. Journal of Computational and Graphical Statistics 5(3): 236–244.

Waagepetersen, R. 2006. A Simulation-Based Goodness-of-Fit Test for Random Effects in Generalized Linear Mixed Models. Scandinavian Journal of Statistics 33(4): 721–731.

Thygesen, U.H., Albertsen, C.M., Berg, C.W., Kristensen, K., and Nielsen, A. 2017. Validation of ecological state space models using the Laplace approximation. Environ Ecol Stat 24(2): 317–339.


