In julianfaraway/brinla: Bayesian Regression with INLA

library(knitr)
opts_chunk$set(cache=FALSE,comment=NA, fig.path="figs/", warning=FALSE, message=FALSE, optipng='-o7', pngquant='--speed=1 --quality=0-50')
options(digits=5,show.signif.stars=FALSE,width=120)
knit_hooks$set(optipng = hook_optipng)
knit_hooks$set(pngquant = hook_pngquant)

See the introduction for an overview. Load the libraries:

library(ggplot2)
library(INLA)
library(brinla)

Data

Plot the data which shows the nitrogen content of samples of reeds at three different sites. The specific sites themselves are not important but we are interested in how the response might vary between sites.

ggplot(reeds,aes(site,nitrogen))+geom_point()

Standard MLE analysis

We can fit a model with a fixed effect for the intercept and a random effect for the site using lme4:

library(lme4)
mmod <- lmer(nitrogen ~ 1+(1|site), reeds)
summary(mmod)

These are REML rather than ML estimates to be precise

Default INLA

Now try INLA:

formula <- nitrogen ~ 1 + f(site, model="iid")
imod  <-  inla(formula,family="gaussian", data = reeds)

We can get the fixed effect part of the output from:

imod$summary.fixed

which is quite similar to the REML result. The random effect summary can be extracted with:

bri.hyperpar.summary(imod)

We can plot the posterior densities of the random effects with:

bri.hyperpar.plot(imod)

or separately as:

bri.hyperpar.plot(imod, together=FALSE)

y=Xb+Zu+e model

Mixed effects models are sometimes presented in a y=Xb+Zu+e form where X is the design matrix of the fixed effects and Z is the design matrix of the random effects. The lme4 package makes it easy to extract the X and Z:

Z = getME(mmod, "Z")
X = getME(mmod, "X")

We can the use INLA to fit the model in this form. Note the -1 is because X already has an intercept column.

n = nrow(reeds)
formula = y ~ -1 + X +  f(id.z, model="z",  Z=Z)
imodZ = inla(formula, data = list(y=reeds$nitrogen, id.z = 1:n, X=X))

We can get the fixed effects as:

imodZ$summary.fixed

and the random effects in terms of the SDs as

bri.hyperpar.summary(imodZ)

which is identical to the previous result. Although the two methods are equivalent here, we will probably find it easier to get the information we need from the first approach when we have more complex random effects.

Penalized Complexity Prior

In Simpson et al (2015), penalized complexity priors are proposed. This requires that we specify a scaling for the SDs of the random effects. We use the SD of the residuals of the fixed effects only model (what might be called the base model in the paper) to provide this scaling.

sdres <- sd(reeds$nitrogen)
pcprior <- list(prec = list(prior="pc.prec", param = c(3*sdres,0.01)))
formula <- nitrogen ~ f(site, model="iid", hyper = pcprior)
imod <- inla(formula, family="gaussian", data=reeds)

Fixed effects

imod$summary.fixed

random effects

bri.hyperpar.summary(imod)

which can be plotted as:

bri.hyperpar.plot(imod)

This results in a larger posterior for the site SD. Hence the PC prior is putting more weight on larger SDs.

Random effects

We can plot the random effect posterior densities for the three sites:

reff <- imod$marginals.random
x <- seq(-1.5,1.5,len=100)
d1 <- inla.dmarginal(x, reff$site[[1]])
d2 <- inla.dmarginal(x, reff$site[[2]])
d3 <- inla.dmarginal(x, reff$site[[3]])
rdf <- data.frame(bright=c(x,x,x),density=c(d1,d2,d3),site=gl(3,100,labels=LETTERS[1:3]))
ggplot(rdf, aes(x=bright, y=density, color=site))+geom_line()

What is the probability that a sample from site B would exceed the nitrogen in a sample from site C? We can answer this question with a sample from the posterior:

imod <- inla(formula, family="gaussian", data=reeds, control.compute=list(config = TRUE))
psamp = inla.posterior.sample(n=1000, imod)
lvsamp = t(sapply(psamp, function(x) x$latent))
colnames(lvsamp) = make.names(row.names(psamp[[1]]$latent))
plot(site.2 ~ site.3, lvsamp)
mean(lvsamp[,'site.2'] > lvsamp[,'site.3'])