library(knitr) opts_chunk$set(cache=FALSE,comment=NA, fig.path="figs/", warning=FALSE, message=FALSE, optipng='-o7', pngquant='--speed=1 --quality=0-50') options(digits=5,show.signif.stars=FALSE,width=120) knit_hooks$set(optipng = hook_optipng) knit_hooks$set(pngquant = hook_pngquant)
See the introduction for an overview. Load the libraries:
library(ggplot2) library(INLA) library(brinla)
Plot the data which shows the nitrogen content of samples of reeds at three different sites. The specific sites themselves are not important but we are interested in how the response might vary between sites.
ggplot(reeds,aes(site,nitrogen))+geom_point()
We can fit a model with a fixed effect for the intercept and a random effect for the site using lme4
:
library(lme4) mmod <- lmer(nitrogen ~ 1+(1|site), reeds) summary(mmod)
These are REML rather than ML estimates to be precise
Now try INLA:
formula <- nitrogen ~ 1 + f(site, model="iid") imod <- inla(formula,family="gaussian", data = reeds)
We can get the fixed effect part of the output from:
imod$summary.fixed
which is quite similar to the REML result. The random effect summary can be extracted with:
bri.hyperpar.summary(imod)
We can plot the posterior densities of the random effects with:
bri.hyperpar.plot(imod)
or separately as:
bri.hyperpar.plot(imod, together=FALSE)
Mixed effects models are sometimes presented in a y=Xb+Zu+e form where X is the design matrix of
the fixed effects and Z is the design matrix of the random effects. The lme4
package makes it easy
to extract the X and Z:
Z = getME(mmod, "Z") X = getME(mmod, "X")
We can the use INLA to fit the model in this form. Note the -1
is because X already has
an intercept column.
n = nrow(reeds) formula = y ~ -1 + X + f(id.z, model="z", Z=Z) imodZ = inla(formula, data = list(y=reeds$nitrogen, id.z = 1:n, X=X))
We can get the fixed effects as:
imodZ$summary.fixed
and the random effects in terms of the SDs as
bri.hyperpar.summary(imodZ)
which is identical to the previous result. Although the two methods are equivalent here, we will probably find it easier to get the information we need from the first approach when we have more complex random effects.
In Simpson et al (2015), penalized complexity priors are proposed. This requires that we specify a scaling for the SDs of the random effects. We use the SD of the residuals of the fixed effects only model (what might be called the base model in the paper) to provide this scaling.
sdres <- sd(reeds$nitrogen) pcprior <- list(prec = list(prior="pc.prec", param = c(3*sdres,0.01))) formula <- nitrogen ~ f(site, model="iid", hyper = pcprior) imod <- inla(formula, family="gaussian", data=reeds)
Fixed effects
imod$summary.fixed
random effects
bri.hyperpar.summary(imod)
which can be plotted as:
bri.hyperpar.plot(imod)
This results in a larger posterior for the site SD. Hence the PC prior is putting more weight on larger SDs.
We can plot the random effect posterior densities for the three sites:
reff <- imod$marginals.random x <- seq(-1.5,1.5,len=100) d1 <- inla.dmarginal(x, reff$site[[1]]) d2 <- inla.dmarginal(x, reff$site[[2]]) d3 <- inla.dmarginal(x, reff$site[[3]]) rdf <- data.frame(bright=c(x,x,x),density=c(d1,d2,d3),site=gl(3,100,labels=LETTERS[1:3])) ggplot(rdf, aes(x=bright, y=density, color=site))+geom_line()
What is the probability that a sample from site B would exceed the nitrogen in a sample from site C? We can answer this question with a sample from the posterior:
imod <- inla(formula, family="gaussian", data=reeds, control.compute=list(config = TRUE)) psamp = inla.posterior.sample(n=1000, imod) lvsamp = t(sapply(psamp, function(x) x$latent)) colnames(lvsamp) = make.names(row.names(psamp[[1]]$latent)) plot(site.2 ~ site.3, lvsamp) mean(lvsamp[,'site.2'] > lvsamp[,'site.3'])
Although the site posteriors overlap in the figure, the positive correlation between the two variables means that the chance of B>C is negligible.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.