CAR_INLA: Fit a (scalable) spatial Poisson mixed model to areal count...
In bigDM: Scalable Bayesian Disease Mapping Models for High-Dimensional Data

CAR_INLA

R Documentation

Fit a (scalable) spatial Poisson mixed model to areal count data, where several CAR prior distributions can be specified for the spatial random effect.

Description

Fit a spatial Poisson mixed model to areal count data. The linear predictor is modelled as

\log{r_{i}}=\alpha+\mathbf{x_i}^{'}\mathbf{\beta} + \xi_i, \quad \mbox{for} \quad i=1,\ldots,n;

where \alpha is a global intercept, \mathbf{x_i}^{'}=(x_{i1},\ldots,x_{ip}) is a p-vector of standardized covariates in the i-th area, \mathbf{\beta}=(\beta_1,\ldots,\beta_p) is the p-vector of fixed effects coefficients, and \xi_i is a spatially structured random effect. Several conditional autoregressive (CAR) prior distributions can be specified for the spatial random effect, such as the intrinsic CAR prior \insertCite{besag1991}{bigDM}, the convolution or BYM prior \insertCite{besag1991}{bigDM}, the CAR prior proposed by \insertCite{leroux1999estimation;textual}{bigDM}, and the reparameterization of the BYM model given by \insertCite{dean2001detecting;textual}{bigDM} named BYM2 \insertCite{riebler2016intuitive}{bigDM}.
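To make the difference between two of these priors concrete, the following base R sketch builds the precision matrices usually associated with the intrinsic and Leroux CAR priors in the disease mapping literature. The toy adjacency matrix and the symbols `tau` (precision), `lambda` (spatial smoothing parameter) and `R` (neighbourhood structure matrix) are assumptions made for illustration; they are not taken from this page, and the code is not the package's internal implementation.

```r
# Toy example: 4 areas on a line (1-2-3-4), binary adjacency matrix W.
W <- matrix(0, 4, 4)
W[cbind(1:3, 2:4)] <- 1
W <- W + t(W)

# Neighbourhood structure matrix R = D - W, with D = diag(number of neighbours).
R <- diag(rowSums(W)) - W

# Precision matrix of the intrinsic CAR prior (rank-deficient: rows sum to zero).
tau <- 2
Q_icar <- tau * R

# Precision matrix of the Leroux CAR prior, with smoothing parameter lambda in [0, 1).
lambda <- 0.7
Q_leroux <- tau * (lambda * R + (1 - lambda) * diag(nrow(W)))

rowSums(Q_icar)     # all zero: the intrinsic prior is improper
qr(Q_leroux)$rank   # 4: full rank whenever lambda < 1
```

With `lambda` close to 1 the Leroux precision approaches the intrinsic CAR structure, while `lambda = 0` gives independent random effects.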

If covariates are included in the model, two different approaches can be used to address the potential confounding issues between the fixed effects and the spatial random effects of the model: restricted regression and the use of orthogonality constraints. At the moment, only continuous covariates can be included in the model as potential risk factors, which are automatically standardized before fitting the model. See \insertCite{adin2021alleviating;textual}{bigDM} for further details.
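As a rough illustration of what standardizing continuous covariates means, the sketch below centres and rescales two hypothetical covariates with base R's `scale()`. This page does not state how the function standardizes internally, so the mean-zero, unit-standard-deviation convention and the covariate names are assumptions.

```r
# Hypothetical continuous covariates for 5 areas (names are illustrative only).
X <- cbind(income  = c(21.5, 30.2, 18.9, 25.0, 27.4),
           density = c(120, 950, 60, 400, 310))

# Centre each column at zero and rescale it to unit standard deviation.
X_std <- scale(X)

colMeans(X_std)       # numerically zero for every column
apply(X_std, 2, sd)   # 1 for every column
```

On this scale each coefficient in \mathbf{\beta} is read as the change in log-risk per one standard deviation of the corresponding covariate.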

Three main modelling approaches can be considered:

the usual model with a global spatial random effect whose dependence structure is based on the whole neighbourhood graph of the areal units (model="global" argument)
a Disjoint model based on a partition of the whole spatial domain where independent spatial CAR models are simultaneously fitted in each partition (model="partition" and k=0 arguments)
a modelling approach where k-order neighbours are added to each partition to avoid border effects in the Disjoint model (model="partition" and k>0 arguments).
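The role of `k` in the partition models can be sketched with a toy map. The helper `extend_partition()` below is hypothetical (it is not part of bigDM) and only illustrates the idea of enlarging a sub-domain with its neighbours up to order k, assuming a binary adjacency matrix `W` and a grouping vector `group`.

```r
# Toy map: 6 areas on a line (1-2-3-4-5-6) split into two partitions.
n <- 6
W <- matrix(0, n, n)
W[cbind(1:(n - 1), 2:n)] <- 1
W <- W + t(W)
group <- c("A", "A", "A", "B", "B", "B")

# Hypothetical helper: indices of the areas in partition `g` plus all
# areas reachable within k neighbourhood steps of that partition.
extend_partition <- function(W, group, g, k = 0) {
  inside <- group == g
  for (step in seq_len(k)) {
    # Add the areas adjacent to at least one area already in the sub-domain.
    inside <- inside | (colSums(W[inside, , drop = FALSE]) > 0)
  }
  which(inside)
}

extend_partition(W, group, "A", k = 0)  # 1 2 3      (Disjoint model)
extend_partition(W, group, "A", k = 1)  # 1 2 3 4    (1st-order neighbours)
extend_partition(W, group, "A", k = 2)  # 1 2 3 4 5  (2nd-order neighbours)
```

With k > 0 the sub-domains overlap, so some areas receive estimates from more than one submodel; combining them is the job of the merging strategy set through `merge.strategy`.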

For both the Disjoint and k-order neighbour models, parallel or distributed computation strategies can be performed to speed up computations by using the 'future' package \insertCite{bengtsson2020unifying}{bigDM}.

Inference is conducted in a fully Bayesian setting using the integrated nested Laplace approximation (INLA; \insertCite{rue2009approximate;textual}{bigDM}) technique through the R-INLA package (https://www.r-inla.org/). For the scalable model proposals \insertCite{orozco2020}{bigDM}, approximate values of the Deviance Information Criterion (DIC) and Watanabe-Akaike Information Criterion (WAIC) can also be computed.
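The following base R sketch shows how DIC and WAIC can be approximated from draws of the log-risks under a Poisson likelihood. It uses the textbook definitions and simulated stand-in draws; the data, the object names and the way the draws are generated are assumptions, and this is not the package's own computation.

```r
set.seed(1)

# Hypothetical data for 5 areas: observed and expected counts.
O <- c(12, 7, 20, 3, 9)
E <- c(10.0, 8.5, 15.2, 4.1, 9.7)

# Stand-in for posterior draws of the log-risks (rows = draws, columns = areas).
# In practice these would be sampled from the fitted model's posterior.
n.sample <- 1000
log_r <- matrix(rnorm(n.sample * length(O), mean = log(O / E), sd = 0.2),
                nrow = n.sample, byrow = TRUE)

# Pointwise Poisson log-likelihood for every draw: O_i ~ Poisson(E_i * r_i).
mu <- sweep(exp(log_r), 2, E, `*`)
loglik <- matrix(dpois(rep(O, each = n.sample), lambda = mu, log = TRUE),
                 nrow = n.sample)

# DIC = mean deviance + effective number of parameters (pD).
mean_dev <- mean(-2 * rowSums(loglik))
dev_at_mean <- -2 * sum(dpois(O, lambda = E * exp(colMeans(log_r)), log = TRUE))
pD <- mean_dev - dev_at_mean
DIC <- mean_dev + pD

# WAIC = -2 * (log pointwise predictive density - pWAIC).
lppd <- sum(log(colMeans(exp(loglik))))
pWAIC <- sum(apply(loglik, 2, var))
WAIC <- -2 * (lppd - pWAIC)

c(DIC = DIC, WAIC = WAIC)
```

The number of draws plays the same role as the `n.sample` argument: more draws reduce the Monte Carlo error of both criteria.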

The function also supports the new hybrid approximation method, which combines the Laplace method with a low-rank Variational Bayes correction to the posterior mean \insertCite{vanNiekerk2023}{bigDM}; it is enabled by setting the inla.mode="compact" argument.

Usage

CAR_INLA(
  carto = NULL,
  ID.area = NULL,
  ID.group = NULL,
  O = NULL,
  E = NULL,
  X = NULL,
  confounding = NULL,
  W = NULL,
  prior = "Leroux",
  model = "partition",
  k = 0,
  strategy = "simplified.laplace",
  scale.model = FALSE,
  PCpriors = FALSE,
  merge.strategy = "original",
  compute.intercept = NULL,
  compute.DIC = TRUE,
  n.sample = 1000,
  compute.fitted.values = FALSE,
  save.models = FALSE,
  plan = "sequential",
  workers = NULL,
  inla.mode = "classic",
  num.threads = NULL
)

Arguments

`carto`	object of class `SpatialPolygonsDataFrame` or `sf`. This object must contain at least the target variables of interest specified in the arguments `ID.area`, `O` and `E`.
`ID.area`	character; name of the variable that contains the IDs of spatial areal units.
`ID.group`	character; name of the variable that contains the IDs of the spatial partition (grouping variable). Only required if `model="partition"`.
`O`	character; name of the variable that contains the observed number of disease cases for each areal unit.
`E`	character; name of the variable that contains either the expected number of disease cases or the population at risk for each areal unit.
`X`	a character vector containing the names of the covariates within the `carto` object to be included in the model as fixed effects, or a matrix object playing the role of the fixed effects design matrix. In the latter case, the row names must match the IDs of the spatial units defined by the `ID.area` variable. If `X=NULL` (default), only a global intercept is included in the model as a fixed effect.
`confounding`	one of either `NULL`, `"restricted"` (restricted regression) or `"constraints"` (orthogonal constraints), which specifies the estimation method used to alleviate spatial confounding between fixed and random effects. If only an intercept is considered in the model (`X=NULL`), the default value `confounding=NULL` will be set. At the moment, it only works for the Global model (specified through the `model="global"` argument).
`W`	optional argument with the binary adjacency matrix of the spatial areal units. If `NULL` (default), this object is computed from the `carto` argument (two areas are considered as neighbours if they share a common border).
`prior`	one of either `"Leroux"` (default), `"intrinsic"`, `"BYM"` or `"BYM2"`, which specifies the prior distribution considered for the spatial random effect.
`model`	one of either `"global"` or `"partition"` (default), which specifies the Global model or one of the scalable model proposals (the Disjoint model or the k-order neighbourhood model, selected through the `k` argument), respectively.
`k`	numeric value with the neighbourhood order used for the partition model. Usually k=2 or 3 is enough to get good results. If k=0 (default) the Disjoint model is considered. Only required if `model="partition"`.
`strategy`	one of either `"gaussian"`, `"simplified.laplace"` (default), `"laplace"` or `"adaptive"`, which specifies the approximation strategy considered in the `inla` function.
`scale.model`	logical value (default `FALSE`); if `TRUE` then scale the models so their generalized variance is equal to 1. Note that the `"BYM2"` model is always scaled.
`PCpriors`	logical value (default `FALSE`); if `TRUE` then penalised complexity (PC) priors are used for the precision parameter of the spatial random effect. It does not work for the `"Leroux"` model.
`merge.strategy`	one of either `"mixture"` or `"original"` (default), which specifies the merging strategy to compute posterior marginal estimates of the linear predictor. See `mergeINLA` for further details.
`compute.intercept`	CAUTION! This argument is deprecated from version 0.5.2.
`compute.DIC`	logical value; if `TRUE` (default) then approximate values of the Deviance Information Criterion (DIC) and Watanabe-Akaike Information Criterion (WAIC) are computed.
`n.sample`	numeric; number of samples to generate from the posterior marginal distribution of the linear predictor when computing approximate DIC/WAIC values. Defaults to 1000.
`compute.fitted.values`	logical value (default `FALSE`); if `TRUE` transforms the posterior marginal distribution of the linear predictor to the exponential scale (risks or rates).
`save.models`	logical value (default `FALSE`); if `TRUE` then a list with all the `inla` submodels is saved in the '/temp/' folder, which can be used as input argument for the `mergeINLA` function.
`plan`	one of either `"sequential"` or `"cluster"`, which specifies the computation strategy used for model fitting using the 'future' package. If `plan="sequential"` (default) the models are fitted sequentially in the current R session (local machine). If `plan="cluster"` the models are fitted in parallel on external R sessions (local machine) or distributed across remote computing nodes.
`workers`	character or vector (default `NULL`) containing the identifications of the local or remote workers where the models are going to be processed. Only required if `plan="cluster"`.
`inla.mode`	one of either `"classic"` (default) or `"compact"`, which specifies the approximation method used by INLA. See `help(inla)` for further details.
`num.threads`	maximum number of threads the inla-program will use. See `help(inla)` for further details.
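When `X` is supplied as a design matrix, its row names have to match the area IDs. The base R sketch below checks this and aligns the rows with the order of the areal units before fitting. The IDs, the covariate name `x1` and its values are hypothetical, and whether the function reorders rows internally is not stated on this page, so the alignment is shown only as a defensive step.

```r
# Hypothetical area IDs, as stored in the variable named by `ID.area`.
ids <- c("01001", "01002", "01003", "01004")

# Design matrix supplied through `X`: row names carry the area IDs.
X <- matrix(c(0.3, -1.2, 0.8, 0.1), ncol = 1,
            dimnames = list(c("01003", "01001", "01004", "01002"), "x1"))

# Every area must appear exactly once among the row names.
stopifnot(setequal(rownames(X), ids), !anyDuplicated(rownames(X)))

# Reorder the rows so that they follow the order of the areal units.
X <- X[match(ids, rownames(X)), , drop = FALSE]
rownames(X)
```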

Details

For a full model specification and further details see the vignettes accompanying this package.

Value

This function returns an object of class inla. See the mergeINLA function for details.

References

\insertRef{adin2021alleviating}{bigDM}

\insertRef{bengtsson2020unifying}{bigDM}

\insertRef{besag1991}{bigDM}

\insertRef{dean2001detecting}{bigDM}

\insertRef{leroux1999estimation}{bigDM}

\insertRef{riebler2016intuitive}{bigDM}

\insertRef{rue2009approximate}{bigDM}

\insertRef{orozco2020}{bigDM}

\insertRef{vanNiekerk2023}{bigDM}

Examples

## Not run: 

if(require("INLA", quietly=TRUE)){

  ## Load the Spain colorectal cancer mortality data ##
  data(Carto_SpainMUN)

  ## Global model with a Leroux CAR prior distribution ##
  Global <- CAR_INLA(carto=Carto_SpainMUN, ID.area="ID", O="obs", E="exp",
                     prior="Leroux", model="global", strategy="gaussian")

  summary(Global)

  ## Disjoint model with a Leroux CAR prior distribution  ##
  ## using 4 local clusters to fit the models in parallel ##
  Disjoint <- CAR_INLA(carto=Carto_SpainMUN, ID.area="ID", ID.group="region", O="obs", E="exp",
                       prior="Leroux", model="partition", k=0, strategy="gaussian",
                       plan="cluster", workers=rep("localhost",4))
  summary(Disjoint)

  ## 1st-order neighbourhood model with a Leroux CAR prior distribution ##
  ## using 4 local clusters to fit the models in parallel               ##
  order1 <- CAR_INLA(carto=Carto_SpainMUN, ID.area="ID", ID.group="region", O="obs", E="exp",
                     prior="Leroux", model="partition", k=1, strategy="gaussian",
                     plan="cluster", workers=rep("localhost",4))
  summary(order1)

  ## 2nd-order neighbourhood model with a Leroux CAR prior distribution ##
  ## using 4 local clusters to fit the models in parallel               ##
  order2 <- CAR_INLA(carto=Carto_SpainMUN, ID.area="ID", ID.group="region", O="obs", E="exp",
                     prior="Leroux", model="partition", k=2, strategy="gaussian",
                     plan="cluster", workers=rep("localhost",4))
  summary(order2)
}

## End(Not run)

bigDM documentation built on April 3, 2025, 10:31 p.m.