stan_esf | R Documentation |
Fit a spatial regression model using eigenvector spatial filtering (ESF).
stan_esf(
formula,
slx,
re,
data,
C,
EV = make_EV(C, nsa = nsa, threshold = threshold),
nsa = FALSE,
threshold = 0.25,
family = gaussian(),
prior = NULL,
ME = NULL,
centerx = FALSE,
censor_point,
prior_only = FALSE,
chains = 4,
iter = 2000,
refresh = 500,
keep_all = FALSE,
slim = FALSE,
drop = NULL,
pars = NULL,
control = NULL,
quiet = FALSE,
...
)
formula |
A model formula, following the R |
slx |
Formula to specify any spatially-lagged covariates. As in, |
re |
To include a varying intercept (or "random effects") term, alpha_re ~ N(0, alpha_tau) alpha_tau ~ Student_t(d.f., location, scale). |
data |
A |
C |
Spatial connectivity matrix. This will be used to calculate eigenvectors if |
EV |
A matrix of eigenvectors from any (transformed) connectivity matrix, presumably spatial or network-based (see |
nsa |
Include eigenvectors representing negative spatial autocorrelation? Defaults to |
threshold |
Eigenvectors with standardized Moran coefficient values below this |
family |
The likelihood function for the outcome variable. Current options are |
prior |
A named list of parameters for prior distributions (see
.
|
ME |
To model observational uncertainty (i.e. measurement or sampling error) in any or all of the covariates, provide a list of data as constructed by the |
centerx |
To center predictors on their mean values, use |
censor_point |
Integer value indicating the maximum censored value; this argument is for modeling censored (suppressed) outcome data, typically disease case counts or deaths. For example, the US Centers for Disease Control and Prevention censors (does not report) death counts that are nine or fewer, so if you're using CDC WONDER mortality data you could provide |
prior_only |
Draw samples from the prior distributions of parameters only. |
chains |
Number of MCMC chains to estimate. Default |
iter |
Number of samples per chain. Default |
refresh |
Stan will print the progress of the sampler every |
keep_all |
If |
slim |
If |
drop |
Provide a vector of character strings to specify the names of any parameters that you do not want MCMC samples for. Dropping parameters in this way can improve sampling speed and reduce memory usage. The following parameter vectors can potentially be dropped from ESF models:
If |
pars |
Optional; specify any additional parameters you'd like stored from the Stan model. |
control |
A named list of parameters to control the sampler's behavior. See stan for details. |
quiet |
By default, any prior distributions that have not been assigned by the user are printed to the console. If |
... |
Other arguments passed to sampling. |
Eigenvector spatial filtering (ESF) is a method for spatial regression analysis. ESF is extensively covered in Griffith et al. (2019). This function implements the methodology introduced in Donegan et al. (2020), which uses Piironen and Vehtari's (2017) regularized horseshoe prior.
By adding a spatial filter to a regression model, spatial autocorrelation patterns are shifted from the residuals to the spatial filter. ESF models take the spectral decomposition of a transformed spatial connectivity matrix, C
. The resulting eigenvectors, E
, are mutually orthogonal and uncorrelated map patterns (at various scales, 'local' to 'regional' to 'global'). The spatial filter equals E \beta_{E}
where \beta_{E}
is a vector of coefficients.
ESF decomposes the data into a global mean, \alpha
, global patterns contributed by covariates X \beta
, spatial trends E \beta_{E}
, and residual variation. Thus, for family=gaussian()
,
y \sim Gauss(\alpha + X * \beta + E \beta_{E}, \sigma).
An ESF component can be incorporated into the linear predictor of any generalized linear model. For example, using stan_esf
with family = poisson()
and adding a 'random effects' term for each spatial unit (via the re
argument) will produce a model that resembles the BYM model (combining spatially structured and spatially-unstructured components).
The spatial.geostan_fit method will return E \beta_{E}
.
The model can also be extended to the space-time domain; see shape2mat to specify a space-time connectivity matrix.
The coefficients \beta_{E}
are assigned the regularized horseshoe prior (Piironen and Vehtari, 2017), resulting in a relatively sparse model specification. In addition, numerous eigenvectors are automatically dropped because they represent trace amounts of spatial autocorrelation (this is controlled by the threshold
argument). By default, stan_esf
will drop all eigenvectors representing negative spatial autocorrelation patterns. You can change this behavior using the nsa
argument.
The ESF models can also incorporate spatially-lagged covariates, measurement/sampling error in covariates (particularly when using small area survey estimates as covariates), missing outcome data, and censored outcomes (such as arise when a disease surveillance system suppresses data for privacy reasons). For details on these options, please see the Details section in the documentation for stan_glm.
An object of class class geostan_fit
(a list) containing:
Summaries of the main parameters of interest; a data frame
Residual spatial autocorrelation as measured by the Moran coefficient.
a data frame containing the model data
A matrix of eigenvectors created with w
and geostan::make_EV
The spatial weights matrix used to construct EV
the user-provided or default family
argument used to fit the model
The model formula provided by the user (not including ESF component)
The slx
formula
A list containing re
, the random effects (varying intercepts) formula if provided, and
data
a data frame with columns id
, the grouping variable, and idx
, the index values assigned to each group.
Prior specifications.
If covariates are centered internally (centerx = TRUE
), then x_center
is a numeric vector of the values on which covariates were centered.
The ME
data list, if one was provided by the user for measurement error models.
A data frame with the name of the spatial component parameter ("esf") and method ("ESF")
an object of class stanfit
returned by rstan::stan
Connor Donegan, connor.donegan@gmail.com
Chun, Y., D. A. Griffith, M. Lee and P. Sinha (2016). Eigenvector selection with stepwise regression techniques to construct eigenvector spatial filters. Journal of Geographical Systems, 18(1), 67-85. \Sexpr[results=rd]{tools:::Rd_expr_doi("10.1007/s10109-015-0225-3")}.
Dray, S., P. Legendre & P. R. Peres-Neto (2006). Spatial modelling: a comprehensive framework for principal coordinate analysis of neighbour matrices (PCNM). Ecological Modeling, 196(3-4), 483-493.
Donegan, C., Y. Chun and A. E. Hughes (2020). Bayesian estimation of spatial filters with Moran’s Eigenvectors and hierarchical shrinkage priors. Spatial Statistics. \Sexpr[results=rd]{tools:::Rd_expr_doi("10.1016/j.spasta.2020.100450")} (open access: \Sexpr[results=rd]{tools:::Rd_expr_doi("10.31219/osf.io/fah3z")}).
Griffith, Daniel A., and P. R. Peres-Neto (2006). Spatial modeling in ecology: the flexibility of eigenfunction spatial analyses. Ecology 87(10), 2603-2613.
Griffith, D., and Y. Chun (2014). Spatial autocorrelation and spatial filtering, Handbook of Regional Science. Fischer, MM and Nijkamp, P. eds.
Griffith, D., Chun, Y. and Li, B. (2019). Spatial Regression Analysis Using Eigenvector Spatial Filtering. Elsevier.
Piironen, J and A. Vehtari (2017). Sparsity information and regularization in the horseshoe and other shrinkage priors. In Electronic Journal of Statistics, 11(2):5018-5051.
data(sentencing)
# spatial weights matrix with binary coding scheme
C <- shape2mat(sentencing, style = "B", quiet = TRUE)
# expected number of sentences
log_e <- log(sentencing$expected_sents)
# fit spatial Poisson model with ESF + unstructured 'random effects'
fit.esf <- stan_esf(sents ~ offset(log_e),
re = ~ name,
family = poisson(),
data = sentencing,
C = C,
chains = 2, iter = 800) # for speed only
# spatial diagnostics
sp_diag(fit.esf, sentencing)
# plot marginal posterior distributions of beta_ev (eigenvector coefficients)
plot(fit.esf, pars = "beta_ev")
# calculate log-standardized incidence ratios (SIR)
# # SIR = observed/exected number of cases
# in this case, prison sentences
library(ggplot2)
library(sf)
f <- fitted(fit.esf, rates = FALSE)$mean
SSR <- f / sentencing$expected_sents
log.SSR <- log( SSR, base = 2 )
# map the log-SSRs
ggplot(sentencing) +
geom_sf(aes(fill = log.SSR)) +
scale_fill_gradient2(
midpoint = 0,
name = NULL,
breaks = seq(-3, 3, by = 0.5)
) +
labs(title = "Log-Standardized Sentencing Ratios",
subtitle = "log( Fitted/Expected ), base 2"
) +
theme_void()
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.