Markov random field models are common for spatial data. Rather than specifying a joint distribution directly, a model is specified through a set of full conditional distributions for each spatial location. The R package conclique provides a way to simulate from a MRF using a Gibbs sampler based on concliques as well as implements a formal goodness-of-fit (GOF) test for model assessment. For reference, see [@kaiser2012goodness].
This vignette shows an example of how to simulate from a MRF using a conclique-based Gibbs sampler using conclique. This example is extended to looking at GOF tests in GOF statistics
Concliques are defined as sets of locations such that no location in the set is a neighbor
of any other location in the set. For example, the sets of all singular locations is a trivial set of concliques for any neighborhood structure. We can create the set of maximal concliques for a given lattice (the minimal conclique cover) using the conclique package and assign them to a grid for easy plotting.
library(conclique) N <- 6 grid <- matrix(1:(N*N), nrow = N) lattice <- lattice_4nn_torus(c(N, N)) concliques <- min_conclique_cover(lattice) grid <- assign_concliques(grid = grid, conclique_cover = concliques)
library(ggplot2) library(dplyr) library(scales) theme_blank <- function() { theme <- theme_bw() theme$line <- element_blank() theme$rect <- element_blank() theme$strip.text <- element_blank() theme$axis.text <- element_blank() theme$plot.title <- element_blank() theme$axis.title <- element_blank() return(theme) } ggplot2::theme_set(theme_blank()) set.seed(300) expand.grid(x = 1:N, y = 1:N) %>% rowwise() %>% mutate(conclique = grid[x, y]) %>% ggplot() + geom_text(aes(x = x, y = y, label = conclique))
We can use the conditional independence of the concliques to create a batch updating Gibbs sampler to simulate spatial values from this model. The conclique package comes equipped with all the functions necessary to simulate from a Gaussian MRF with a single dependence parameter, however the user can supply his own functions for simulation, see the section on extending conclique.
For this example, we will simulate from the Gaussian MRF with $\rho = 1, \kappa = 0, \eta = .24$ using both a conclique-based Gibbs sampler and a sequential Gibbs sampler to compare run time.
neighbors <- get_neighbors(lattice) params <- list(rho = 1, kappa = 0, eta = .24) inits <- matrix(0, nrow = N, ncol = N) num_sample <- 10000 ## conclique-gibbs time <- Sys.time() conclique_simulation <- run_conclique_gibbs(concliques, neighbors, inits, "gaussian_single_param", params, num_sample) conc_time <- Sys.time() - time ## sequential gibbs time <- Sys.time() sequential_simulation <- run_sequential_gibbs(neighbors, inits, "gaussian_single_param", params, num_sample) seq_time <- Sys.time() - time
In this example, the conclique-based gibbs sampler took r conc_time seconds and the sequential-based gibbs sampler took r seq_time seconds to simulate r comma(num_sample) spatial data sets of size $6 \times 6$.
concliqueObviously, simulating from a Gaussian MRF with 4-nearest neighbor structure can be accomplished with exact simulation. However, the conclique-based Gibbs method can be used for an arbitrary dependence structure and a user-defined conditional distribution.
The conclique package comes with a function to create a lattice of specified dimension with 4-nearest neighbor structure wrapped on a torus (lattice_4nn_torus). However, the user can create his own dependence structure by creating an igraph [@igraph] object with each location as a node and the dependence as the edges. The dimension of the lattice must be kept as an attribute for the object, this is accomplished using
igraph::set.graph.attribute(lattice, "dimvector", dimvec)
where dimvec is a vector storing the dimensions of the lattice, for example c(N, N).
Additionally, the user can specify a difference conditional distribution to sample the spatial data from. In order to accomplish this, the user must provide a function which takes as parameters
data A list containing two elements, sums and nums, which contain the sum of the data in each neighborhood as well as the number of locations in the neighborhood for each point in the conclique. params A named list of parameter values, that parameterize the distribution.This function must return a sampled dataset the same size as the original data.
For example, the sampler function for the single dependency parameter Gaussian model is reproduced below.
gaussian_single_param_sampler <- function(data, params) { rho <- params$rho kappa <- params$kappa eta <- params$eta mean_structure <- kappa + eta*(data$sums - data$nums*kappa) rnorm(length(mean_structure))*rho + mean_structure }
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.