# spMvGLM: Function for fitting multivariate Bayesian generalized linear... In spBayes: Univariate and Multivariate Spatial-Temporal Modeling

## Description

The function `spMvGLM` fits multivariate Bayesian generalized linear spatial regression models. Given a set of knots, `spMvGLM` will also fit a predictive process model (see references below).

## Usage

 ```1 2 3 4``` ```spMvGLM(formula, family="binomial", weights, data = parent.frame(), coords, knots, starting, tuning, priors, cov.model, amcmc, n.samples, verbose=TRUE, n.report=100, ...) ```

## Arguments

 `formula` a list of q symbolic regression model descriptions to be fit. See example below. `family` currently only supports `binomial` and `poisson` data using the logit and log link functions, respectively. `weights` an optional n x q matrix of weights to be used in the fitting process. The order of the columns correspond to the univariate models in the formula list. Weights correspond to number of trials and offset for each location for the `binomial` and `poisson` family, respectively. `data` an optional data frame containing the variables in the model. If not found in data, the variables are taken from `environment(formula)`, typically the environment from which `spMvGLM` is called. `coords` an n x 2 matrix of the observation coordinates in R^2 (e.g., easting and northing). `knots` either a m x 2 matrix of the predictive process knot coordinates in R^2 (e.g., easting and northing) or a vector of length two or three with the first and second elements recording the number of columns and rows in the desired knot grid. The third, optional, element sets the offset of the outermost knots from the extent of the `coords`. `starting` a list with each tag corresponding to a parameter name. Valid tags are `beta`, `A`, `phi`, `nu`, and `w`. The value portion of each tag is a vector that holds the parameter's starting values and are of length p for `beta` (where p is the total number of regression coefficients in the multivariate model), q(q+1)/2 for `A`, and q for `phi`, and `nu`. Here, `A` holds the the lower-triangle elements in column major ordering of the Cholesky square root of the spatial cross-covariance matrix. If the predictive process is used then `w` must be of length qm; otherwise, it must be of length qn. Alternatively, `w` can be set as a scalar, in which case the value is repeated. `tuning` a list with tags `beta`, `A`, `phi`, `nu`, and `w`. The value portion of each tag defines the variance of the Metropolis sampler Normal proposal distribution. The value portion of these tags is of length p for `beta`, q(q+1)/2 for `A`, and q for `phi`, and `nu`. Here, `A` holds the tuning values corresponding to the lower-triangle elements in column major ordering of the Cholesky square root of the spatial cross-covariance matrix. If the predictive process is used then `w` must be of length qm; otherwise, it must be of length qn. Alternatively, `w` can be set as a scalar, in which case the value is repeated. The tuning value for `beta` can be a vector of length p or, if an adaptive MCMC is not used, i.e., `amcmc` is not specified, the lower-triangle of the pxp Cholesky square-root of the desired proposal covariance matrix. `priors` a list with each tag corresponding to a parameter name. Valid tags are `beta.flat`, `beta.norm`, `K.iw`, `phi.unif`, and `nu.unif`. If the regression coefficients are each assumed to follow a Normal distribution, i.e., `beta.norm`, then mean and variance hyperparameters are passed as the first and second list elements, respectively. If `beta` is assumed flat then no arguments are passed. The default is a flat prior. The spatial cross-covariance matrix `K` is assumed to follow an inverse-Wishart distribution, whereas the spatial decay `phi` and smoothness `nu` parameters are assumed to follow Uniform distributions. The hyperparameters of the inverse-Wishart are passed as a list of length two, with the first and second elements corresponding to the df and qxq scale matrix, respectively. The hyperparameters of the Uniform are also passed as a list of vectors with the first and second list elements corresponding to the lower and upper support, respectively. `cov.model` a quoted keyword that specifies the covariance function used to model the spatial dependence structure among the observations. Supported covariance model key words are: `"exponential"`, `"matern"`, `"spherical"`, and `"gaussian"`. See below for details. `amcmc` a list with tags `n.batch`, `batch.length`, and `accept.rate`. Specifying this argument invokes an adaptive MCMC sampler see Roberts and Rosenthal (2007) for an explanation. `n.samples` the number of MCMC iterations. This argument is ignored if `amcmc` is specified. `verbose` if `TRUE`, model specification and progress of the sampler is printed to the screen. Otherwise, nothing is printed to the screen. `n.report` the interval to report Metropolis sampler acceptance and MCMC progress. `...` currently no additional arguments.

## Details

If a `binomial` model is specified the response vector is the number of successful trials at each location and `weights` is the total number of trials at each location.

For a `poisson` specification, the `weights` vector is the count offset, e.g., population, at each location. This differs from the `glm` `offset` argument which is passed as the log of this value.

A non-spatial model is fit when `coords` is not specified. See example below.

## Value

An object of class `spMvGLM`, which is a list with the following tags:

 `coords` the n x 2 matrix specified by `coords`. `knot.coords` the m x 2 matrix as specified by `knots`. `p.beta.theta.samples` a `coda` object of posterior samples for the defined parameters. `acceptance` the Metropolis sampler acceptance rate. If `amcmc` is used then this will be a matrix of each parameter's acceptance rate at the end of each batch. Otherwise, the sampler is a Metropolis with a joint proposal of all parameters. `acceptance.w` if this is a non-predictive process model and `amcmc` is used then this will be a matrix of the Metropolis sampler acceptance rate for each location's spatial random effect. `acceptance.w.knots` if this is a predictive process model and `amcmc` is used then this will be a matrix of the Metropolis sampler acceptance rate for each knot's spatial random effect. `p.w.knots.samples` a matrix that holds samples from the posterior distribution of the knots' spatial random effects. The rows of this matrix correspond to the q x m knot locations and the columns are the posterior samples. This is only returned if a predictive process model is used. `p.w.samples` a matrix that holds samples from the posterior distribution of the locations' spatial random effects. The rows of this matrix correspond to the q x n point observations and the columns are the posterior samples.

The return object might include additional data used for subsequent prediction and/or model fit evaluation.

## Author(s)

Andrew O. Finley [email protected],
Sudipto Banerjee [email protected]

## References

Finley, A.O., S. Banerjee, and R.E. McRoberts. (2008) A Bayesian approach to quantifying uncertainty in multi-source forest area estimates. Environmental and Ecological Statistics, 15:241–258.

Banerjee, S., A.E. Gelfand, A.O. Finley, and H. Sang. (2008) Gaussian Predictive Process Models for Large Spatial Datasets. Journal of the Royal Statistical Society Series B, 70:825–848.

Finley, A.O., H. Sang, S. Banerjee, and A.E. Gelfand. (2009) Improving the performance of predictive process modeling for large datasets. Computational Statistics and Data Analysis, 53:2873-2884.

Finley, A.O., S. Banerjee, and A.E. Gelfand. (2015) spBayes for large univariate and multivariate point-referenced spatio-temporal data models. Journal of Statistical Software, 63:1–28. http://www.jstatsoft.org/v63/i13.

Banerjee, S., Carlin, B.P., and Gelfand, A.E. (2004). Hierarchical modeling and analysis for spatial data. Chapman and Hall/CRC Press, Boca Raton, Fla.

Roberts G.O. and Rosenthal J.S. (2006) Examples of Adaptive MCMC. http://probability.ca/jeff/ftpdir/adaptex.pdf Preprint.

`spGLM`
 ``` 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118``` ```## Not run: library(MBA) ##Some useful functions rmvn <- function(n, mu=0, V = matrix(1)){ p <- length(mu) if(any(is.na(match(dim(V),p)))){stop("Dimension problem!")} D <- chol(V) t(matrix(rnorm(n*p), ncol=p)%*%D + rep(mu,rep(n,p))) } set.seed(1) ##Generate some data n <- 25 ##number of locations q <- 2 ##number of outcomes at each location nltr <- q*(q+1)/2 ##number of triangular elements in the cross-covariance matrix coords <- cbind(runif(n,0,1), runif(n,0,1)) ##Parameters for the bivariate spatial random effects theta <- rep(3/0.5,q) A <- matrix(0,q,q) A[lower.tri(A,TRUE)] <- c(1,-1,0.25) K <- A%*%t(A) Psi <- diag(0,q) C <- mkSpCov(coords, K, Psi, theta, cov.model="exponential") w <- rmvn(1, rep(0,nrow(C)), C) w.1 <- w[seq(1,length(w),q)] w.2 <- w[seq(2,length(w),q)] ##Covariate portion of the mean x.1 <- cbind(1, rnorm(n)) x.2 <- cbind(1, rnorm(n)) x <- mkMvX(list(x.1, x.2)) B.1 <- c(1,-1) B.2 <- c(-1,1) B <- c(B.1, B.2) weight <- 10 ##i.e., trials p <- 1/(1+exp(-(x%*%B+w))) y <- rbinom(n*q, size=rep(weight,n*q), prob=p) y.1 <- y[seq(1,length(y),q)] y.2 <- y[seq(2,length(y),q)] ##Call spMvLM fit <- glm((y/weight)~x-1, weights=rep(weight, n*q), family="binomial") beta.starting <- coefficients(fit) beta.tuning <- t(chol(vcov(fit))) A.starting <- diag(1,q)[lower.tri(diag(1,q), TRUE)] n.batch <- 100 batch.length <- 50 n.samples <- n.batch*batch.length starting <- list("beta"=beta.starting, "phi"=rep(3/0.5,q), "A"=A.starting, "w"=0) tuning <- list("beta"=beta.tuning, "phi"=rep(1,q), "A"=rep(0.1,length(A.starting)), "w"=0.5) priors <- list("beta.Flat", "phi.Unif"=list(rep(3/0.75,q), rep(3/0.25,q)), "K.IW"=list(q+1, diag(0.1,q))) m.1 <- spMvGLM(list(y.1~x.1-1, y.2~x.2-1), coords=coords, weights=matrix(weight,n,q), starting=starting, tuning=tuning, priors=priors, amcmc=list("n.batch"=n.batch,"batch.length"=batch.length,"accept.rate"=0.43), cov.model="exponential", n.report=25) burn.in <- 0.75*n.samples sub.samps <- burn.in:n.samples print(summary(window(m.1\$p.beta.theta.samples, start=burn.in))\$quantiles[,c(3,1,5)]) beta.hat <- t(m.1\$p.beta.theta.samples[sub.samps,1:length(B)]) w.hat <- m.1\$p.w.samples[,sub.samps] p.hat <- 1/(1+exp(-(x%*%beta.hat+w.hat))) y.hat <- apply(p.hat, 2, function(x){rbinom(n*q, size=rep(weight, n*q), prob=p)}) y.hat.mu <- apply(y.hat, 1, mean) ##Unstack to get each response variable fitted values y.hat.mu.1 <- y.hat.mu[seq(1,length(y.hat.mu),q)] y.hat.mu.2 <- y.hat.mu[seq(2,length(y.hat.mu),q)] ##Take a look par(mfrow=c(2,2)) surf <- mba.surf(cbind(coords,y.1),no.X=100, no.Y=100, extend=TRUE)\$xyz.est image(surf, main="Observed y.1 positive trials") contour(surf, add=TRUE) points(coords) zlim <- range(surf[["z"]], na.rm=TRUE) surf <- mba.surf(cbind(coords,y.hat.mu.1),no.X=100, no.Y=100, extend=TRUE)\$xyz.est image(surf, zlim=zlim, main="Fitted y.1 positive trials") contour(surf, add=TRUE) points(coords) surf <- mba.surf(cbind(coords,y.2),no.X=100, no.Y=100, extend=TRUE)\$xyz.est image(surf, main="Observed y.2 positive trials") contour(surf, add=TRUE) points(coords) zlim <- range(surf[["z"]], na.rm=TRUE) surf <- mba.surf(cbind(coords,y.hat.mu.2),no.X=100, no.Y=100, extend=TRUE)\$xyz.est image(surf, zlim=zlim, main="Fitted y.2 positive trials") contour(surf, add=TRUE) points(coords) ## End(Not run) ```