binom.krige.bayes: Bayesian Posterior Simulation and Prediction for the Binomial...

Description Usage Arguments Details Value Author(s) References See Also Examples

Description

This function performs posterior simulation (by MCMC) and spatial prediction in the binomial-logit spatial model.

Usage

1
2
3
binom.krige.bayes(geodata, coords = geodata$coords, data = geodata$data, 
         units.m = "default", locations = "no", borders,
         model, prior, mcmc.input, output)

Arguments

geodata

a list containing elements coords and data as described next. Typically an object of the class "geodata" - a geoR data set. If not provided the arguments coords and data must be given instead. The list may also contain an argument units.m as described below.

coords

an n x 2 matrix, each row containing Euclidean coordinates of the n data locations. By default it takes the element coords of the argument geodata.

data

a vector with data values. By default it takes the element data of the argument geodata.

units.m

n-dimensional vector giving the number of trials for the data. By default (units.m = "default"), it takes geodata$units.m in case this exist and else a vector of 1's.

locations

an N x 2 matrix or data frame, or a list with the 2-D coordinates of the N prediction locations.

borders

optional. If a two column matrix defining a polygon is provided the prediction is performed only at locations inside this polygon.

model

a list defining the components of the model. It can take an output from model.glm.control or a list with elements as for the arguments in model.glm.control. See documentation for model.glm.control. All arguments have default values

prior

specification of priors for the model parameters. It can take an output from prior.glm.control or a list with elements as for the arguments in prior.glm.control. See documentation for prior.glm.control. ATTENTION: When phi.prior = "fixed" then phi must be specified, and when phi.prior is not "fixed" then phi.discrete must be specified. All other parameters have default values.

mcmc.input

input parameter for the MCMC algorithm. It can take an output from mcmc.control or a list with elements as for the arguments in mcmc.control. See documentation for mcmc.control.
ATTENTION: the argument S.scale must be specified, the argument phi.start must specified when prior$phi is not "fixed", while all the others have default values.

output

parameters for controlling the output. It can take an output from output.glm.control or a list with elements as for the arguments in output.glm.control. See documentation for output.glm.control.

Details

binom.krige.bayes is a function for Bayesian geostatistical inference in the binomial-logit spatial model.

The Bayesian algorithm is using a discretized version of the prior distribution for the parameter phi. This means that the prior for phi is always a proper prior.

For simulating from the posterior distribution of S given y, we use a Langevin-Hastings type algorithm. This algorithm is a Metropolis-Hastings algorithm, where the proposal distribution uses gradient information from the posterior. The algorithm is described below. For shortness of presentation, we only present the MCMC algorithm for the case where beta follows a uniform prior.

When beta follows a uniform prior and the prior for sigma^2 is a scaled inverse-chi^2 distribution, the marginalised improper density of S is

f_I(s) propto |D^T*V^{-1}*D|^{-1/2}*|V|^{-1/2}*{n.sigma*S2.sigma + s^T*(V^{-1}-V^{-1}*D*(D^T*V^{-1}*D)^{-1}*D^T*V^{-1})*s }^{-(n-p+n.sigma)/2},

where V is the correlation matrix of S. The uniform prior for sigma^2 corresponds to S2.sigma=0 and n.sigma=-2, and the reciprocal prior for sigma^2 corresponds to S2.sigma=0 and n.sigma=0.

We use the reparametrisation S = Q*Gamma, where Q is the Cholesky factorisation of V so that V=QQ^T. Posterior simulations of S are obtained by transforming MCMC simulations from the conditional distribution of Gamma given Y=y.

The log posterior density of Gamma given Y=y is

log f(gamma|y) = const(y) - 1/2 * gamma^T*(I_n - V^{-1/2}*D*(D^T*V^{-1}*D)^{-1}*D^T*V^{-1/2})*gamma +∑_i y_i*s_i -log(1+exp(s_i)),

where (s_1,...,s_n)^T = Q*gamma.

For the Langevin-Hastings update we use the gradient of the log target density,

nabla(gamma )^{trunc}= - (I_n - Q^{-1}*D*(D^T*V^{-1}*D)^{-1}*D^T*(Q^{-1})^T)*gamma + Q^T*{y_i-\exp(s_i)/(1+exp(s_i))}_{i=1}^n .

The proposal gamma' follows a multivariate normal distribution with mean vector xi(gamma)=gamma+(h/2)*nabla(gamma)^{trunc} and covariance matrix hI, where h>0 is a user-specified “proposal variance” (called S.scale; see mcmc.control).

When phi.prior is not "fixed", we update the parameter phi by a random walk Metropolis step. Here mcmc.input$phi.scale (see mcmc.control) is the proposal variance, which needs to be sufficiently large so that the algorithm easily can move between the discrete values in prior$phi.discrete (see prior.glm.control).

CONTROL FUNCTIONS

The function call includes auxiliary control functions which allows the user to specify and/or change the specification of 1) model components (using model.glm.control), 2) prior distributions (using prior.glm.control), 3) options for the MCMC algorithm (using mcmc.control), and 4) options for the output (using output.glm.control). Default values are available in most of the cases. The arguments for the control functions are described in their respective help files.

In the prediction part of the function we want to predict exp(S^*)/(1+exp(S^*)) at locations of interest. For the prediction part of the algorithm, we use the median of the predictive distribution as the predictor and 1/4 of the length of the 95 percent predictive interval as a measure of the prediction uncertainty. Below we describe the procedure for calculating these quantities.

First we perform a Bayesian Gaussian prediction with the given priors on beta and sigma^2 on each of the simulated S-“datasets” from the posterior distribution (and in case phi is not fixed, for each sampled phi value). This Gaussian prediction is done by an internal function which is an extension of krige.bayes allowing for more than one “data set”.

For calculating the probability below a threshold for the predictive distribution given the data, we first calculate this probability for each of the simulated S-“datasets”. This is done using the fact that the predictive distribution for each of the simulated S-“datasets” is a multivariate t-distribution. Afterwards the probability below a threshold is calculated by taking the empirical mean of these conditional probabilities.

Now the median and the 2.5 percent and 97.5 percent quantiles can be calculated by an iterative procedure, where first a guess of the value is made, and second this guess is checked by calculating the probability of being less than this value. In case the guess is too low, it is adjusted upwards, and vise verse.

Value

A list with the following components:

posterior

A list with results for the posterior distribution of the model parameters and the random effects at the data locations. The components are:

  • betasummary of posterior distribution for the parameter beta.

  • sigmasqsummary of the posterior distribution for the parameter sigma^2.

  • phisummary of the posterior distribution of the parameter phi.

  • simulationssample from the posterior distribution of exp(S)/(1+exp(S)) at the data locations. Returned only if keep.mcmc.sim = TRUE.

  • acc.rateThe acceptance rates.

predictive

A list with results for the predictive distribution at the prediction locations (if provided). The components are:

  • simulationsa numerical matrix. Each column contains a simulation from the predictive distribution. Returned only if sim.predict = TRUE.

  • mediana vector with the estimated median at the prediction locations.

  • uncertaintya vector with the estimated uncertainty at the prediction locations, defined as the length of the 95\% prediction interval divided by 4.

  • quantileA matrix or vector with quantile estimators.

  • probabilityA matrix or vector with probabilities below a threshold. Returned only if the argument threshold is used.

model

model components used as defined by model.glm.control.

prior

priors used for the model parameters.

mcmc.input

input parameters used for the MCMC algorithm.

.Random.seed

system random seed before running the function. Allows reproduction of results. If the .Random.seed is set to this value and the function is run again, it will produce exactly the same results.

call

the function call.

Author(s)

Ole F. Christensen OleF.Christensen@agrsci.dk,
Paulo J. Ribeiro Jr. Paulo.Ribeiro@est.ufpr.br.

References

Further information about geoRglm can be found at:
http://gbi.agrsci.dk/~ofch/geoRglm.

See Also

binom.krige for prediction with fixed parameters in the binomial-logit normal model, pois.krige.bayes for Bayesian prediction in the Poisson normal model, krige.bayes for Bayesian prediction in the Gaussian spatial model.

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
data(b50)

if(!exists(".Random.seed", envir=.GlobalEnv, inherits = FALSE)) set.seed(1234)
## Not run: 
mcmc.10 <- mcmc.control(S.scale = 0.09, n.iter = 1000, phi.scale = 0.2, 
              phi.start = 4.5)
prior.10 <- prior.glm.control(phi.discrete = seq(0.2,5,0.2))
test.10 <- binom.krige.bayes(b50, locations=t(cbind(c(2.5,3.5),c(-1,3.5),c(2.5,1.5),c(-1,1.5))),
              prior = prior.10, mcmc.input = mcmc.10)
image(test.10)
persp(test.10)

## End(Not run)

geoRglm documentation built on May 2, 2019, 4:03 p.m.