variog.diagnostic.glgm: Variogram-based validation for generalized linear...

Description

This function performs model validation for generalized linear geostatistical models (Binomial and Poisson) using Monte Carlo methods based on the variogram.

Usage

variog.diagnostic.glgm(
  object,
  n.sim = 200,
  uvec = NULL,
  plot.results = TRUE,
  which.test = "both"
)

Arguments

object

an object of class "PrevMap" obtained as an output from binomial.logistic.MCML and poisson.log.MCML.

n.sim

integer indicating the number of simulations used for the variogram-based diagnostics. Default is n.sim=200.

uvec

a vector with values used to define the variogram binning. If uvec=NULL, then uvec is set to seq(MIN_DIST,(MAX_DIST-MIN_DIST)/2,length=15).
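As an illustration of the default binning, the sketch below evaluates the expression above in base R. It assumes that MIN_DIST and MAX_DIST are the smallest and largest pairwise distances between the sampled locations, which this page does not state explicitly; the coordinates are simulated and purely hypothetical.

```r
# Hypothetical coordinates for illustration only (not taken from PrevMap).
set.seed(1)
coords <- cbind(x = runif(50), y = runif(50))

# Assumption: MIN_DIST and MAX_DIST are the smallest and largest
# pairwise distances between the sampled locations.
d <- as.numeric(dist(coords))
min_dist <- min(d)
max_dist <- max(d)

# Bin limits following the expression quoted for uvec = NULL.
uvec <- seq(min_dist, (max_dist - min_dist) / 2, length.out = 15)

# 15 limits define length(uvec) - 1 = 14 distance bins.
n_bins <- length(uvec) - 1
```

A user-supplied uvec works the same way: its consecutive values are the limits of the distance bins, so the returned vectors have length length(uvec)-1.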

plot.results

if plot.results=TRUE, a plot is returned showing the results for the selected test(s) for spatial correlation. By default plot.results=TRUE.

which.test

a character specifying which test for residual spatial correlation is to be performed: "variogram", "test statistic" or "both". The default is which.test="both". See 'Details.'

Details

The function takes as an input, through the argument object, a fitted generalized linear geostatistical model for an outcome Y_i, with linear predictor

η_i=d_i'β+S(x_i)+Z_i

where d_i is a vector of covariates which are specified through formula, S(x_i) is a spatial Gaussian process and the Z_i are assumed to be zero-mean Gaussian. The model validation is performed on the adopted stationary and isotropic Matern covariance function used for S(x_i). More specifically, the function allows users to select either of the following validation procedures.
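To make the linear predictor concrete, the following base R sketch simulates η_i for a Binomial outcome with a logistic link. All parameter values, the single covariate and the number of trials are hypothetical. The Matern shape parameter is fixed at 0.5, for which the Matern correlation reduces to the exponential form exp(-u/phi). This is not how PrevMap simulates internally; it only spells out the three terms of the equation above.

```r
set.seed(123)
n <- 100
coords <- cbind(runif(n), runif(n))   # hypothetical locations x_i
d_mat <- cbind(1, rnorm(n))           # d_i: intercept plus one covariate
beta <- c(-1, 0.5)                    # hypothetical regression coefficients

# Hypothetical covariance parameters: variance of S, scale, variance of Z.
sigma2 <- 1
phi <- 0.2
tau2 <- 0.1

# Matern covariance with shape 0.5, i.e. exponential correlation.
D <- as.matrix(dist(coords))
Sigma <- sigma2 * exp(-D / phi)

# S(x_i): zero-mean Gaussian process drawn via the Cholesky factor.
S <- as.numeric(t(chol(Sigma)) %*% rnorm(n))

# Z_i: independent zero-mean Gaussian noise.
Z <- rnorm(n, mean = 0, sd = sqrt(tau2))

# Linear predictor: eta_i = d_i' beta + S(x_i) + Z_i
eta <- as.numeric(d_mat %*% beta) + S + Z

# Binomial outcome with logistic link and m trials per location.
m <- rep(20, n)
y <- rbinom(n, size = m, prob = plogis(eta))
```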

Variogram-based graphical validation

This graphical diagnostic is performed by setting which.test="both" or which.test="variogram". The output is a set of 95% confidence intervals (see lower.lim and upper.lim below) that are generated under the assumption that the fitted model did generate the analysed data-set. This validation procedure proceeds through the following steps.

1. Obtain the mean, say \hat{Z}_i, of the Z_i conditioned on the data Y_i and by setting S(x_i)=0 in the equation above.

2. Compute the empirical variogram using \hat{Z}_i

3. Simulate n.sim data-sets under the fitted geostatistical model.

4. For each of the simulated data-sets, obtain \hat{Z}_i as in Step 1 and then compute the empirical variogram based on the resulting \hat{Z}_i.

5. From the n.sim variograms obtained in the previous step, compute the 95% confidence interval for each distance bin.

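If the observed variogram (obs.variogram below), based on the \hat{Z}_i from Step 2, falls within the 95% confidence intervals, we conclude that there is no evidence against the fitted spatial correlation model; if, instead, it partly falls outside the 95% confidence intervals, we conclude that the fitted model does not properly capture the spatial correlation in the data.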
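The envelope construction in Steps 2 to 5 can be sketched in base R as follows. This is a simplified illustration, not the package's implementation. The helper emp_variog is hypothetical. Independent Gaussian draws stand in both for the \hat{Z}_i of Step 1 and for the values obtained from each simulated data-set. The limits are taken as pointwise 2.5% and 97.5% quantiles, which is an assumption about how the 95% limits are formed.

```r
# Hypothetical helper: empirical variogram as half the mean squared
# difference between values, for pairs whose distance falls in each bin.
emp_variog <- function(z, coords, uvec) {
  d <- as.numeric(dist(coords))        # pairwise distances
  sq <- as.numeric(dist(z))^2 / 2      # half squared differences, same pair order
  bin <- cut(d, breaks = uvec, include.lowest = TRUE)
  list(v = as.numeric(tapply(sq, bin, mean)),
       n = as.numeric(table(bin)))
}

set.seed(42)
n <- 80
n_sim <- 200
coords <- cbind(runif(n), runif(n))    # hypothetical locations
uvec <- seq(0, 0.5, length.out = 11)   # hypothetical bin limits

# Stand-in for Steps 1-2: "observed" Z-hat values and their variogram.
z_hat <- rnorm(n)
obs <- emp_variog(z_hat, coords, uvec)

# Stand-in for Steps 3-4: one empirical variogram per simulated data-set.
sim_v <- replicate(n_sim, emp_variog(rnorm(n), coords, uvec)$v)

# Step 5 (assumed form): pointwise quantiles across the simulations.
lower_lim <- apply(sim_v, 1, quantile, probs = 0.025)
upper_lim <- apply(sim_v, 1, quantile, probs = 0.975)

# Bins where the observed variogram leaves the envelope.
outside <- obs$v < lower_lim | obs$v > upper_lim
```

In this sketch, any TRUE entry in outside marks a distance bin where the observed variogram falls outside the simulated limits.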

Test for suitability of the adopted correlation function

This diagnostic test is performed if which.test="both" or which.test="test statistic". Let v_{E}(B) and v_{T}(B) denote the empirical and theoretical variograms based on \hat{Z}_i for the distance bin B. The test statistic used for testing residual spatial correlation is

T = ∑_{B} N(B) \{v_{E}(B)-v_{T}(B)\}

where N(B) is the number of pairs of data-points falling within the distance bin B (n.bins below).

To obtain the distribution of the test statistic T under the null hypothesis that the fitted model did generate the analysed data-set, we use the simulated empirical variograms obtained in Step 4 of the iterative procedure described in "Variogram-based graphical validation." The p-value for the test of suitability of the fitted spatial correlation function is then computed by taking the proportion of simulated values for T that are larger than the value of T based on the original \hat{Z}_i in Step 1.
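The Monte Carlo p-value described above can be sketched in base R. The pair counts, theoretical variogram and empirical variograms below are hypothetical toy values rather than output from a fitted model, and the function T_stat is introduced here only to mirror the formula for T as written on this page.

```r
set.seed(7)
n_bins <- 10
n_sim <- 500
N_B <- rep(50, n_bins)   # hypothetical pair counts N(B)
v_T <- rep(1, n_bins)    # hypothetical theoretical variogram v_T(B)

# Statistic as written above: T = sum over B of N(B) * {v_E(B) - v_T(B)}
T_stat <- function(v_E, v_T, N_B) sum(N_B * (v_E - v_T))

# Hypothetical observed empirical variogram.
v_E_obs <- v_T + rnorm(n_bins, sd = 0.1)

# Hypothetical simulated empirical variograms, one per column.
v_E_sim <- matrix(v_T + rnorm(n_bins * n_sim, sd = 0.1), nrow = n_bins)

T_obs <- T_stat(v_E_obs, v_T, N_B)
T_sim <- apply(v_E_sim, 2, T_stat, v_T = v_T, N_B = N_B)

# Proportion of simulated statistics larger than the observed one.
p_value <- mean(T_sim > T_obs)
```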

Value

An object of class "PrevMap.diagnostic" which is a list containing the following components:

obs.variogram: a vector of length length(uvec)-1 containing the values of the variogram for each of the distance bins defined through uvec.

distance.bins: a vector of length length(uvec)-1 containing the average distance within each of the distance bins defined through uvec.

n.bins: a vector of length length(uvec)-1 containing the number of pairs of data-points falling within each distance bin.

lower.lim: (available only if which.test="both" or which.test="variogram") a vector of length length(uvec)-1 containing the lower limits of the 95% confidence intervals, generated under the assumption that the fitted model did generate the analysed data-set, at each of the distance bins defined through uvec.

upper.lim: (available only if which.test="both" or which.test="variogram") a vector of length length(uvec)-1 containing the upper limits of the 95% confidence intervals, generated under the assumption that the fitted model did generate the analysed data-set, at each of the distance bins defined through uvec.

mode.rand.effects: the predictive mode of the random effects from the fitted non-spatial generalized linear mixed model.

p.value: (available only if which.test="both" or which.test="test statistic") p-value of the test for residual spatial correlation.

lse.variogram: (available only if lse.variogram=TRUE) a vector of length length(uvec)-1 containing the values of the Matern variogram estimated via a weighted least squares fit.


PrevMap documentation built on Oct. 7, 2021, 5:07 p.m.