binomial.logistic.MCML: Monte Carlo Maximum Likelihood estimation for the binomial...



Description

This function performs Monte Carlo maximum likelihood (MCML) estimation for the geostatistical binomial logistic model.

Usage

binomial.logistic.MCML(
  formula,
  units.m,
  coords,
  times = NULL,
  data,
  ID.coords = NULL,
  par0,
  control.mcmc,
  kappa,
  kappa.t = NULL,
  sst.model = NULL,
  fixed.rel.nugget = NULL,
  start.cov.pars,
  method = "BFGS",
  low.rank = FALSE,
  SPDE = FALSE,
  knots = NULL,
  mesh = NULL,
  messages = TRUE,
  plot.correlogram = TRUE
)
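
As a rough sketch of how the arguments fit together, the call below follows the pattern of the package's own example on the simulated data set data_sim. It assumes that PrevMap is installed and that data_sim contains the columns y, units.m, x1 and x2; the MCMC settings, the subset size and the starting values are illustrative choices, not recommendations.

```r
fit <- NULL
if (requireNamespace("PrevMap", quietly = TRUE)) {
  library(PrevMap)
  set.seed(1234)

  # Assumption: data_sim ships with PrevMap and has columns y, units.m, x1, x2
  data(data_sim)
  n.subset <- 50
  dat <- data_sim[sample(seq_len(nrow(data_sim)), n.subset), ]

  # Tuning parameters of the MCMC algorithm (deliberately short run)
  mcmc.ctr <- control.mcmc.MCML(n.sim = 1000, burnin = 200, thin = 8,
                                h = 1.65 / (n.subset^(1 / 6)))

  # Importance sampling parameters in the order c(beta, sigma2, phi);
  # tau2 is omitted because the nugget is fixed at zero below
  par0 <- c(0, 1, 0.15)

  fit <- binomial.logistic.MCML(
    formula = y ~ 1,
    units.m = ~ units.m,
    coords = ~ x1 + x2,
    data = dat,
    par0 = par0,
    control.mcmc = mcmc.ctr,
    kappa = 2,
    fixed.rel.nugget = 0,
    start.cov.pars = 0.15,   # starting value for phi only
    method = "BFGS",
    messages = FALSE,
    plot.correlogram = FALSE
  )
  print(summary(fit))
}
```

With fixed.rel.nugget = 0 the relative nugget variance is not estimated, so start.cov.pars holds a single starting value for phi.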

Arguments

formula

an object of class formula (or one that can be coerced to that class): a symbolic description of the model to be fitted.

units.m

an object of class formula indicating the binomial denominators in the data.

coords

an object of class formula indicating the spatial coordinates in the data.

times

an object of class formula indicating the times in the data, used in the spatio-temporal model.

data

a data frame containing the variables in the model.

ID.coords

vector of ID values for the unique set of spatial coordinates obtained from create.ID.coords. These must be provided if, for example, spatial random effects are defined at household level but some of the covariates are at individual level. Warning: the household coordinates must all be distinct; if they are not, see jitterDupCoords. Default is NULL.

par0

parameters of the importance sampling distribution: these should be given in the following order c(beta,sigma2,phi,tau2), where beta are the regression coefficients, sigma2 is the variance of the Gaussian process, phi is the scale parameter of the spatial correlation and tau2 is the variance of the nugget effect (if included in the model).

control.mcmc

output from control.mcmc.MCML.

kappa

fixed value for the shape parameter of the Matern covariance function.

kappa.t

fixed value for the shape parameter of the Matern covariance function in the separable double-Matern spatio-temporal model.

sst.model

a character value that specifies the spatio-temporal correlation function.

  • sst.model="DM" separable double-Matern.

  • sst.model="GN1" separable correlation functions. Temporal correlation: f(x) = 1/(1+x/ψ); Spatial correlation: Matern function.

Default is sst.model=NULL, which is used when a purely spatial model is fitted.
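
For reference, the temporal correlation of the "GN1" model can be written as a one-line R function. This is an illustrative sketch only: the function name gn1_temporal and its argument names are hypothetical and are not part of PrevMap.

```r
# Temporal correlation f(x) = 1 / (1 + x / psi) at time lag u >= 0
# (hypothetical helper, not a PrevMap function)
gn1_temporal <- function(u, psi) {
  stopifnot(all(u >= 0), psi > 0)
  1 / (1 + u / psi)
}

# Correlation at lags 0, 1, 2 and 4 when psi = 2
gn1_temporal(c(0, 1, 2, 4), psi = 2)
```

The correlation equals 1 at lag zero and halves at a lag equal to psi.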

fixed.rel.nugget

fixed value for the relative variance of the nugget effect; fixed.rel.nugget=NULL if this should be included in the estimation. Default is fixed.rel.nugget=NULL.

start.cov.pars

a vector of length two with elements corresponding to the starting values of phi and the relative variance of the nugget effect nu2, respectively, that are used in the optimization algorithm. If nu2 is fixed through fixed.rel.nugget, then start.cov.pars represents the starting value for phi only.

method

method of optimization. If method="BFGS" the maxBFGS function is used; if method="nlminb" the nlminb function is used. Default is method="BFGS".

low.rank

logical; if low.rank=TRUE a low-rank approximation of the Gaussian spatial process is used when fitting the model. Default is low.rank=FALSE.

SPDE

logical; if SPDE=TRUE the SPDE approximation for the Gaussian spatial model is used. Default is SPDE=FALSE.

knots

if low.rank=TRUE, knots is a matrix of spatial knots that are used in the low-rank approximation. Default is knots=NULL.

mesh

an object obtained as the result of a call to the function inla.mesh.2d.

messages

logical; if messages=TRUE then status messages are printed on the screen (or output device) while the function is running. Default is messages=TRUE.

plot.correlogram

logical; if plot.correlogram=TRUE the autocorrelation plot of the samples of the random effect is displayed after completion of conditional simulation. Default is plot.correlogram=TRUE.

Details

This function performs parameter estimation for a geostatistical binomial logistic model. Conditionally on a zero-mean stationary Gaussian process S(x) and mutually independent zero-mean Gaussian variables Z with variance tau2, the observations y are generated from a binomial distribution with probability p and binomial denominators units.m. A canonical logistic link is used, thus the linear predictor assumes the form

log(p/(1-p)) = d'β + S(x) + Z,

where d is a vector of covariates with associated regression coefficients β. The Gaussian process S(x) has isotropic Matern covariance function (see matern) with variance sigma2, scale parameter phi and shape parameter kappa. In the binomial.logistic.MCML function, the shape parameter is treated as fixed. The relative variance of the nugget effect, nu2=tau2/sigma2, can also be fixed through the argument fixed.rel.nugget; if fixed.rel.nugget=NULL, then the relative variance of the nugget effect is also included in the estimation.
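
To make the model concrete, the following base R sketch simulates one data set from it. The helper matern_cor is hypothetical and implements the standard Matern correlation formula directly (the package's own matern function should be preferred in practice); all parameter values, the covariate and the sample size are arbitrary assumptions.

```r
set.seed(1)

# Matern correlation at distance u with scale phi and shape kappa
# (hypothetical base R helper, not the PrevMap function)
matern_cor <- function(u, phi, kappa) {
  out <- u
  out[] <- 1                       # correlation is 1 at distance zero
  pos <- u > 0
  s <- u[pos] / phi
  out[pos] <- 2^(1 - kappa) / gamma(kappa) * s^kappa * besselK(s, kappa)
  out
}

n <- 100
coords <- cbind(runif(n), runif(n))     # sampling locations in the unit square
U <- as.matrix(dist(coords))            # pairwise distances

sigma2 <- 1; phi <- 0.15; kappa <- 0.5  # parameters of S(x)
tau2 <- 0.2                             # nugget variance, so nu2 = tau2 / sigma2
beta <- c(-0.5, 1)                      # regression coefficients
d <- cbind(1, rnorm(n))                 # intercept plus one covariate

# S(x): zero-mean Gaussian process with Matern covariance
Sigma <- sigma2 * matern_cor(U, phi, kappa)
S <- drop(t(chol(Sigma)) %*% rnorm(n))

# Z: independent zero-mean Gaussian nugget
Z <- rnorm(n, sd = sqrt(tau2))

# Logistic link: log(p / (1 - p)) = d'beta + S(x) + Z
p <- plogis(drop(d %*% beta) + S + Z)

units.m <- rep(20, n)                   # binomial denominators
y <- rbinom(n, size = units.m, prob = p)
```

With kappa = 0.5 the Matern correlation reduces to the exponential correlation exp(-u/phi), which gives a quick check of the helper.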

Monte Carlo maximum likelihood. The Monte Carlo maximum likelihood method uses conditional simulation from the distribution of the random effect T(x) = d(x)'β+S(x)+Z given the data y, in order to approximate the high-dimensional intractable integral given by the likelihood function. The resulting approximation of the likelihood is then maximized by a numerical optimization algorithm which uses analytic expressions for the computation of the gradient vector and Hessian matrix. The functions used for numerical optimization are maxBFGS (method="BFGS"), from the maxLik package, and nlminb (method="nlminb").
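
The idea can be illustrated with a deliberately small toy problem: a single binomial observation and a scalar random effect T with mean mu and unit variance. This is only a sketch of the general principle, not the package's implementation. Here samples from the distribution of T given y at a reference value mu0 are drawn from a fine grid, which is feasible only in one dimension (the package uses MCMC instead), and all names and values are assumptions made for the example.

```r
set.seed(42)

m <- 30; y <- 12   # one binomial observation
mu0 <- 0           # reference parameter of the importance sampling distribution

# Joint density of (y, t) when T ~ N(mu, 1) and y | T ~ Binomial(m, plogis(T))
joint <- function(t, mu) dbinom(y, m, plogis(t)) * dnorm(t, mu, 1)

# Conditional simulation of T given y under mu0 (grid sampling, toy only)
grid <- seq(-8, 8, length.out = 20001)
t_samp <- sample(grid, 5000, replace = TRUE, prob = joint(grid, mu0))

# Monte Carlo approximation of the likelihood ratio L(mu) / L(mu0):
# the average of f(T; mu) / f(T; mu0) over the conditional samples
mc_lr <- function(mu) mean(dnorm(t_samp, mu, 1) / dnorm(t_samp, mu0, 1))

# Reference value by numerical integration (possible only in this toy)
exact_lr <- function(mu) {
  integrate(joint, -Inf, Inf, mu = mu)$value /
    integrate(joint, -Inf, Inf, mu = mu0)$value
}

# Maximize the Monte Carlo likelihood over a grid of candidate values
mu_grid <- seq(-2, 2, by = 0.05)
mu_hat <- mu_grid[which.max(sapply(mu_grid, mc_lr))]

c(mc = mc_lr(-0.4), exact = exact_lr(-0.4), mu_hat = mu_hat)
```

The approximation is most accurate for values of mu close to mu0, which is why the choice of par0 matters in practice.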

Using a two-level model to include household-level and individual-level information. When analysing data from household surveys, some of the available information might be at household-level (e.g. material of house, temperature) and some at individual-level (e.g. age, gender). In this case, the Gaussian spatial process S(x) and the nugget effect Z are defined at household-level in order to account for extra-binomial variation between and within households, respectively.
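
The link between individual-level rows and household-level locations is an integer index into the set of unique coordinates. The base R sketch below shows the kind of vector this produces; the data frame and its column names are made up for illustration, and in practice the vector passed to ID.coords should be obtained from create.ID.coords.

```r
# Five individuals living in three households (hypothetical data)
dat <- data.frame(
  x1  = c(0.1, 0.1, 0.5, 0.9, 0.5),
  x2  = c(0.2, 0.2, 0.4, 0.7, 0.4),
  age = c(34, 5, 61, 28, 9)
)

# Label each row by the position of its location among the unique locations
key <- paste(dat$x1, dat$x2)
ID <- match(key, unique(key))
ID
# 1 1 2 3 2
```

Rows 1 and 2 share a household, as do rows 3 and 5, so they receive the same ID and therefore the same values of S(x) and Z.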

Low-rank approximation. For very large spatial data sets, a low-rank approximation of the Gaussian spatial process S(x) can reduce the computational burden. Let (x_{1},…,x_{m}) and (t_{1},…,t_{m}) denote the set of sampling locations and a grid of spatial knots covering the area of interest, respectively. Then S(x) is approximated as ∑_{i=1}^m K(||x-t_{i}||; φ, κ)U_{i}, where the U_{i} are zero-mean mutually independent Gaussian variables with variance sigma2 and K(.;φ, κ) is the isotropic Matern kernel (see matern.kernel). Since the resulting approximation is only approximately stationary, the parameter sigma2 is multiplied by a factor constant.sigma2 (returned as const.sigma2) so as to obtain a value that is closer to the actual variance of S(x).
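
The kernel-convolution construction can be sketched in a few lines of base R. Note the assumption: a Gaussian kernel is used here as a simple stand-in for the Matern kernel, so this is not what matern.kernel computes, and the knot grid, sample size and parameter values are arbitrary. The sketch only shows the structure of the approximation and why its variance changes with location.

```r
set.seed(7)

# Stand-in kernel (Gaussian), NOT the Matern kernel used by PrevMap
kern <- function(u, phi) exp(-(u / phi)^2)

# Distances between each row of a and each row of b
cross_dist <- function(a, b) {
  sqrt(outer(a[, 1], b[, 1], "-")^2 + outer(a[, 2], b[, 2], "-")^2)
}

coords <- cbind(runif(200), runif(200))                     # sampling locations
knots <- as.matrix(expand.grid(seq(0, 1, length.out = 8),
                               seq(0, 1, length.out = 8)))  # 8 x 8 knot grid

sigma2 <- 1
K <- kern(cross_dist(coords, knots), phi = 0.15)  # 200 x 64 kernel matrix
U <- rnorm(nrow(knots), sd = sqrt(sigma2))        # independent N(0, sigma2)

# Low-rank approximation: S(x) is the sum over knots of K(||x - t_i||) * U_i
S_approx <- drop(K %*% U)

# Implied variance at each location: sigma2 times the sum of squared kernels
var_S <- sigma2 * rowSums(K^2)
range(var_S)
```

Because var_S differs between locations, sigma2 alone does not describe the variance of the approximated process, which motivates the adjustment factor described above.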

Value

An object of class "PrevMap". The function summary.PrevMap is used to print a summary of the fitted model. The object is a list with the following components:

estimate: estimates of the model parameters; use the function coef.PrevMap to obtain estimates of covariance parameters on the original scale.

covariance: covariance matrix of the MCML estimates.

log.lik: maximum value of the log-likelihood.

y: binomial observations.

units.m: binomial denominators.

D: matrix of covariates.

coords: matrix of the observed sampling locations.

method: method of optimization used.

ID.coords: set of ID values defined through the argument ID.coords.

kappa: fixed value of the shape parameter of the Matern function.

kappa.t: fixed value for the shape parameter of the Matern covariance function in the separable double-Matern spatio-temporal model.

knots: matrix of the spatial knots used in the low-rank approximation.

mesh: the mesh used in the SPDE approximation.

const.sigma2: adjustment factor for sigma2 in the low-rank approximation.

h: vector of the values of the tuning parameter at each iteration of the Langevin-Hastings MCMC algorithm; see Laplace.sampling, or Laplace.sampling.lr if a low-rank approximation is used.

samples: matrix of the random effects samples from the importance sampling distribution used to approximate the likelihood function.

fixed.rel.nugget: fixed value for the relative variance of the nugget effect.

call: the matched call.

Author(s)

Emanuele Giorgi e.giorgi@lancaster.ac.uk

Peter J. Diggle p.diggle@lancaster.ac.uk

References

Diggle, P.J., Giorgi, E. (2019). Model-based Geostatistics for Global Public Health. CRC/Chapman & Hall.

Giorgi, E., Diggle, P.J. (2017). PrevMap: an R package for prevalence mapping. Journal of Statistical Software. 78(8), 1-29. doi: 10.18637/jss.v078.i08

Christensen, O. F. (2004). Monte Carlo maximum likelihood in model-based geostatistics. Journal of Computational and Graphical Statistics 13, 702-718.

Higdon, D. (1998). A process-convolution approach to modeling temperatures in the North Atlantic Ocean. Environmental and Ecological Statistics 5, 173-190.

See Also

Laplace.sampling, Laplace.sampling.lr, summary.PrevMap, coef.PrevMap, matern, matern.kernel, control.mcmc.MCML, create.ID.coords.


PrevMap documentation built on Oct. 7, 2021, 5:07 p.m.