get_cPredVar: Estimation of prediction variance with bootstrap correction
In spaMM: Mixed-Effect Models, with or without Spatial Random Effects

get_cPredVar

R Documentation

Estimation of prediction variance with bootstrap correction

Description

This function is similar to get_predVar except that is uses a bootstrap procedure to correct for bias in the evaluation of the prediction variance.

Usage

get_cPredVar(pred_object, newdata, nsim, seed, type = "residual", 
             variances=NULL, nb_cores = NULL, fit_env = NULL,
             sim_object=pred_object)

Arguments

`pred_object`	an object of class `HLfit`, as returned by the fitting functions in `spaMM`.
`newdata`	passed to `predict.HLfit` (it thus represents a prediction design, not to be confused with the bootstrap samples)
`nsim`	passed to `simulate.HLfit`
`seed`	passed to `simulate.HLfit`
`type`	passed to `simulate.HLfit`
`variances`	NULL or list; `variances["cov"]` will be passed to `predict.HLfit` to control whether a covariance matrix is computed or not. Other elements are currently ignored.
`nb_cores`	integer: number of cores to use for parallel computation of bootstrap. The default is `spaMM.getOption("nb_cores")`, and 1 if the latter is NULL. `nb_cores=1` prevents the use of parallelisation procedures.
`fit_env`	For parallel computations: an environment containing objects to be passed to the cores. They should have the same name in `fit_env` as in the environment they are passed from.
`sim_object`	an object of class `HLfit`, passed to `simulate.HLfit` as its `object` argument. Simulating from this object must produce response values that can be used as replacement to those of the original fitted `pred_object`. In standard usage, `sim_object=pred_object` (the default).

Details

The result provided by get_cPredVar is similar to the CMSEP (Conditional Mean Standard Error of Prediction) introduced by Booth and Hobert (1998; “B&H”). This paper is known for pointing the importance of using conditional variances when they differ from unconditional ones. This is hard to miss in spatial models, where the relevant prediction variance typically depends on the variance of random effects conditional on the data. Thus, the alternative function get_predVar already accounts for this and returns a prediction variance that depends on a joint covariance of fixed-effect estimates and of random effects given the data.

B&H also used a conditional bootstrap procedure to correct for some bias. get_cPredVar implements a similar procedure, in contrast to get_predVar. Their conditional bootstrap procedure is not applicable for autocorrelated random effects, and parametric bootstrapping of the residuals of the fitted model (as implied by the default value of argument type) is used instead here. Apart from this difference, the returned value includes exactly the same terms as those discussed by B&H: their “naive estimate” \nu_i and its bootstrap correction b_i, their correction \beta for uncertainty in fixed-effect coefficients, and their correction \sigma^2 for uncertainty in dispersion parameters.

This use of the bootstrap does not account for uncertainty in correlation parameters “outer-optimized” by fitme or corrHLfit, because the correlation parameters are fixed when the model is refitted on the bootstrap replicates. Even if it the correlation parameters were refitted, the full computation would not be sufficient to account for uncertainty in them. To account for uncertainty in correlation parameters, one should rather perform a parametric bootstrap of the full model (typically using spaMM_boot(., type="residual")), which may take much more time.

The “naive estimate” \nu_i is not generally an estimate of anything uniquely defined by the model parameters: for correlated random effects, it depends on the “root” of the correlation matrix of the random effects, which is not unique. Thus \nu_i is not unique, and may differ for example for equivalent fits by sparse-precision methods vs. other methods. Nevertheless, attr(cpredvar,"info")$naive does recover published values in the Examples below, as they involve no correlation matrix.

Value

A vector of prediction variances, with an attribute info which is an environment containing variables:

`SEs`	the standard errors of the estimates (which are those of the bootstrap replicates)
`bias`	the bias term
`maive`	B&H's “naive” `\nu_i`

References

Booth, J.G., Hobert, J.P. (1998) Standard errors of prediction in generalized linear mixed models. J. Am. Stat. Assoc. 93: 262-272.

Examples

## Not run: 
if(requireNamespace("rsae", quietly = TRUE)) {
  # LMM example from Booth & Hobert 1998 JASA
  data("landsat", package = "rsae")
  fitCorn <- fitme(HACorn ~ PixelsCorn + PixelsSoybeans + (1|CountyName),data=landsat[-33,])
  newXandZ <- unique(data.frame(PixelsCorn=landsat$MeanPixelsCorn,
                                PixelsSoybeans=landsat$MeanPixelsSoybeans,
                                CountyName=landsat$CountyName))
  (cpredvar <- get_cPredVar(fitCorn, newdata=newXandZ, nsim=200L, seed=123)) # serial computation
  (cpredvar <- get_cPredVar(fitCorn, newdata=newXandZ, nsim=200L, seed=123, 
        nb_cores=parallel::detectCores(logical=FALSE)-1L, 
        fit_env=list2env(list(newXandZ=newXandZ))))
}

# GLMM example from Booth & Hobert 1998 JASA
data(clinics)
fitClinics <- HLfit(cbind(npos,nneg)~treatment+(1|clinic),family=binomial(),data=clinics)
#
(get_cPredVar(fitClinics, newdata=clinics[1:8,], nsim=200L, seed=123))  # serial computation
(get_cPredVar(fitClinics, newdata=clinics[1:8,], nsim=200L, seed=123, 
      nb_cores=parallel::detectCores(logical=FALSE)-1L, 
      fit_env=list2env(list(clinics=clinics))))

## End(Not run)

spaMM documentation built on June 22, 2024, 9:48 a.m.