spaMM_boot: Parametric bootstrap
In spaMM: Mixed-Effect Models, with or without Spatial Random Effects

spaMM_boot

R Documentation

Parametric bootstrap

Description

spaMM_boot simulates samples from a fit object inheriting from class "HLfit", as produced by spaMM's fitting functions, and applies a given function to each simulated sample. Parallelization is supported (see Details).

spaMM2boot is similar except that it assumes that the original model is refitted on the simulated data, and the given function is applied to the refitted model, and the value is in a format directly usable as input for boot::boot.ci.

Both of these functions can be used to apply standard parametric bootstrap procedures. spaMM_boot is suitable for more diverse applications, e.g. to fit by one model some samples simulated under another model (see Example).

Usage

spaMM_boot(object, simuland, nsim, nb_cores=NULL, seed=NULL,
           resp_testfn=NULL, control.foreach=list(),
           debug. = FALSE, type, fit_env=NULL, cluster_args=NULL,
           showpbar= eval(spaMM.getOption("barstyle")),
           boot_samples=NULL,
           ...)
spaMM2boot(object, statFUN, nsim, nb_cores=NULL, seed=NULL,
           resp_testfn=NULL, control.foreach=list(),
           debug. = FALSE, type="marginal", fit_env=NULL, 
           cluster_args=NULL, showpbar= eval(spaMM.getOption("barstyle")),
           boot_samples=NULL,
           ...)

Arguments

`object`	The fit object to simulate from.
`simuland`	The function to apply to each simulated sample. See Details for requirements of this function.
`statFUN`	The function to apply to each fit object for each simulated sample. See Details for requirements of this function.
`nsim`	Number of samples to simulate and analyze.
`nb_cores`	Number of cores to use for parallel computation. The default is `spaMM.getOption("nb_cores")`, and 1 if the latter is NULL. `nb_cores=1` prevents the use of parallelisation procedures.
`seed`	Passed to `simulate.HLfit`
`resp_testfn`	Passed to `simulate.HLfit`; NULL, or a function that tests a condition which simulated samples should satisfy. This function takes a response vector as argument and return a boolean (TRUE indicating that the sample satisfies the condition).
`control.foreach`	list of control arguments for `foreach`. These include in particular `.combine` (with default value `"rbind"`), and `.errorhandling` (with default value `"remove"`, but `"pass"` is quite useful for debugging).
`debug.`	Boolean (or integer, interpreted as boolean). For debugging purposes, given that `spaMM_boot` does not stop when the fit of a bootstrap replicate fails. Subject to changes with no or little notice. In serial computation, `debug.=2` will stop on an error. In parallel computation, this would be ignored. The effect of `debug.=TRUE` depends on what `simuland` does of it. The default `simuland` for likelihood ratio testing functions, `eval_replicate`, shows how `debug.` can be used to control a call to `dump.frames` (however, debugging user-defined functions by such a call does not require control by `debug.`).
`type`	Character: passed to `simulate.HLfit`. Defaults, with a warning, to `type="marginal"` in order to replicate the behaviour of previous versions of `spaMM_boot`. This is an appropriate default for various parametric bootstrpa analyses, but not necessarily the appropriate `type` for all possible uses. See Details of `simulate.HLfit` for other implemented options.
`fit_env`	An environment or list containing variables necessary to evaluate `simuland` on each sample, and not included in the fit `object`. E.g., use `fit_env=list(phi_fix=phi_fix)` if the fit assumed `fixed=list(phi=phi_fix)`: the name in `list(phi_fix=<.>)` must be the name of the object that will be sought by the called process when interpreting `fixed=list(phi=phi_fix)` (if still unsure about the proper syntax, see the `clusterExport` documentation, as `fit_env` is used in the following context: `parallel::clusterExport(cl=<cluster>, varlist=ls(fit_env), envir=fit_env))`.
`cluster_args`	`NULL` or a `list` of arguments, passed to `makeCluster`.
`showpbar`	Controls display of progress bar. See `barstyle` option for details.
`boot_samples`	NULL, or precomputed bootstrap samples from the fitted model, provided as a matrix with one column per bootstrap replicate (the format of the result of `simulate.HLfit`), or as a list including a `bootreps` element with the same matrix format.
`...`	Further arguments passed to the `simuland` function.

Details

The simuland function must take as first argument a vector of response values, and may have other arguments including ‘...’. When required, these additional arguments must be passed through the ‘...’ arguments of spaMM_boot. Variables needed to evaluate them must be available from within the simuland function or otherwise provided as elements of fit_env.

The statFUN function must take as first argument (named refit) a fit object, and may have other arguments including ‘...’ handled as for simuland.

spaMM_boot handles parallel backends with different features. pbapply::pbapply has a very simple interface (essentially equivalent to apply) and provides progress bars, but (in version 1.4.0, at least) does not have efficient load-balancing. doSNOW also provides a progress bar and allows more efficient load-balancing, but its requires foreach. foreach handles errors differently from pbapply (which will simply stop if fitting a model to a bootstrap replicate fails): see the foreach documentation.

spaMM_boot calls simulate.HLfit on the fit object and applies simuland on each column of the matrix returned by this call. simulate.HLfit uses the type argument, which must be explicitly provided.

Value

spaMM_boot returns a list, with the following element(s) (unless debug. is TRUE):

bootreps: nsim return values in the format returned either by apply or parallel::parApply or by foreach::`%dopar%` as controlled by control.foreach$.combine (which is here "rbind" by default).
RNGstate: (absent in the case the boot_samples argument was used to provide the new response values but not the RNGstate) the state of .Random.seed at the beginning of the sample simulation

spaMM2boot returns a list suitable for use by boot.ci, with elements:

t: nsim return values of the simulated statistic (in matrix format).
t0: nsim return the value of statFUN from the original fit.
sim: The simulation type ("parametric").
R: nsim
.Random.seed: the state of .Random.seed at the beginning of the sample simulation

. (other elements of an object of class boot are currently not included.)

Examples

if (spaMM.getOption("example_maxtime")>7) {
 data("blackcap")
 
 # Generate fits of null and full models:
 lrt <- fixedLRT(null.formula=migStatus ~ 1 + Matern(1|longitude+latitude),
                 formula=migStatus ~ means + Matern(1|longitude+latitude), 
                 method='ML',data=blackcap)

 # The 'simuland' argument: 
 myfun <- function(y, what=NULL, lrt, ...) { 
    data <- lrt$fullfit$data
    data$migStatus <- y ## replaces original response (! more complicated for binomial fits)
    full_call <- getCall(lrt$fullfit) ## call for full fit
    full_call$data <- data
    res <- eval(full_call) ## fits the full model on the simulated response
    if (!is.null(what)) res <- eval(what)(res=res) ## post-process the fit
    return(res) ## the fit, or anything produced by evaluating 'what'
  }
  # where the 'what' argument (not required) of myfun() allows one to control 
  # what the function returns without redefining the function.
  
  # Call myfun() with no 'what' argument: returns a list of fits 
  spaMM_boot(lrt$nullfit, simuland = myfun, nsim=1, lrt=lrt, 
             type ="marginal")[["bootreps"]] 
  
  # Return only a model coefficient for each fit: 
  spaMM_boot(lrt$nullfit, simuland = myfun, nsim=7,
             what=quote(function(res) fixef(res)[2L]), 
             lrt=lrt, type ="marginal")[["bootreps"]]       
  
  ## Not run: 
    # Parametric bootstrap by spaMM2boot() and spaMM_boot():
    boot.ci_info <- spaMM2boot(lrt$nullfit, statFUN = function(refit) fixef(refit)[1], 
                               nsim=99, type ="marginal")
    boot::boot.ci(boot.ci_info, , type=c("basic","perc","norm"))
    
    nullfit <- lrt$nullfit
    boot_t <- spaMM_boot(lrt$nullfit, simuland = function(y, nullfit) {
      refit <- update_resp(nullfit, y)
      fixef(refit)[1]
    }, nsim=99, type ="marginal", nullfit=nullfit)$bootreps
    boot::boot.ci(list(R = length(boot_t), sim="parametric"), t0=fixef(nullfit)[1], 
                  t= t(boot_t), type=c("basic","perc","norm"))


  
## End(Not run)           
}

spaMM documentation built on June 22, 2024, 9:48 a.m.