RunNimbleParallel: NIMBLE wrapper with parallel processing and automated...

View source: R/RunNimbleParallel.R

RunNimbleParallel    R Documentation

NIMBLE wrapper with parallel processing and automated sampling until convergence

Description

Implements Bayesian model fitting using the NIMBLE R package, with chains run in parallelized clusters across multiple processors. Additionally, convergence and sampling metrics (Rhat and n_effective) are checked periodically, and sampling resumes until these metrics meet the specified levels. If sampling resumes, saved samples are written to the hard drive to avoid overloading RAM over successive rounds of sampling.

***Note: This function will probably not run in Windows because parallel processing packages do not operate easily in a Windows environment.

Usage

  RunNimbleParallel(model, inits, data, constants, parameters,
           par.ignore.Rht = c(), par.dontign.Rht = c(),
           par.fuzzy.track.Rht = c(), fuzzy.Rht.threshold = 0.05,
           nc = 2, ni = 2000, nb = 0.5, nt = 10, mod.nam = "mod",
           max.samples.saved = 10000, rtrn.model = F, sav.model = T,
           Rht.required = 1.1, neff.required = 100,
           check.progress = NULL, max.tries = NULL)

Arguments

model

Object containing NIMBLE model code generated by the function nimble::nimbleCode.

inits

Function that generates initial values for stochastic nodes as required to run a NIMBLE model.

data

List containing all data objects required to fit specified NIMBLE model.

constants

List containing all constants required to fit specified NIMBLE model.

parameters

String vector containing names of all parameters for which estimates are to be stored by the sampler.

par.ignore.Rht

String vector listing parameters to be ignored when calculating convergence and sampling metrics. All parameters with names containing strings listed here will be ignored. Default is an empty vector, whereby all parameters will be considered for calculating Rhat.

par.dontign.Rht

String vector listing parameters that should not be ignored when calculating convergence and sampling metrics. All parameters with names containing strings listed here will not be ignored. This parameter can be used in conjunction with 'par.ignore.Rht' to focus convergence checking on desired parameters. Default is an empty vector, whereby no parameters will be designated as not to be ignored for calculating Rhat.
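How 'par.ignore.Rht' and 'par.dontign.Rht' might combine can be sketched in base R. The helper params_checked() is hypothetical and not part of the package; it assumes plain substring matching and assumes that a 'par.dontign.Rht' match overrides a 'par.ignore.Rht' match. The function's actual matching rules may differ.

```r
# Hypothetical helper (not part of the package): returns the parameter names
# that would still be checked for convergence.
params_checked <- function(par.names, par.ignore = c(), par.dontign = c()) {
  # TRUE for each name containing at least one of the given strings
  matches_any <- function(patterns) {
    if (length(patterns) == 0) return(rep(FALSE, length(par.names)))
    Reduce(`|`, lapply(patterns, function(p) grepl(p, par.names, fixed = TRUE)))
  }
  # Assumption: a "don't ignore" match overrides an "ignore" match
  ignore <- matches_any(par.ignore) & !matches_any(par.dontign)
  par.names[!ignore]
}

par.names <- c("beta0", "beta1", "z[1]", "z[2]", "z.total", "sigma")
checked <- params_checked(par.names, par.ignore = "z", par.dontign = "z.total")
print(checked)
```

Here every name containing "z" is dropped from the check except "z.total", which is retained because it also matches the 'par.dontign.Rht' entry.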

par.fuzzy.track.Rht

String vector listing parameters to designate for fuzzy evaluation of convergence. Sampling continues if more than (fuzzy.Rht.threshold * number of parameters) of these parameters have Rhat > Rht.required. Default is an empty vector, whereby no fuzzy evaluation of convergence will occur.

fuzzy.Rht.threshold

Threshold for fuzzy evaluation of convergence (only relevant if length(par.fuzzy.track.Rht) > 0). Default = 0.05.
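The fuzzy rule can be illustrated with a short sketch. The helper fuzzy_keep_sampling() is hypothetical, and it assumes that "number of parameters" refers to the parameters tracked through 'par.fuzzy.track.Rht'; the package may count parameters differently.

```r
# Hypothetical helper (not part of the package): TRUE means "keep sampling".
# rhat holds the Rhat values of the fuzzy-tracked parameters only (assumption).
fuzzy_keep_sampling <- function(rhat, Rht.required = 1.1,
                                fuzzy.Rht.threshold = 0.05) {
  n.above <- sum(rhat > Rht.required)
  n.above > fuzzy.Rht.threshold * length(rhat)
}

# 100 tracked parameters, 3 above Rht.required: within the 5% tolerance
few.bad <- fuzzy_keep_sampling(c(rep(1.02, 97), 1.2, 1.3, 1.6))

# 100 tracked parameters, 8 above Rht.required: tolerance exceeded
many.bad <- fuzzy_keep_sampling(c(rep(1.02, 92), rep(1.4, 8)))

print(c(few.bad = few.bad, many.bad = many.bad))
```

Under these assumptions, a handful of slow-mixing tracked parameters does not by itself force another round of sampling.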

nc

Number of MCMC chains to run. Note that the number of available cores must equal or exceed 'nc'. Default = 2.

ni

Number of iterations to run MCMC chains before checking convergence and sampling metrics. Default = 2000 (as given under Usage). If the model does not meet the specified levels for Rhat and n_effective, sampling will continue for another 'ni' iterations before the metrics are checked again.

nb

If < 1, proportion of each MCMC chain to discard as burn-in. If > 1, chain length before thinning to discard as burn-in. Default = 0.5.
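The two interpretations of 'nb' can be sketched as follows. The helper burnin_iterations() is hypothetical; rounding down with floor() is an assumption, and the source does not say how a value of exactly 1 is treated (this sketch treats it as a count of one iteration).

```r
# Hypothetical helper (not part of the package): number of pre-thinning
# iterations discarded as burn-in for a chain of n.iter iterations.
burnin_iterations <- function(nb, n.iter) {
  if (nb < 1) {
    floor(nb * n.iter)  # nb is a proportion of the chain
  } else {
    nb                  # nb is already a number of iterations
  }
}

as.proportion <- burnin_iterations(0.5, 2000)  # half of the chain
as.count <- burnin_iterations(500, 2000)       # a fixed 500 iterations
print(c(as.proportion, as.count))
```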

nt

Base level of thinning to implement within parallelized cluster. Default = 10. Setting this argument will reduce load on RAM for large models.

mod.nam

Character string used as a file label for saved model outputs.

max.samples.saved

Integer that sets the maximum number of samples to save per MCMC chain in the final model object. Additional thinning of posterior samples (i.e., after samples are recovered from processing clusters) is implemented as necessary to limit the size of the model object to (roughly) this level. Default = 10000.
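One plausible way to implement the additional thinning is to keep every k-th saved sample, with k chosen so that the retained count does not exceed 'max.samples.saved'. This is only a sketch of the idea: the helper extra_thin() is hypothetical, and the package may compute its thinning interval differently.

```r
# Hypothetical helper (not part of the package): extra thinning interval
# needed to bring n.saved samples per chain down to max.samples.saved or fewer.
extra_thin <- function(n.saved, max.samples.saved = 10000) {
  max(1, ceiling(n.saved / max.samples.saved))
}

n.saved <- 25000                     # samples per chain after several rounds
k <- extra_thin(n.saved)             # thinning interval
keep <- seq(1, n.saved, by = k)      # indices of the samples retained
print(c(interval = k, retained = length(keep)))
```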

rtrn.model

Logical indicating whether to return model output to R environment. Default = F, i.e., model output is only saved to disk.

sav.model

Logical indicating whether to save model output to disk. Default = T, i.e., model output is saved to disk.

Rht.required

Maximum numeric value for Rhat for any parameter required to end sampling. Default = 1.1.

neff.required

Minimum value for n_effective for any parameter required to end sampling. Default = 100.

check.progress

The run number at which to check whether max Rhat is < 2 and has improved since the initial run. If this criterion is not met, RunNimbleParallel will quit with the error "Insufficient progress towards convergence...." Default is to not check progress in this manner.

max.tries

The maximum number of runs, after which RunNimbleParallel will quit with the error "Maximum number of runs reached...." Default is to not set any constraint on MCMC chain length.
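The interplay of 'Rht.required', 'check.progress', and 'max.tries' can be sketched with a toy loop. Everything here is an assumption made for illustration: run_until_converged() is hypothetical, each run is represented only by its maximum Rhat (n_effective is omitted for brevity), the order of the checks is a guess, and the error strings reproduce only the part of each message quoted above.

```r
# Hypothetical control-flow sketch (not the package's implementation).
# max.rhat.by.run[i] stands in for the max Rhat observed after run i.
run_until_converged <- function(max.rhat.by.run, Rht.required = 1.1,
                                check.progress = NULL, max.tries = NULL) {
  run <- 0
  repeat {
    run <- run + 1
    max.rhat <- max.rhat.by.run[run]
    if (max.rhat <= Rht.required) return(run)  # converged: stop sampling
    if (!is.null(check.progress) && run == check.progress) {
      improved <- max.rhat < max.rhat.by.run[1]
      if (!(max.rhat < 2 && improved)) {
        stop("Insufficient progress towards convergence")
      }
    }
    if (!is.null(max.tries) && run >= max.tries) {
      stop("Maximum number of runs reached")
    }
  }
}

# Capture an error message instead of halting the script
msg <- function(expr) tryCatch(expr, error = function(e) conditionMessage(e))

runs.needed <- run_until_converged(c(3.2, 1.6, 1.2, 1.05))
err.progress <- msg(run_until_converged(c(3.2, 2.5, 2.4), check.progress = 2))
err.tries <- msg(run_until_converged(c(3.2, 1.6, 1.4), max.tries = 3))
print(list(runs.needed, err.progress, err.tries))
```

In the second call the chain has improved since the first run but max Rhat is still above 2 at the checked run, so the sketch stops early rather than continuing to sample.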

Value

If rtrn.model = T, model output as produced by mcmcOutput::mcmcOutput is returned (assign the result to an object to keep it). Otherwise, model output is only saved to file.
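A call might look like the following minimal sketch. The toy model, the simulated data, and the label "toy_mod" are assumptions made for illustration, and the sketch assumes the function is exported by the QSLpersonal package named in this page's footer. It has not been run against the package, and the function may impose requirements not documented here.

```r
# Toy inputs: a normal model with unknown mean and standard deviation.
set.seed(1)
y <- rnorm(50, mean = 3, sd = 2)

data <- list(y = y)
constants <- list(n = length(y))
parameters <- c("mu", "sigma")
inits <- function() list(mu = rnorm(1), sigma = runif(1, 0.5, 5))

# Only attempt the fit when both packages are installed.
if (requireNamespace("nimble", quietly = TRUE) &&
    requireNamespace("QSLpersonal", quietly = TRUE)) {
  model <- nimble::nimbleCode({
    mu ~ dnorm(0, sd = 10)
    sigma ~ dunif(0, 10)
    for (i in 1:n) {
      y[i] ~ dnorm(mu, sd = sigma)
    }
  })
  # Return the output to the session instead of saving it to disk,
  # and cap the number of sampling rounds.
  fit <- QSLpersonal::RunNimbleParallel(
    model, inits, data, constants, parameters,
    nc = 2, ni = 2000, nb = 0.5, nt = 10,
    mod.nam = "toy_mod", rtrn.model = TRUE, sav.model = FALSE,
    Rht.required = 1.1, neff.required = 100, max.tries = 10
  )
}
```

Setting 'max.tries' in exploratory runs keeps a poorly specified model from sampling indefinitely.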

Author(s)

Quresh S. Latif, Bird Conservancy of the Rockies


qureshlatif/QSLpersonal documentation built on Sept. 12, 2023, 6:24 p.m.