compute_SBC | R Documentation |
Performs the main SBC routine given datasets and a backend.
compute_SBC(
datasets,
backend,
cores_per_fit = default_cores_per_fit(length(datasets)),
keep_fits = TRUE,
thin_ranks = SBC_backend_default_thin_ranks(backend),
ensure_num_ranks_divisor = 2,
chunk_size = default_chunk_size(length(datasets)),
dquants = NULL,
cache_mode = "none",
cache_location = NULL,
globals = list(),
gen_quants = NULL
)
datasets |
an object of class |
backend |
the model + sampling algorithm. The built-in backends can be constructed
using |
cores_per_fit |
how many cores should the backend be allowed to use for a single fit?
Defaults to the maximum number that does not produce more parallel chains
than you have cores. See |
keep_fits |
boolean, when |
thin_ranks |
how much thinning should be applied to posterior draws before computing ranks for SBC. Should be large enough to avoid any noticeable autocorrelation of the thinned draws. See Details below. |
ensure_num_ranks_divisor |
Potentially drop some posterior samples to ensure that this number divides the total number of SBC ranks (see Details). |
chunk_size |
How many simulations within the |
dquants |
Derived quantities to include in SBC. Use |
cache_mode |
Type of caching of results, currently the only supported modes are
|
cache_location |
The filesystem location of cache. For |
globals |
A list of names of objects that are defined
in the global environment and need to be present for the backend to work
(if they are not already available in a package).
It is added to the |
gen_quants |
Deprecated, use dquants instead |
An object of class SBC_results().
Parallel processing is supported via the future package. For most uses, it is
most sensible to just call plan(multisession) once in your R session, and all
cores of your computer will be used. For more details, refer to the
documentation of the future package.
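As a minimal sketch of the setup described above, the snippet below registers a multisession plan before any fits are run. It assumes the future package is installed; the `workers = 2` limit is only an illustrative choice (omit it to let future pick the number of workers), and the placeholder comment marks where a call to compute_SBC() would go.

```r
# Assumes the future package is installed.
library(future)

# Start background R sessions; limited to 2 workers here purely for illustration.
plan(multisession, workers = 2)

# Check how many workers the current plan provides.
n_workers <- nbrOfWorkers()
cat("workers available to the plan:", n_workers, "\n")

# ... calls to compute_SBC() would go here ...

# Return to sequential processing when done.
plan(sequential)
```

Setting the plan once per session is enough; later calls pick it up without further arguments.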
When using backends based on MCMC, there are two possible moments when
draws may need to be thinned. They can be thinned directly within the backend
and they may be thinned only to compute the ranks for SBC as specified by the
thin_ranks
argument. The main reason those are separate is that computing the
ranks requires no or negligible autocorrelation while some autocorrelation
may be easily tolerated for summarising the fit results or assessing convergence.
In fact, thinning too aggressively in the backend may lead to overly noisy
estimates of posterior means, quantiles and the posterior::rhat() and
posterior::ess_tail() diagnostics. So for well-adapted Hamiltonian Monte Carlo
chains (e.g. Stan-based backends), we recommend no thinning in the backend; a
value of thin_ranks between 6 and 10 is usually sufficient to remove the
residual autocorrelation. For a backend based on Metropolis-Hastings, it might
be sensible to thin quite aggressively already in the backend and then apply
some additional thinning via thin_ranks.
Backends that don't require thinning should implement SBC_backend_iid_draws()
or SBC_backend_default_thin_ranks() to avoid thinning by default.
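To illustrate why thinning matters for the ranks, here is a small base R sketch that is independent of the SBC package. It simulates an autocorrelated sequence (an AR(1) process standing in for MCMC draws, which is an assumption made for illustration), keeps every 10th draw, and compares the lag-1 autocorrelation before and after. The helper `lag1_acf` is hypothetical and the way draws are selected here is not necessarily how the package thins internally.

```r
set.seed(1)

# Simulate an autocorrelated "chain": an AR(1) process with coefficient 0.7.
n_draws <- 10000
chain <- as.numeric(arima.sim(model = list(ar = 0.7), n = n_draws))

# Hypothetical helper: lag-1 autocorrelation of a numeric vector.
lag1_acf <- function(x) acf(x, lag.max = 1, plot = FALSE)$acf[2]

# Keep every 10th draw, mimicking the effect of thinning by 10.
thin <- 10
thinned <- chain[seq(1, n_draws, by = thin)]

acf_raw <- lag1_acf(chain)
acf_thinned <- lag1_acf(thinned)

cat("lag-1 autocorrelation, raw draws:    ", round(acf_raw, 2), "\n")
cat("lag-1 autocorrelation, thinned draws:", round(acf_thinned, 2), "\n")
```

The thinned draws should show much weaker autocorrelation than the raw ones, at the cost of keeping only a tenth of the draws.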
Some of the visualizations and post-processing steps we use in the SBC package
(e.g. plot_rank_hist(), empirical_coverage()) work best if the total number of
possible SBC ranks is a "nice" number (one with lots of divisors). However, the
number of ranks is one plus the number of posterior samples after thinning.
Therefore, as long as the number of samples is a "nice" number, the number of
ranks usually will not be. To remedy this, you can specify
ensure_num_ranks_divisor: the method will drop at most
ensure_num_ranks_divisor - 1 samples to make the number of ranks divisible by
ensure_num_ranks_divisor. The default of 2 prevents the most annoying
pathologies while discarding at most a single sample.
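The arithmetic can be sketched in a few lines of base R. The helper `draws_to_drop` below is hypothetical (it is not part of the SBC package and only mirrors the rule as described here): it returns how many draws must be discarded so that the number of ranks, which is one plus the number of retained draws, becomes divisible by the chosen divisor.

```r
# Hypothetical helper (not part of the SBC package): how many draws to drop
# so that the number of ranks (draws + 1) is divisible by `divisor`.
draws_to_drop <- function(n_draws, divisor) {
  (n_draws + 1) %% divisor
}

# 100 thinned draws would give 101 possible ranks, which is not divisible by 2.
n_draws <- 100
divisor <- 2

dropped <- draws_to_drop(n_draws, divisor)
n_ranks <- (n_draws - dropped) + 1

cat("draws dropped:", dropped, "\n")
cat("number of ranks:", n_ranks, "\n")
```

Because the result of the modulo operation is always smaller than the divisor, no more than `divisor - 1` draws are ever dropped.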