%diagnostics% | R Documentation |
This operator extracts diagnostic summaries of the posterior distributions for particular
parameters from an evorates_fit
object or param_block
array. This can be used to
check that a posterior distribution was sampled thoroughly.
fit %diagnostics% select
fit %d% select
fit |
An object of class " |
select |
A list with two elements (2nd element is optional):
|
In the case that a numeric vector is provided to select parameters and no parameters
with names following the pattern "R_i
" are found, the function then looks for the pattern
"Rdev_i
", then "uncent_Rdev_i
". If neither of these are found, then it finally
defaults to selecting the i
th parameters. If a single parameter name involves multiple
"R_i
" patterns, the first pattern is always used (e.g., R_1%-%R_2
would correspond
to 1
). Similarly, if multiple parameters match to a single number, the first parameter is
always used (similar to behavior of match).
The sampler.params
element of fit
always includes 9 parameters, named:
"accept_stat__
", "treedepth__
", "stepsize__
", "divergent__
",
"n_leapfrog__
", "energy__
", "prior__
", "lik__
", and "post__
".
This tends to include both warmup and non-warmup samples, but warmup samples are
automatically excluded if both sampler parameters and normal parameters are selected by this
function. Most of these parameters are rather technical quantities used to tune the
Hamiltonian Monte Carlo sampler run by Stan (see Stan manual for further details on what they
mean). Generally, users will only want to look at the last 3 parameters, which give the
(log) prior probability, likelihood, and posterior probability, respectively, of sampled
parameters. Note that these are on the sampling scale and will differ from those for the
originally-scaled data by a constant. Also, the posterior probability will be affected
by what lik.power
was set to for fitting the model. In some cases, users may also
wish to look at what parameter values are associated with divergent transitions (i.e.,
iterations where divergent__ = 1
), which indicate regions of parameter space where
the sampler got "stuck", yielding potentially misleading posterior distribution estimates.
Initial values are simply the starting values for each parameter. If fit is a
chains param_block
array, the initial values will always be NA
since it
is generally unknown if a chains param_block
array contains all warmup
iterations (and they generally do not, in any case). Bulk and tail effective sample
sizes (ESS) measure how well the centers and tails of posterior
distributions are sampled, respectively. A high enough bulk ESS (~100 or more, but ideally
100 times the number of chains) indicates that posterior median/mean estimates are reliable,
while a high enough tail ESS (again, ~100 or 100 times the number of chains) indicates that
posterior credible interval estimates are reliable. Rhat measures how well a chain
samples "all" parts of a posterior distribution, so to speak, acting as a
measure of "mixing" (i.e., the sampler converging on the actual posterior distribution).
In this case, lower Rhat's are better, with ~1 being the "best" (a good cutoff is
1.05 or lower, with 1.01 being ideal). This function calculates ESS and Rhat using
functions from the rstan
package, and more details about these quantities can
be found in the Stan manual.
Notably, this function calculates ESS and Rhat's afor each chain. This is helpful for diagnosing problematic behavior in individual chains/parameters, but may not reflect the "healthiness" of the entire fit. To check the ESS and Rhat's for all chains combined, you can use check.ess and check.mix. You should consider your fit "good to go" if the highest combined Rhat doesn't exceed 1.01/1.05 and the lowest combined ESS doesn't drop below 100/100 times the number of chains. Also, combine.chains should only be used if chains properly converged, as indicated by a low enough maximum Rhat.
An array of class "param_block
" with a param_type
of "diagnostics
".
The dimension of these arrays will generally go in the order of diagnostics, then parameters,
then chains. Any dimensions of length 1 are collapsed and stored as attributes. If fit
is a chains param_block
array, parameters are automatically renamed
"diagnostics(<parameter name>)
" to help keep track of parameter manipulations.
See check.ess and check.mix for functions tailored toward
diagnosing the "healthiness" of a fit as a whole. For the rstan
functions
this function relies on, see Rhat.
Other param_block operators:
%chains%()
,
%means%()
,
%quantiles%()
,
%select%()
#get whale/dolphin evorates fit
data("cet_fit")
#extracting directly from evorates fit
cet_fit %diagnostics% "R_mu"
#regular expressions
cet_fit %diagnostics% "R"
#using . is a quick way to extract ALL parameters!
cet_fit %diagnostics% "."
#numeric index-based selection
cet_fit %diagnostics% 1
cet_fit %diagnostics% -1
#select particular diagnostics
cet_fit %diagnostics% list("R_mu", "ess")
cet_fit %diagnostics% list("R_mu", 1)
cet_fit %diagnostics% list("R_mu", -1)
#getting sampler parameters
cet_fit %diagnostics% "lik__"
#warmup samples are still automatically excluded from "lik__" if combined with "R_mu"
#BUT this only affects bulk_ess, tail_ess, and Rhat
#the function DOES extract the actual initial value of the likelihood!
cet_fit %diagnostics% c("R_mu", "lik__")
#extracting from a param_block array
par <- get.bg.rate(fit = cet_fit,
node.groups = setNames(list('Mesoplodon','Orcinus',c('Pseudorca','Feresa')),
c('Mesoplodon','Orca','Globicephalinae')),
)
par %diagnostics% list("Mesoplodon", -1)
#note that inits returns NA
par %diagnostics% list("Mesoplodon", "inits")
#note change in numeric index behavior
par %diagnostics% 1
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.