waic | R Documentation |

Details of the WAIC measure for comparing models. NIMBLE implements an online WAIC algorithm, computed during the course of the MCMC iterations.

To use WAIC, set `enableWAIC = TRUE` when configuring or (if not using `configureMCMC`) building an MCMC, and set `WAIC = TRUE` when calling `nimbleMCMC` and optionally when calling `runMCMC`.

By default, NIMBLE calculates WAIC using an online algorithm that updates required summary statistics at each post-burnin iteration of the MCMC.
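As a minimal sketch of the one-call route, the following uses `nimbleMCMC` with `WAIC = TRUE`. The toy normal-mean model, the simulated data, and the iteration counts are assumptions made up for illustration; they are not taken from the NIMBLE documentation.

```r
library(nimble)

# Toy model (hypothetical): unknown mean, known unit standard deviation.
code <- nimbleCode({
  mu ~ dnorm(0, sd = 10)
  for(i in 1:n)
    y[i] ~ dnorm(mu, sd = 1)
})

set.seed(1)
y <- rnorm(20)

# WAIC = TRUE asks nimbleMCMC to return WAIC alongside the samples.
out <- nimbleMCMC(code,
                  constants = list(n = 20),
                  data = list(y = y),
                  inits = list(mu = 0),
                  niter = 2000, nburnin = 500,
                  WAIC = TRUE)

out$WAIC   # WAIC, lppd, and pWAIC from the online algorithm
```

Because the online algorithm uses all post-burnin iterations, no particular choice of monitors should be needed for this calculation.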

One can also use `calculateWAIC` to run an offline version of the WAIC algorithm after all MCMC sampling has been done. This allows calculation of WAIC from a matrix (or dataframe) of posterior samples and also retains compatibility with WAIC in versions of NIMBLE before 0.12.0. However, the offline algorithm is less flexible than the online algorithm and only provides conditional WAIC without the ability to group data points. See `help(calculateWAIC)` for details.

`controlWAIC` list

The `controlWAIC` argument is a list that controls the behavior of the WAIC algorithm and is passed to either `configureMCMC` or (if not using `configureMCMC`) `buildMCMC`. One can supply any of the following optional components:

`online`: Logical value indicating whether to calculate WAIC during the course of the MCMC. Default is `TRUE`; setting it to `FALSE` is primarily for backwards compatibility, to allow use of the old `calculateWAIC` method that calculates WAIC from monitored values after the MCMC finishes.

`dataGroups`: Optional list specifying grouping of data nodes, one element per group, with each list element containing the node names for the data nodes in that group. If provided, the predictive density values computed will be the joint density values, one joint density per group. Defaults to one data node per 'group'. See details.

`marginalizeNodes`: Optional set of nodes (presumably latent nodes) over which to marginalize to compute marginal WAIC (i.e., WAIC based on a marginal likelihood), rather than the default conditional WAIC (i.e., WAIC conditioning on all parent nodes of the data nodes). See details.

`niterMarginal`: Number of Monte Carlo iterations to use when marginalizing (default is 1000).

`convergenceSet`: Optional vector of numbers between 0 and 1 that specify a set of shorter Monte Carlo simulations for marginal WAIC calculation as fractions of the full (`niterMarginal`) Monte Carlo simulation. If not provided, NIMBLE will use 0.25, 0.50, and 0.75. NIMBLE will report the WAIC, lppd, and pWAIC that would have been obtained for these smaller Monte Carlo simulations, allowing assessment of the number of Monte Carlo samples needed for stable calculation of WAIC.

`thin`: Logical value for specifying whether to do WAIC calculations only on thinned samples (default is `FALSE`). Likely only useful for reducing computation when using marginal WAIC.
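The components above can be assembled as an ordinary R list. The sketch below is illustrative only: the node names (`mu`, `y[1, 1:10]`, `y[2, 1:10]`) and the value of `niterMarginal` are placeholders, and the computed sizes of the shorter Monte Carlo runs assume a simple fraction-of-total rule, which may differ from NIMBLE's exact rounding.

```r
# Hypothetical control list for marginal WAIC with grouped data nodes.
# 'mu' and the group names are placeholders for nodes in your own model.
controlWAIC <- list(
  dataGroups       = list("y[1, 1:10]", "y[2, 1:10]"),
  marginalizeNodes = "mu",
  niterMarginal    = 2000,
  convergenceSet   = c(0.25, 0.50, 0.75),
  thin             = FALSE
)

# Approximate number of Monte Carlo iterations in each shorter run implied by
# convergenceSet (an assumption: fraction times niterMarginal).
partial_iters <- controlWAIC$convergenceSet * controlWAIC$niterMarginal
print(partial_iters)

# The list would then be supplied when configuring the MCMC, for example:
# conf <- configureMCMC(Rmodel, enableWAIC = TRUE, controlWAIC = controlWAIC)
```

If the WAIC values reported for the shorter runs are close to the full-run value, that suggests `niterMarginal` is large enough for a stable marginal WAIC.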

The calculated WAIC and related quantities can be obtained in various ways depending on how the MCMC is run. If using `nimbleMCMC` and setting `WAIC = TRUE`, see the `WAIC` component of the output list. If using `runMCMC` and setting `WAIC = TRUE`, either see the `WAIC` component of the output list or use the `getWAIC` method of the MCMC object (in the latter case `WAIC = TRUE` is not required). If using the `run` method of the MCMC object, use the `getWAIC` method of the MCMC object.

The output of running WAIC (unless one sets `online = FALSE`) is a list containing the following components:

`WAIC`: The computed WAIC, on the deviance scale. Smaller values are better when comparing WAIC for two models.

`lppd`: The log predictive density component of WAIC.

`pWAIC`: The pWAIC estimate of the effective number of parameters, computed using the pWAIC2 method of Gelman et al. (2014).

To get further information, one can use the `getWAICdetails` method of the MCMC object. The result of running `getWAICdetails` is a list containing the following components:

`marginal`: Logical value indicating whether marginal (`TRUE`) or conditional (`FALSE`) WAIC was calculated.

`niterMarginal`: Number of Monte Carlo iterations used in computing marginal likelihoods if using marginal WAIC.

`thin`: Whether WAIC was calculated based only on thinned samples.

`online`: Whether WAIC was calculated during MCMC sampling.

`WAIC_partialMC`, `lppd_partialMC`, `pWAIC_partialMC`: The computed marginal WAIC, lppd, and pWAIC based on fewer Monte Carlo simulations, for use in assessing the sensitivity of the WAIC calculation to the number of Monte Carlo iterations.

`niterMarginal_partialMC`: Number of Monte Carlo iterations used for the values in `WAIC_partialMC`, `lppd_partialMC`, `pWAIC_partialMC`.

`WAIC_elements`, `lppd_elements`, `pWAIC_elements`: Vectors of individual WAIC, lppd, and pWAIC values, one element per data node (or group of nodes in the case of specifying `dataGroups`). These are useful for computing the standard error of the difference in WAIC between two models, following Vehtari et al. (2017).
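The use of the element-wise values for a standard error can be sketched in base R. The helper `waic_diff_se` and the numeric vectors below are hypothetical; in practice the inputs would be the `WAIC_elements` vectors from two models fitted to the same data nodes (or groups) in the same order. The sketch assumes the usual estimate in which the standard error of a sum of n paired differences is sqrt(n * var(differences)).

```r
# Hypothetical per-node WAIC contributions for two models fitted to the same
# data; real values would come from Cmcmc$getWAICdetails()$WAIC_elements.
waic_elements_A <- c(2.1, 3.4, 1.8, 2.9, 3.0)
waic_elements_B <- c(1.9, 3.6, 1.5, 2.4, 2.8)

# Difference in WAIC (model A minus model B) and its standard error.
waic_diff_se <- function(elements_a, elements_b) {
  stopifnot(length(elements_a) == length(elements_b))
  d <- elements_a - elements_b   # paired element-wise differences
  n <- length(d)
  list(diff = sum(d), se = sqrt(n * var(d)))
}

res <- waic_diff_se(waic_elements_A, waic_elements_B)
print(res)
```

A difference that is small relative to its standard error gives little basis for preferring one model over the other.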

As of version 0.12.0, NIMBLE provides enhanced WAIC functionality, with user control over whether to use conditional or marginal versions of WAIC and whether to group data nodes. In addition, users are no longer required to carefully choose MCMC monitors. WAIC by default is now calculated in an online manner (updating the required summary statistics at each MCMC iteration), using all post-burnin samples. The WAIC (Watanabe, 2010) is calculated from Equations 5, 12, and 13 in Gelman et al. (2014) (i.e., using 'pWAIC2').
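To make the quantities concrete, here is a base-R sketch of conditional WAIC computed from a matrix of pointwise log-likelihood values, using lppd minus the pWAIC2 penalty on the deviance scale. This is an offline illustration and not NIMBLE's online implementation; the function name `waic_from_loglik`, the toy data, and the stand-in "posterior draws" are all assumptions.

```r
# loglik: S x N matrix of log densities (S posterior draws, N data nodes).
waic_from_loglik <- function(loglik) {
  S <- nrow(loglik)
  # lppd per data node: log of the posterior-mean density, via log-sum-exp
  col_max <- apply(loglik, 2, max)
  lppd_i <- col_max + log(colSums(exp(sweep(loglik, 2, col_max)))) - log(S)
  # pWAIC2 per data node: sample variance of the log density across draws
  pwaic_i <- apply(loglik, 2, var)
  lppd <- sum(lppd_i)
  pWAIC <- sum(pwaic_i)
  list(WAIC = -2 * (lppd - pWAIC), lppd = lppd, pWAIC = pWAIC)
}

# Toy illustration: normal data with known sd; mu_draws is a stand-in for
# MCMC output, not a real posterior sample from NIMBLE.
set.seed(1)
y <- rnorm(10)
mu_draws <- rnorm(500, mean(y), 1 / sqrt(10))
loglik <- sapply(y, function(yi) dnorm(yi, mu_draws, 1, log = TRUE))
str(waic_from_loglik(loglik))
```

The online algorithm arrives at the same kind of summaries by updating running statistics at each iteration instead of storing the full matrix.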

Note that there is not a unique value of WAIC for a model. By default, WAIC is calculated conditional on the parent nodes of the data nodes, and the density values used are the individual density values of the data nodes. However, by modifying the `marginalizeNodes` and `dataGroups` elements of the control list, users can request a marginal WAIC (using a marginal likelihood that integrates over user-specified latent nodes) and/or a WAIC based on grouping observations (e.g., all observations in a cluster) to use joint density values. See the MCMC Chapter of the NIMBLE User Manual for more details.

For more detail on the use of different predictive distributions, see Section 2.5 from Gelman et al. (2014) or Ariyo et al. (2019).

Note that based on a limited set of simulation experiments in Hug and Paciorek (2021) our tentative recommendation is that users only use marginal WAIC if also using grouping.

Joshua Hug and Christopher Paciorek

Watanabe, S. (2010). Asymptotic equivalence of Bayes cross validation and
widely applicable information criterion in singular learning theory.
*Journal of Machine Learning Research* 11: 3571-3594.

Gelman, A., Hwang, J. and Vehtari, A. (2014). Understanding predictive
information criteria for Bayesian models.
*Statistics and Computing* 24(6): 997-1016.

Ariyo, O., Quintero, A., Munoz, J., Verbeke, G. and Lesaffre, E. (2019).
Bayesian model selection in linear mixed models for longitudinal data.
*Journal of Applied Statistics* 47: 890-913.

Vehtari, A., Gelman, A. and Gabry, J. (2017). Practical Bayesian model
evaluation using leave-one-out cross-validation and WAIC.
*Statistics and Computing* 27: 1413-1432.

Hug, J.E. and Paciorek, C.J. (2021). A numerically stable online
implementation and exploration of WAIC through variations of the
predictive density, using NIMBLE. *arXiv e-print* <arXiv:2106.13359>.

`calculateWAIC`

`configureMCMC`

`buildMCMC`

`runMCMC`

`nimbleMCMC`

```r
library(nimble)

code <- nimbleCode({
  for(j in 1:J) {
    for(i in 1:n)
      y[j, i] ~ dnorm(mu[j], sd = sigma)
    mu[j] ~ dnorm(mu0, sd = tau)
  }
  sigma ~ dunif(0, 10)
  tau ~ dunif(0, 10)
})

J <- 5
n <- 10
## One group per row of y, e.g. 'y[1, 1:10]'
groups <- paste0('y[', 1:J, ', 1:', n, ']')
y <- matrix(rnorm(J*n), J, n)

## mu0 has no prior in the model code, so a fixed value is supplied here
## (mu0 = 0 is an assumption added so that the model is fully specified).
Rmodel <- nimbleModel(code, constants = list(J = J, n = n),
                      data = list(y = y),
                      inits = list(tau = 1, sigma = 1, mu0 = 0))

## Various versions of WAIC available via online calculation.
## Conditional WAIC without data grouping:
conf <- configureMCMC(Rmodel, enableWAIC = TRUE)
## Conditional WAIC with data grouping:
conf <- configureMCMC(Rmodel, enableWAIC = TRUE,
                      controlWAIC = list(dataGroups = groups))
## Marginal WAIC with data grouping:
conf <- configureMCMC(Rmodel, enableWAIC = TRUE,
                      controlWAIC = list(dataGroups = groups,
                                         marginalizeNodes = 'mu'))

## Not run:
Rmcmc <- buildMCMC(conf)
Cmodel <- compileNimble(Rmodel)
Cmcmc <- compileNimble(Rmcmc, project = Rmodel)
output <- runMCMC(Cmcmc, niter = 1000, WAIC = TRUE)
output$WAIC              # direct access
## Alternatively call via the `getWAIC` method; this doesn't require setting
## `WAIC = TRUE` in `runMCMC`
Cmcmc$getWAIC()
Cmcmc$getWAICdetails()
## End(Not run)
```
