`MCMCvis`

is an R package used to visualize, manipulate, and summarize MCMC output. MCMC output may be derived from Bayesian model output fit with JAGS, Stan, or other MCMC samplers.

The package contains five functions:

`MCMCsummary`

- summarize MCMC output for particular parameters of interest`MCMCpstr`

- summarize MCMC output and extract posterior chains for particular parameters of interest while preserving parameter structure`MCMCtrace`

- create trace and density plots of MCMC chains for particular parameters of interest`MCMCchains`

- easily extract posterior chains from MCMC output for particular parameters of interest`MCMCplot`

- create caterpillar plots from MCMC output for particular parameters of interest

`MCMCvis`

was designed to perform key functions for MCMC analysis using minimal code, in order to free up time/brainpower for interpretation of analysis results. Functions support simple and straightforward subsetting of model parameters within the calls, and produce presentable and 'publication-ready' output.

All functions in the package accept `stanfit`

objects (created with the `rstan`

package), `mcmc.list`

objects (created with the `rjags`

or `coda`

packages), `R2jags`

output (created with the `R2jags`

package), `jagsUI`

output (created with the `jagsUI`

package), and matrices of MCMC output (one chain per column - columns to be named with parameter names). The functions automatically detect the object type and proceed accordingly. Output objects can be inserted directly into the `MCMCvis`

functions as an argument.

library(rjags) #create JAGS model mf <- " model { for (i in 1:10) { y[i] ~ dnorm(mu, 0.01); } mu ~ dnorm(0, 0.01) } " data <- list(y = rnorm(10)) jm <- rjags::jags.model(textConnection(mf), data = data, n.chains = 3) jags_out <- rjags::coda.samples(jm, variable.names = 'mu', n.iter = 10)

```
library(MCMCvis)
```

#plug object directly into package function MCMCsummary(jags_out, round = 2)

## mean sd 2.5% 50% 97.5% Rhat ## mu -0.67 2.95 -6.03 -0.75 4.01 1.19

library(rstan) #create Stan model sm <- " data { real y[10]; } parameters { real mu; } model { for (i in 1:10) { y[i] ~ normal(mu, 10); } mu ~ normal(0, 10); } " stan_out <- stan(model_code = sm, data = data, iter = 5)

#plug object directly into package function MCMCsummary(stan_out, round = 2)

## mean sd 2.5% 50% 97.5% Rhat ## mu -0.26 2.82 -4.49 -0.62 4.45 0.94 ## lp__ -0.44 0.50 -1.47 -0.30 -0.04 1.21

`MCMCsummary`

is used to output summary information from MCMC output. All digits are reported by default.

We'll use the build in `mcmc.list`

object data for the examples below, but model output of any of the supported types will behave in the same way.

data(MCMC_data) MCMCsummary(MCMC_data)

The number of decimal places displayed can be specified with `round`

(except for Rhat which is always rounded to 2 decimal places). Alternatively, the significant digits displayed can be specified with `digits`

.

MCMCsummary(MCMC_data, round = 2)

Specific parameters can be specified to subset summary information. Square brackets in parameter names are ignored by default. For instance, all `alpha`

parameters can be plotted using `params = 'alpha'`

. Output using the `digits`

argument is shown.

MCMCsummary(MCMC_data, params = 'alpha', digits = 2)

Individual parameters can also be specified. For example, one `alpha`

(of many) may be specified. In this case, the square brackets should not be ignored, so that only the `alpha[1]`

parameter can be specified. Use the argument `ISB = FALSE`

to specify particular parameters that contain brackets. ISB is short for 'Ignore Square Brackets'. When `ISB = FALSE`

, the `params`

argument reads like a regular expression. Because of this, the square brackets must be escaped with `\\`

. All other regular expression syntax is accepted, as typically applied in R. A useful cheatsheet for regular expressions in R can be found here. `\\d`

can be used to specify any digits in a particular place.

MCMCsummary(MCMC_data, params = 'alpha\\[1\\]', ISB = FALSE, round = 2)

The `excl`

argument can be used to exclude any parameters. This can be used in conjunction with the `params`

argument. This is particularly useful when specifying `ISB = FALSE`

. For instance, if all `alpha`

parameters are desired **except** for `alpha[1]`

, `params = 'alpha', excl = 'alpha\\[1\\]', ISB = FALSE`

can be used. Once again, since the `params`

argument takes a regular expression, the square brackets must be escaped using `\\`

. When `ISB = TRUE`

, an exact match of the specified parameter is required (excluding the square brackets). When `ISB = FALSE`

, partial names will be matched. Leaving the default (`ISB = TRUE`

) is generally recommended for simplicity. These arguments can be used in any of the functions in the package.

MCMCsummary(MCMC_data, params = 'alpha', excl = 'alpha\\[1\\]', ISB = FALSE, round = 2)

Setting the `Rhat`

and `n.eff`

arguments to `FALSE`

can be used to avoid calculating the Rhat statistic and number of effective samples, respectively (default for `Rhat`

and `n.eff`

are `TRUE`

and `FALSE`

, respectively). Specifying `FALSE`

can greatly increase function speed with very large `mcmc.list`

objects. Values for Rhat near 1 suggest convergence (Brooks and Gelman 1998). Kruschke (2014) recommends n.eff > 10,000 for reasonably stable posterior estimates when using JAGS or BUGS. Substantially fewer effective samples are needed when using Stan due to the algorithm used.

MCMCsummary(MCMC_data, params = 'alpha', Rhat = TRUE, n.eff = TRUE, round = 2)

The `func`

argument can be used to return metrics of interest not already returned by default for `MCMCsummary`

. Input is a function to be performed on posteriors for each specified parameter. Values returned by function will be displayed as a column in the summary output (or multiple columns if the function returns more than one value). In this way, functions from other packages can be used to derive metrics of interest on posterior output. Column name(s) for function output can be specified with the `func_name`

argument.

MCMCsummary(MCMC_data, params = 'alpha', Rhat = TRUE, n.eff = TRUE, func = function(x) quantile(x, probs = c(0.01, 0.99)), func_name = c('1%', '99%'), round = 2)

`MCMCpstr`

is used to output summary information and posterior chains from MCMC output while preserving the original structure of the specified parameters (i.e., scalar, vector, matrix, array). Preserving the original structure can be helpful when plotting or summarizing parameters with multidimensional structure. Particular parameters of interest can be specified as with other functions with the `params`

argument.

Function output has two types. When `type = 'summary'`

(the default), a `list`

with calculated values for each specified parameter is returned, similar to output obtained when fitting models with the `jags.samples`

function (as opposed to `coda.samples`

) from the `rjags`

package.

The function calculates summary information only for the specified function. The function to be used is specified using the `func`

argument.

MCMCpstr(MCMC_data, params = 'alpha', func = mean, type = 'summary')

Custom functions can be specified as well. If the output length of the specified function is greater than 1 when `type = 'summary'`

, an extra dimension is added to the function output. For instance, a `vector`

becomes a `matrix`

, a `matrix`

a three dimensional `array`

, and so forth.

MCMCpstr(MCMC_data, func = function(x) quantile(x, probs = c(0.01, 0.99)))

When `type = 'chains'`

, a `list`

with posterior chain values for each specified parameter is returned. The structure of the parameter is preserved - posterior chain values are concatenated are placed in an additional dimension. For instance, output for a vector parameter will be in `matrix`

format for that element of the `list`

. Similarly, output for a matrix parameter will be in a three dimensional `array`

.

ex <- MCMCpstr(MCMC_data, type = 'chains') dim(ex$alpha)

`MCMCtrace`

is used to create trace and density plots for MCMC output. This is useful for diagnostic purposes. Particular parameters can also be specified, as with `MCMCsummary`

. Output is written to PDF by default to enable more efficient review of posteriors - this also reduces computation time. PDF output is particularly recommended for large numbers of parameters. `pdf = FALSE`

can be used to prevent output to PDF.

MCMCtrace(MCMC_data, params = c('beta\\[1\\]', 'beta\\[2\\]', 'beta\\[3\\]'), ISB = FALSE, pdf = FALSE)

Just trace plot can be plotted with `type = 'trace'`

. Just density plots can be plotted with `type = 'density'`

. Default is `type = 'both'`

which outputs both trace and density plots. Individual chains for the density plot can be output using the `ind`

argument.

MCMCtrace(MCMC_data, params = 'beta', type = 'density', ind = TRUE, pdf = FALSE)

PDF document will be output to the current working directory by default, but another directory can be specified. The `open_pdf`

argument can be used to prevent the produced pdf from opening in a viewer once generated.

MCMCtrace(MCMC_data, pdf = TRUE, open_pdf = FALSE, filename = 'MYpdf', wd = 'DIRECTORY_HERE')

`iter`

denotes how many iterations should be plotted for the chain the trace and density plots. The default is 5000, meaning that the last 5000 iterations of each chain are plotted. Remember, this is the final posterior chain, not including the specified burn-in (specified when the model was run). If less than 5000 iterations are run, the full number of iterations will be plotted.

MCMCtrace(MCMC_data, params = c('beta\\[1\\]', 'beta\\[2\\]', 'beta\\[3\\]'), ISB = FALSE, iter = 100, ind = TRUE, pdf = FALSE)

Overlap between the priors and posteriors can also be calculated by specifying the priors associated with each parameter using the `priors`

argument. This is particularly useful when investigating how large the effect of the prior is on the posterior distribution - this can be informative when trying to determine how identifiable a particular parameter is in a model.

The `priors`

argument takes a matrix as input, with each column representing a prior for a different parameter and each row representing a random draw from that prior distribution. These draws can be generated using R functions such as rnorm, rgamma, runif, etc. Parameters are plotted alphabetically - priors should be sorted accordingly. If the `priors`

argument contains only one prior and more than one parameter is specified for the `params`

argument, this prior will be used for all parameters. The number of draws for each prior should equal the number of iterations specified by \code{iter} (or total draws if less than \code{iter}) times the number of chains, though the function will automatically adjust if more or fewer iterations are specified. It is important to note that some discrepancies between MCMC samplers and R may exist regarding the parameterization of distributions - one example of this is the use of precision in JAGS but standard deviation in R for the 'second parameter' of the normal distribution.

#same prior used for all parameters PR <- rnorm(15000, 0, 32) #equivalent to dnorm(0, 0.001) in JAGS MCMCtrace(MCMC_data, params = c('beta\\[1\\]', 'beta\\[2\\]', 'beta\\[3\\]'), ISB = FALSE, priors = PR, pdf = FALSE)

Plots can be scaled to see both the posterior and the prior distribution using the `post_zm`

argument. Percent overlap can be output to an object as well using the `PPO_out`

argument.

#same prior used for all parameters PR <- rnorm(15000, 0, 32) #equivalent to dnorm(0, 0.001) in JAGS PPO <- MCMCtrace(MCMC_data, params = c('beta\\[1\\]', 'beta\\[2\\]', 'beta\\[3\\]'), ISB = FALSE, priors = PR, pdf = FALSE, post_zm = FALSE, PPO_out = TRUE) PPO

If simulated data was used to fit the model, the generating values used to simulate the data can be specified using the `gvals`

argument. This makes it possible to compare posterior estimates with the true parameter values. Generating values will be displayed as vertical dotted lines. Similar to the `priors`

argument, if one value is specified when more than one parameter is used, this one generating value will be used for all parameters. If the lines are not apparent in the

#generating values for each parameter used to simulate data GV <- c(-10, -5.5, -15) MCMCtrace(MCMC_data, params = c('beta\\[1\\]', 'beta\\[2\\]', 'beta\\[3\\]'), ISB = FALSE, gvals = GV, pdf = FALSE)

`MCMCchains`

is used to extract MCMC chains from MCMC objects. Chains can then be manipulated directly. Particular parameters can be specified as with other functions.

ex <- MCMCchains(MCMC_data, params = 'beta') #extract mean values for each parameter apply(ex, 2, mean)

Using the `mcmc.list`

argument, `MCMCchains`

can return an `mcmc.list`

object, instead of a matrix, for the specified parameters. This can be useful when saving posterior information for only a subset of parameters is desired.

ex2 <- MCMCchains(MCMC_data, params = 'beta', mcmc.list = TRUE)

`MCMCplot`

is used to create caterpillar plots from MCMC output. Points represent posterior medians. Thick lines represent 50 percent credible intervals while thin lines represent 95 percent credible intervals.

As with the other functions in the package, particular parameters of interest can be specified.

MCMCplot(MCMC_data, params = 'beta')

`ref_ovl = TRUE`

can be used to change how the posterior estimates are plotted based on the credible intervals. Parameters where 50% credible intervals overlap 0 are indicated by 'open' circles. Parameters where 50 percent credible intervals DO NOT overlap 0 AND 95 percent credible intervals DO overlap 0 are indicated by 'closed' grey circles. Parameters where 95 percent credible intervals DO NOT overlap 0 are indicated by 'closed' black circles. All median dots are represented as 'closed' black circles. A vertical reference at 0 is plotted by default. The position of this reference line can be modified with the `ref`

argument. `ref = NULL`

removes the reference line altogether.

MCMCplot(MCMC_data, params = 'beta', ref_ovl = TRUE)

Parameters can be ranked by posterior median estimates using the `rank`

argument. `xlab`

can be used to create an alternative label for the x-axis.

MCMCplot(MCMC_data, params = 'beta', rank = TRUE, xlab = 'ESTIMATE')

The orientation of the plot can also be change using the `horiz`

argument. `ylab`

is then used to specify an alternative label on the 'estimate axis'.

MCMCplot(MCMC_data, params = 'beta', rank = TRUE, horiz = FALSE, ylab = 'ESTIMATE')

Graphical parameters for x and y-axis limitation, row labels, title, median dot size, CI line thickness, axis and tick thickness, text size, color of posterior estimates, and margins can be specified.

MCMCplot(MCMC_data, params = 'beta', xlim = c(-60, 40), xlab = 'My x-axis label', main = 'MCMCvis plot', labels = c('First param', 'Second param', 'Third param', 'Fourth param', 'Fifth param', 'Sixth param'), col = 'red', labels_sz = 1.5, med_sz = 2, thick_sz = 7, thin_sz = 3, ax_sz = 4, main_text_sz = 2)

Brooks, S. P., and A. Gelman. 1998. General methods for monitoring convergence of iterative simulations. Journal of Computational and Graphical Statistics 7:434.

Kruschke, J. 2014. Doing Bayesian data analysis: A tutorial with R, JAGS, and Stan. Academic Press.

**For more information see ?MCMCvis**

**Any scripts or data that you put into this service are public.**

Embedding an R snippet on your website

Add the following code to your website.

For more information on customizing the embed code, read Embedding Snippets.