ggs: Import MCMC samples into a ggs object that can be used by all...


View source: R/ggs.R

Description

This function manages MCMC samples from different sources (JAGS, MCMCpack, and Stan, both via rstan and via csv files) and converts them into a data frame tibble. The resulting data frame has four columns (Iteration, Chain, Parameter, value) and six attributes (nChains, nParameters, nIterations, nBurnin, nThin and description). The returned ggs object is then used as the input to the ggs_* functions, which plot the different convergence diagnostics.

Usage

ggs(
  S,
  family = NA,
  description = NA,
  burnin = TRUE,
  par_labels = NA,
  sort = TRUE,
  keep_original_order = FALSE,
  splitting = FALSE,
  inc_warmup = FALSE,
  stan_include_auxiliar = FALSE
)

Arguments

S

Either an mcmc.list object with samples from JAGS, an mcmc object with samples from MCMCpack, a stanreg object with samples from rstanarm, a brmsfit object with samples from brms, a stanfit object with samples from rstan, or a list with the filenames of csv files generated by Stan outside rstan (where the order of the files is assumed to be the order of the chains). ggmcmc guesses the type of the original object and tries to import it accordingly. rstan is not expected to be in CRAN soon, and so coda::mcmc is used to extract Stan samples instead of the more canonical rstan::extract.

family

Name of the family of parameters to process, given as a character vector or a regular expression. A family of parameters is any group of parameters that share the same name but differ in the numerical value between square brackets (such as beta[1], beta[2], etc.).
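
As a rough illustration of how a family pattern selects parameters, the sketch below applies regular expressions to a hypothetical vector of parameter names using base R only. The parameter names and the use of grepl() are assumptions made for illustration; ggs() may match names differently internally.

```r
# Hypothetical parameter names, as they might appear in a model
par_names <- c("alpha", "beta[1]", "beta[2]", "sigma")

# A family given as a plain name matches every indexed member
beta_family <- par_names[grepl("beta", par_names)]

# A regular expression can select several families at once
alpha_or_sigma <- par_names[grepl("^(alpha|sigma)", par_names)]

print(beta_family)
print(alpha_or_sigma)
```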

description

Character vector giving a short descriptive text that identifies the model.

burnin

Logical or numerical value. When TRUE (the default), the number of samples in the burnin period is taken into account, if the extracting process can guess it. Otherwise, iterations start counting from 1. If a numerical value is given, it is taken as the length of the burnin period, as supplied by the user.

par_labels

Data frame with two columns: one named "Parameter", with the names of the parameters as used in the model, and another named "Label", with the label to show for each parameter. When missing, the names passed to the model are used for representation. When there is no correspondence between a Parameter and a Label, the original name of the parameter is used. The order of the levels of the original Parameter does not change.
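
A minimal sketch of building such a data frame follows. The parameter names ("beta[1]", "beta[2]", "sigma") are assumed to match those in the package's linear example data, and the labels are invented for illustration. The call to ggs() only runs when ggmcmc is installed.

```r
# Map model parameter names to human-readable labels.
# Parameter names are an assumption about the 'linear' example data;
# labels are hypothetical.
P <- data.frame(
  Parameter = c("beta[1]", "beta[2]", "sigma"),
  Label = c("Intercept", "Slope", "Residual SD")
)

# Only run the import when ggmcmc is available
if (requireNamespace("ggmcmc", quietly = TRUE)) {
  library(ggmcmc)
  data(linear)                  # provides the coda object 's'
  S <- ggs(s, par_labels = P)   # parameters without a Label keep their name
}
```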

sort

Logical. When TRUE (the default), parameters are sorted first by family name and then by numerical value.

keep_original_order

Logical. When TRUE, parameters are sorted using the original order provided by the source software. Defaults to FALSE.

splitting

Logical. When TRUE, use the approach suggested by Gelman, Carlin, Stern, Dunson, Vehtari and Rubin (2014) Bayesian Data Analysis. 3rd edition. This implies splitting the sequences (original chains) in half and treating each half as a different Chain, therefore effectively doubling the number of chains. In this case, the first half of Chain 1 is still Chain 1, but the second half is turned into Chain 2, the first half of Chain 2 into Chain 3, and so on. Defaults to FALSE.
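
The renumbering described above can be sketched in base R on toy data. This is an illustration of the splitting scheme only, not the code that ggs() runs; the two chains and their values are hypothetical.

```r
# Toy input: 2 chains with 6 iterations each (hypothetical values)
chains <- list(1:6, 11:16)

# Split every chain in half; each half becomes its own chain
split_chains <- unlist(
  lapply(chains, function(x) {
    half <- floor(length(x) / 2)
    list(x[seq_len(half)], x[(half + 1):length(x)])
  }),
  recursive = FALSE
)

# Chains 1 and 2 come from original chain 1; chains 3 and 4 from chain 2
print(length(split_chains))
print(split_chains)
```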

inc_warmup

Logical. When dealing with stanfit objects from rstan, whether the warmup samples are included. Defaults to FALSE.

stan_include_auxiliar

Logical. Whether to include the "lp__" parameter in rstan, and "lp__", "treedepth__" and "stepsize__" in Stan running without rstan. Defaults to FALSE.

Value

D: A data frame tibble with the data arranged and ready to be used by the rest of the ggmcmc functions. The data frame has four columns (Iteration, Chain, Parameter and value) and six attributes (nChains, nParameters, nIterations, nBurnin, nThin and description). A data frame tibble is a wrapper around a local data frame: it behaves like a data frame, and its main advantage is compact printing. For more details, see as_tibble() in package dplyr.
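
The columns and attributes listed above can be inspected with base R accessors, as in the sketch below. It assumes ggmcmc is installed and uses the package's linear example data; the block does nothing otherwise.

```r
# Inspect the structure of a ggs object (runs only if ggmcmc is installed)
if (requireNamespace("ggmcmc", quietly = TRUE)) {
  library(ggmcmc)
  data(linear)                   # provides the coda object 's'
  S <- ggs(s)

  print(names(S))                # Iteration, Chain, Parameter, value
  print(attr(S, "nChains"))      # number of chains
  print(attr(S, "nParameters"))  # number of parameters
  print(attr(S, "nIterations"))  # iterations per chain
}
```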

References

Fernández-i-Marín, Xavier (2016) ggmcmc: Analysis of MCMC Samples and Bayesian Inference. Journal of Statistical Software, 70(9), 1-20. doi:10.18637/jss.v070.i09

Gelman, Carlin, Stern, Dunson, Vehtari and Rubin (2014) Bayesian Data Analysis. 3rd edition. Chapman & Hall/CRC, Boca Raton.

Examples

# Assign 'S' to be a data frame suitable for ggmcmc functions from
# a coda object called s
library(ggmcmc)
data(linear)       # loads the example data, including the coda object s
S <- ggs(s)        # s is a coda object

# Get samples from 'beta' parameters only
S <- ggs(s, family = "beta")

Example output

Loading required package: dplyr

Attaching package: 'dplyr'

The following objects are masked from 'package:stats':

    filter, lag

The following objects are masked from 'package:base':

    intersect, setdiff, setequal, union

Loading required package: tidyr
Loading required package: ggplot2
Loading required namespace: coda

ggmcmc documentation built on Feb. 10, 2021, 5:10 p.m.