MCMCsim: Run a Markov Chain Monte Carlo simulation

View source: R/MCMCsim.R

MCMCsimR Documentation

Run a Markov Chain Monte Carlo simulation

Description

Given a sampler object this function runs a MCMC simulation and stores the posterior draws. A sampler object for a wide class of multilevel models can be created using create_sampler, but users can also define their own sampler functions, see below. MCMCsim allows to choose the parameters for which simulation results must be stored. It is possible to define derived quantities that will also be stored. To save memory, it is also possible to only store Monte Carlo means/standard errors for some large vector parameters, say. Another way to use less memory is to save the simulation results of large vector parameters to file. For parameters specified in plot.trace trace plots or pair plots of multiple parameters are displayed during the simulation.

Usage

MCMCsim(
  sampler,
  from.prior = FALSE,
  n.iter = 1000L,
  n.chain = 3L,
  thin = 1L,
  burnin = if (from.prior) 0L else 250L,
  start = NULL,
  store,
  store.all = FALSE,
  pred = NULL,
  store.mean,
  store.sds = FALSE,
  to.file = NULL,
  filename = "MCdraws_",
  write.single.prec = FALSE,
  verbose = TRUE,
  n.progress = n.iter%/%10L,
  trace.convergence = NULL,
  stop.on.convergence = FALSE,
  convergence.bound = 1.05,
  plot.trace = NULL,
  add.to.plot = TRUE,
  plot.type = "l",
  n.cores = 1L,
  cl = NULL,
  seed = NULL,
  export = NULL
)

Arguments

sampler

sampler object created by create_sampler.

from.prior

whether to sample from the prior. By default from.prior=FALSE and samples are taken from the posterior.

n.iter

number of draws after burnin.

n.chain

number of independent chains.

thin

only every thin'th draw is kept.

burnin

number of draws to discard at the beginning of each chain.

start

an optional function to generate starting values or a list containing for each chain a named list of starting values. It may be used to provide starting values for some or all parameters. The sampler object's own start function, if it exists, is called to generate any starting values not provided by the user.

store

vector of names of parameters to store MCMC draws for. By default, simulations are stored for all parameters returned by sampler$store_default.

store.all

if TRUE simulation vectors of all parameters returned by the sampling function of sampler will be stored. The default is FALSE, and in that case only simulations for the parameters named in store are stored.

pred

list of character strings defining derived quantities to be computed (and stored) for each draw.

store.mean

vector of names of parameters for which only the mean (per chain) is to be stored. This may be useful for large vector parameters (e.g. regression residuals) for which storing complete MCMC output would use too much memory. The function sampler$store_mean_default exists it provides the default.

store.sds

if TRUE store for all parameters in store.mean, besides the mean, also the standard deviation. Default is FALSE.

to.file

vector of names of parameters to write to file.

filename

name of file to write parameter draws to. Each named parameter is written to a separate file, named filename_parametername.

write.single.prec

Whether to write to file in single precision. Default is FALSE.

verbose

if FALSE no output is sent to the screen during the simulation. TRUE by default.

n.progress

print iteration number and diagnostics and update plots after so many iterations.

trace.convergence

vector of names of parameters for which Gelman-Rubin R-hat diagnostics are printed to the screen every n.progress iterations.

stop.on.convergence

if TRUE stop the simulation if the R-hat diagnostics for all parameters in trace.convergence are less than convergence.bound.

convergence.bound

threshold used with stop.on.convergence.

plot.trace

character vector of parameter names for which to plot draws during the simulation. For one or two parameters trace plots will be shown, and if more parameters are specified the results will be displayed in a pairs plot. For vector parameters a specific component can be selected using brackets, e.g. "beta[2]".

add.to.plot

if TRUE the plot is updated every n.progress iterations, otherwise a new plot (with new scales) is created after every n.progress iterations.

plot.type

default is "l" (lines).

n.cores

the number of cpu cores to use. Default is one, i.e. no parallel computation. If an existing cluster cl is provided, n.cores will be set to the number of workers in that cluster.

cl

an existing cluster can be passed for parallel computation. If NULL and n.cores > 1, a new cluster is created.

seed

a random seed (integer). For parallel computation it is used to independently seed RNG streams for all workers.

export

a character vector with names of objects to export to the workers. This may be needed for parallel execution if expressions in pred depend on global variables.

Details

A sampler object is an environment containing data and functions to use for sampling. The following elements of the sampler object are used by MCMCsim:

start

function to generate starting values.

draw

function to draw samples, typically from a full conditional posterior distribution.

rprior

function to draw from a prior distribution.

coef.names

list of vectors of parameter coefficient names, for vector parameters.

MHpars

vector of names of parameters that are sampled using a Metropolis-Hastings (MH) sampler; acceptance rates are kept for these parameters.

adapt

function of acceptance rates of MHpars to adapt MH-kernel, called every 100 iterations during the burn-in period.

Value

An object of class mcdraws containing posterior draws as well as some meta information.

Examples

# 1. create a sampler function
sampler <- new.env()
sampler$draw <- function(p) list(x=rnorm(1L), y=runif(1L))
# 2. do the simulation
sim <- MCMCsim(sampler, store=c("x", "y"))
str(sim)
summary(sim)

# example that requires start values or a start function
sampler$draw <- function(p) list(x=rnorm(1L), y=p$x * runif(1L))
sampler$start <- function(p) list(x=rnorm(1L), y=runif(1L))
sim <- MCMCsim(sampler, store=c("x", "y"))
summary(sim)
plot(sim, c("x", "y"))

# example using create_sampler; first generate some data
n <- 100
dat <- data.frame(x=runif(n), f=as.factor(sample(1:4, n, replace=TRUE)))
gd <- generate_data(~ reg(~ x + f, prior=pr_normal(precision=1), name="beta"), data=dat)
dat$y <- gd$y
sampler <- create_sampler(y ~ x + f, data=dat)
sim <- MCMCsim(sampler, burnin=100, n.iter=400, n.chain=2)
(summary(sim))
gd$pars


mcmcsae documentation built on Oct. 11, 2023, 1:06 a.m.