compare.samplers: Compare MCMC samplers on distributions

View source: R/compare-samplers.R

compare.samplers    R Documentation

Compare MCMC samplers on distributions

Description

Simulate a set of distributions with a set of samplers and tuning parameters.

Usage

compare.samplers(sample.size, dists, samplers, tuning = 1,
                 trace = TRUE, seed = 17, burn.in = 0.2)

Arguments

sample.size

An integer specifying how long a chain to simulate.

dists

A list of scdist objects (often generated by make.dist) specifying the probability distributions to simulate.

samplers

A list of sampler functions. See the section “Sampler calling convention”.

tuning

A numeric vector of tuning parameters.

trace

A logical indicating whether a message should be printed when a chain completes (useful for large simulations).

seed

If not NULL, the random seed is set to this value with set.seed before each chain and restored afterwards. This makes each chain individually replicable, which is useful when debugging.

burn.in

Fraction of chain to discard before computing autocorrelation time.
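
The set-and-restore behaviour described for seed can be illustrated in base R. This is a minimal sketch of the general pattern, not the package's own code; with.fixed.seed is a hypothetical helper name introduced only for this example.

```r
# Hypothetical helper (not part of SamplerCompare): run f() under a fixed
# seed, then put the caller's random number stream back as it was.
with.fixed.seed <- function(seed, f) {
  if (is.null(seed)) return(f())
  # .Random.seed only exists once the RNG has been used in this session.
  had.seed <- exists(".Random.seed", envir = globalenv(), inherits = FALSE)
  if (had.seed) old.seed <- get(".Random.seed", envir = globalenv())
  on.exit({
    if (had.seed) {
      assign(".Random.seed", old.seed, envir = globalenv())
    } else {
      rm(".Random.seed", envir = globalenv())
    }
  })
  set.seed(seed)
  f()
}

# Two runs with the same seed give identical draws.
a <- with.fixed.seed(17, function() rnorm(3))
b <- with.fixed.seed(17, function() rnorm(3))
identical(a, b)
```

Because the previous generator state is restored on exit, code that runs after the call sees the same random stream it would have seen without it.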

Details

compare.samplers runs a single Markov chain simulation of length sample.size for each combination of the elements of dists, samplers, and tuning. Each chain starts at a point generated by the initial member of the distribution object, or at a point drawn uniformly from the unit hypercube if initial is not defined. It returns a data frame with one row per simulation so that the performance of the methods can be compared on the various distributions. The simplest way to visualize the results is with the comparison.plot function.

For an example of the use of this method, see the “Introduction to SamplerCompare” vignette. For discussion of the ideas behind it, see Thompson (2010).
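
As a rough illustration of "one simulation per combination" and of the starting-point rule, the sketch below uses only base R. The names in dist.names and sampler.names are placeholders, the row order of the real result is not implied, and pick.start is a hypothetical helper that assumes initial is a function taking no arguments.

```r
# Placeholder names standing in for real distribution and sampler lists.
dist.names <- c("dist.A", "dist.B")
sampler.names <- c("sampler.1", "sampler.2", "sampler.3")
tuning <- c(0.1, 1, 10)

# One row per (distribution, sampler, tuning) combination.
runs <- expand.grid(dist = dist.names, sampler = sampler.names,
                    tuning = tuning, stringsAsFactors = FALSE)
nrow(runs)  # 2 * 3 * 3 = 18 simulations

# Hypothetical helper: choose a starting state for a distribution object.
# Assumes `initial`, when present, is a zero-argument function.
pick.start <- function(d) {
  if (is.null(d$initial)) runif(d$ndim) else d$initial()
}

x0.default <- pick.start(list(ndim = 3))
x0.custom <- pick.start(list(ndim = 3, initial = function() c(1, 2, 3)))
```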

Value

A data frame with columns dist, dist.expr, ndim, sampler, sampler.expr, tuning, act, act.025, act.975, act.y, act.y.025, act.y.975, evals, grads, cpu, err, and aborted. Each row represents a single simulation.

  • sampler and dist are the names of the sampler and distribution taken from the lists passed to compare.samplers.

  • sampler.expr and dist.expr are plotmath versions of sampler and dist. If not specified by the distribution object and sampler function, they are constructed from dist and sampler.

  • ndim is the dimension of the state space of the target distribution.

  • tuning is the tuning parameter for the chain.

  • act is the estimated autocorrelation time, taken over all parameters of the simulation; see ar.act. This is more accurate if target.dist$mean is defined.

  • act.025 and act.975 bound a nominal 95% confidence interval for act. Since the interval is asymmetric, a standard error is not sufficient.

  • act.y, act.y.025, and act.y.975 are an estimate and endpoints for a nominal 95% confidence interval for the autocorrelation time of the log density. These are more accurate if target.dist$mean.log.dens is defined.

  • evals and grads are the mean numbers of log-density and gradient evaluations per observation.

  • cpu is the number of processor seconds used per observation.

  • err is the two-norm of the difference between the estimated mean and the true mean. It is NA if the distribution does not specify a true mean.

  • aborted is a logical indicating whether the simulation returned fewer rows than requested.
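
The err and aborted columns, and the effect of a burn-in fraction, can be worked through on a toy chain. This is a sketch in base R under stated assumptions: the matrix X and true.mean are made up, and the exact rounding the package applies when discarding the burn-in is not specified here, so floor is only one plausible choice.

```r
# Toy chain: 5 observations of a 2-dimensional state (made-up values).
X <- matrix(c(0, 1, 2, 3, 4,
              2, 2, 2, 2, 2), ncol = 2)
true.mean <- c(1, 2)
sample.size <- 5

# err: two-norm of (estimated mean - true mean).
# The column means are (2, 2), so the difference is (1, 0).
err <- sqrt(sum((colMeans(X) - true.mean)^2))

# aborted: did the sampler return fewer rows than requested?
aborted <- nrow(X) < sample.size

# Discard the leading burn.in fraction of the chain (rounding by floor is
# an assumption of this sketch).
burn.in <- 0.2
keep <- seq(floor(burn.in * nrow(X)) + 1, nrow(X))
X.kept <- X[keep, , drop = FALSE]
```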

Sampler calling convention

Sampler functions passed to compare.samplers should be of the form:

sampler(target.dist, x0, sample.size, tuning)

target.dist is a scdist object representing the distribution to sample from; see make.dist for more information on these. x0 is the initial state of the chain; it must be a numeric vector of length target.dist$ndim. sample.size is the desired length of the chain, passed down from compare.samplers. tuning is a scalar tuning parameter from the vector passed to compare.samplers.

Sampler functions should return a list with elements X, evals, and (optionally) grads. X should be a matrix with target.dist$ndim columns and sample.size rows. If for some reason it is necessary to abort the chain, returning fewer rows is acceptable. evals and grads indicate the number of calls to target.dist$log.density and target.dist$grad.log.density respectively.

Sampler functions must have a name attribute with a human-readable name for the MCMC method. If desired, they may also have a name.expression attribute containing a more nicely-formatted version of the name in plotmath format.

See the vignette “Introduction to SamplerCompare” for an example of a function that implements this interface.
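
The calling convention above can be sketched with a simple random-walk Metropolis sampler. This is an illustrative example and not one of the samplers shipped with the package; rw.metropolis.sample is a hypothetical name, and toy.dist is a plain list standing in for a scdist object, assumed to expose ndim and a one-argument log.density.

```r
# Hypothetical sampler following the convention
# sampler(target.dist, x0, sample.size, tuning).
rw.metropolis.sample <- function(target.dist, x0, sample.size, tuning = 1) {
  ndim <- target.dist$ndim
  X <- matrix(NA_real_, nrow = sample.size, ncol = ndim)
  x <- x0
  lp <- target.dist$log.density(x)
  evals <- 1  # count every log-density call
  for (i in seq_len(sample.size)) {
    # Gaussian proposal; tuning is used as the proposal standard deviation.
    proposal <- x + rnorm(ndim, sd = tuning)
    lp.prop <- target.dist$log.density(proposal)
    evals <- evals + 1
    # Metropolis accept/reject step on the log scale.
    if (log(runif(1)) < lp.prop - lp) {
      x <- proposal
      lp <- lp.prop
    }
    X[i, ] <- x
  }
  # grads is optional and omitted because no gradients are evaluated.
  list(X = X, evals = evals)
}
# Required human-readable name for the method.
attr(rw.metropolis.sample, "name") <- "Random-walk Metropolis"

# Stand-in for a scdist object (assumption): standard bivariate normal.
toy.dist <- list(ndim = 2, log.density = function(x) -0.5 * sum(x^2))

set.seed(17)
out <- rw.metropolis.sample(toy.dist, x0 = c(0, 0),
                            sample.size = 200, tuning = 1)
dim(out$X)
```

The returned X has sample.size rows and target.dist$ndim columns, and evals counts one initial evaluation plus one per proposal.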

References

Thompson, M. B. (2010), Graphical comparison of MCMC performance, University of Toronto Dept. of Statistics technical report no. 1010.

Thompson, M. B. (2011), “Introduction to SamplerCompare,” Journal of Statistical Software 43(12):1-10, doi:10.18637/jss.v043.i12.

See Also

make.dist, comparison.plot, ar.act, “Introduction to SamplerCompare” (vignette)


SamplerCompare documentation built on April 24, 2023, 9:09 a.m.