gw_sampler: Goodman-Weare Markov Chain Monte Carlo sampler

Description Usage Arguments Details Value Notes See Also Examples

View source: R/gwmcmc.R

Description

gw_sampler returns posterior samples from an esemble MCMC sampler.

Usage

1
2
3
4
gw_sampler(posterior, theta.0, nsteps = 10000, nwalkers = 100,
  burn.in = 10000, update = 5, chatter = 0, thin = NULL,
  scale.init = NULL, cov.init = NULL, walk.rate = 0, atune = 2,
  stune = NULL, merge.walkers = TRUE, ...)

Arguments

posterior

(function) name of the log(densiy) function to sampling from

theta.0

(vector) initial values of the M variables

nsteps

(integer) total number of samples required

nwalkers

(integer) number of 'walkers' (default: 100; should be > M)

burn.in

(integer) the 'burn-in' period for the ensemble.

update

(integer) print a progress update after update steps

chatter

(integer) how verbose is the output? (0=quiet, 1=normal, 2=verbose)

thin

(integer) keep only every <thin> sample

scale.init

(float) variances for randomising walkers' start positions

cov.init

(matrix) specify exact covariance matrix for randomising walkers' start positions

walk.rate

(float) Fraction of moves (0-1) to make use of the 'walk move' (default: NULL)

atune

(float) set the scale size of the random jumps (default: 2.0)

merge.walkers

(logical) combine output from all walkers into one

...

(anything else) other arguments needed for posterior function

Details

A simple implementation of the ensemble MCMC sampler proposed by Goodman & Weare (2010). Given some function to compute the log of an (un-normalised) M-dimensional posterior density function (PDF) this will produce <nsteps> samples of M-dimensional vectors drawn from the PDF.

Value

A list with components

theta

(array) nsteps samples from M-dimensional posterior [nsteps rows, M columns]

func

(string) name of posterior function sampled

lpost

(vector) nsteps values of the LogPosterior density at each sample position

method

(string) sampling method used ("gwmcmc")

nchains

(integer) number of walkers used

accept.rate

(float) the fraction of proposals accepted.

fcall

Full list of the function call.

If merge.chains = FALSE then theta will be a 3D array with dimensions nchains * (nsteps/nchains) * M.

Notes

The target density function: The target density should is specified by the posterior function. In fact, this function should compute the log density given parameters theta and any other arguments, i.e. log(p) = posterior(theta, ...) where the vector of parameters theta is the first argument of the posterior function. Where the density is zero or not defined, e.g. because prior = 0 for certain values of the parameters, it should return -Inf. Otherwise, the output of posterior(theta, ...) should be a real, scalar value.

The ensemble sampler: It works by running a number nwalkers of 'walkers' through the M-dimensional space. At initialisation, all walkers begin near some start point (specified by theta.0) but have their positions randomised (using a multivariate normal distribution). The ensemble of walkers then updates each cycle. The updating is done by one of two possible moves: the 'stretch move' (handled by the seperate function stretch_move) or the 'walk move' (handled by the seperate function walk_move)

Burn-in: There is an initial period (called the 'burn-in' period) of burn.in steps that from the beginning of the chain that is discarded. This is to help remove memory of the start positions. After a few cycles the ensemble should have found regions of high posterior density even if started from a region of low density. We recommend a minimum of burn.in = 100*nwalkers to give the ensemble a chance to move into and cover the high density region(s).

The chain is then run until there are at least nsteps total samples. Each cycle produces nwalkers samples (one from each walker). So we run for nrows = ceiling(nsteps/nwalkers) cycles after the burn-in period. Any extra cycles can be discarded.

Thinning the output: The user has the option to 'thin' the output. This involves keeping only a subset of the full chain. If thin is equal to 5 then we keep only every 5th cycle. This helps remove autocorrelation between successive cycles. But modern MCMC practice would advise against this as it throws away perfectly good (if autocorrelated) samples.

Once we have enough samples the output from all walkers is merged into a single nsteps (rows) * M (columns) array.

See Also

chain_convergence, mh_sampler, chain_diagnosis, contour_matrix

Examples

1
2
3
4
5
6
7
my_posterior <- function(theta) {
  cov <- matrix(c(1,0.98,0.8,0.98,1.0,0.97,0.8,0.97,2.0), nrow = 3)
  logP <- mvtnorm::dmvnorm(theta, mean = c(-1, 2, 0), sigma = cov, log = TRUE)
  return(logP)
}
chain <- gw_sampler(my_posterior, theta.0 = c(0,0,0), 
                    nsteps = 10e4, burn.in = 1e4) 

svdataman/tonic documentation built on Aug. 2, 2019, 3:21 p.m.