mcmc_sampler: MCMC sampler
In mickwar/mwBASE: Collection of functions frequently used by myself

Description Usage Arguments Details Examples

Sample from a target distribution using Metropolis updates

1
2
3

mcmc_sampler(data, target, nparam, nmcmc = 10000, nburn = 10000,
  nthin = 1, window = 200, groups, bounds, chain_init, cand_sig,
  acc_rate = 0.234, k, display = 1000, flag = 1000)

`data`	An R object that is passed to target()
`target`	The log density function of the target distribution, which takes two arguments: the data and the parameters, i.e. target(data, params)
`nparam`	Numeric, the number of parameters (the length of params in target(data, params)). If missing, obtaining the number of parameters is attempted by looking at groups, bounds, or chain_init if they provided.
`nmcmc`	Numeric, the number of iterations the sampler should be run _post_ burn-in (this is not the total number of iterations)
`nburn`	Numeric, the number of iterations the sampler should be as a burn-in
`nthin`	Numeric, every nthin samples will be retained after the burn-in
`window`	Numeric, every window samples (during burn-in) there will be an update to the variance-covariance matrix of the candidate distribution(s) to improve the sampler performance. See autotune().
`groups`	List of numeric vectors corresponding to the indices of the parameters that should be updated jointly. See details.
`bounds`	List containing two named vectors: "lower" and "upper". Each is a numeric vector the same length as the number of parameters being updated. "lower" ides the lower bound and "upper" the upper bound. Defaults to list("lower" = rep(-Inf, nparam), upper = rep(+Inf, nparam)).
`chain_init`	Numeric, same length as the number of parameters, gives the starting point of the chain. Defaults to runif(nparam), since in many cases, possible values for the parameters are in (0, 1].
`cand_sig`	List of diagonal matrices. See details.
`acc_rate`	Numeric within (0, 1), the desired acceptance rate. Passed to autotune(). Defaults to 0.234.
`k`	Numeric greater than 1. The scale parameter passed to autotune(). Defaults to window / 50.
`display`	Numeric, the iteration count is displayed every display samples. Setting display to 0 means the count is not displayed.
`flag`	Numeric, how many attempts should be made to find the starting values when not specified? Defaults to 1000.

This function is intended to be a common implementation of Markov chain Monte Carlo (MCMC) sampling with Metropolis updates. It is assumed that each parameter is characterized by a single value (e.g. a Wishart random variable is not acceptable).

The argument target is a function of two arguments: the data and the parameters. When target(data, params) is called, a single (possibly infinite) value is returned which is the log density for those data and parameters. Caution should be taken if infinite values are returned.

The sampler is run a total of nburn + nmcmc times with the first nburn samples being discarded. The remaining samples are then ‘thinned’ so that every nthin sample is ultimately returned to the user. It is necessary that nmcmc > 0, since because of the ‘autotuning’ process (described later), every draw during the burn-in cannot be assumed to have come from the target distribution.

To understand the group parameter, suppose we have a target distribution with four parameters. Suppose we wish to update the second and third together, but the first and forth by themselves. We would then specify groups = list(1, c(2,3), 4). This will result in the function target() being called three times each iteration as well as in the creation of three proposal distributions (each normal). The first proposal distribution is used to update the first parameter, the second (which will be bivariate) is for updating the second and third parameters, and the third updates the fourth. Leaving groups missing will update all parameters together. To update each parameter separately, set groups = 0.

Proposal distributions are assumed to be the multivariate normal. To avoid the potentially tedious process of tuning the sampler, the covariance matrices for the proposal distributions are tuned automatically. During the burn-in phase only, every window iterations, the acceptance rate is calculated from the most recent samples and the covariances of the proposal distributions are adjusted accordingly. In general, if the acceptance rate is too small this means the proposal covariances are too large, so they will be decreased. The decrease is determined by the function autotune() the covariance of the samples. Similarly, if the acceptance rate is too high, the proposal covariances will be increased.

TODO: Write about cand_sig.

x = runif(100)
y = 1 + 1.5*x + rnorm(100, 0, sqrt(0.1^2))
plot(x, y)

data = list("x" = x, "y" = y)
target = function(data, params){
    x = data$x
    y = data$y
    beta0 = params[1]
    beta1 = params[2]
    sig2 = params[3]

    # Likelihood
    out = sum(dnorm(y, beta0 + beta1*x, sqrt(sig2), log = TRUE))

    # Priors
    out = out + dnorm(beta0, 0, 100, log = TRUE)
    out = out + dnorm(beta1, 0, 100, log = TRUE)
    out = out + dgamma(sig2, 1, 0.1, log = TRUE)

    return (out)
    }

### Update separately
out = mcmc_sampler(data = data, target = target, nparam = 3,
    nmcmc = 10000, nburn = 5000, nthin = 1, window = 200,
    bounds = list("lower" = c(-Inf, -Inf, 0), "upper" = c(Inf, Inf, Inf)),
    display = 100, groups = 0)

colMeans(out$accept)


### Update in groups
out = mcmc_sampler(data = data, target = target, nparam = 3,
    nmcmc = 10000, nburn = 5000, nthin = 1, window = 200,
    bounds = list("lower" = c(-Inf, -Inf, 0), "upper" = c(Inf, Inf, Inf)),
    display = 100, groups = list(c(1,2), 3))

colMeans(out$accept)


### Update jointly
out = mcmc_sampler(data = data, target = target, nparam = 3,
    nmcmc = 10000, nburn = 5000, nthin = 1, window = 200,
    bounds = list("lower" = c(-Inf, -Inf, 0), "upper" = c(Inf, Inf, Inf)),
    display = 100)

mean(out$accept)


hpds = apply(out$params, 2, hpd_mult)
plot_hpd(density(out$params[,1]), hpds[,1], "dodgerblue")
plot_hpd(density(out$params[,2]), hpds[,2], "dodgerblue")
plot_hpd(density(out$params[,3]), hpds[,3], "dodgerblue")