# Amwg: Turn a non-adaptive Metropolis sampler into an adaptive... In kurtis-s/overature: Tools for Writing MCMC

## Description

Given a non-adaptive sampler of the form f(..., s), `Amwg` will return a function g(...) that automatically adapts the Metropolis proposal standard deviation s to try and achieve a target acceptance rate.

## Usage

 `1` ```Amwg(f, s, batch.size = 50, target = 0.44, DeltaN, stop.after = NA) ```

## Arguments

 `f` non-adaptive Metropolis sampler of the form f(..., s) `s` initial value for the Metropolis proposal SD `batch.size` number of iterations before proposal SD is adapted `target` target acceptance rate `DeltaN` function of the form f(n) which returns the adaptation amount based on the number of elapsed iterations, n. Defaults to δ(n) = min(0.01, n^{-1/2}) `stop.after` stop adapting proposal SD after this many iterations

## Details

`Amwg` uses the Adaptive Metropolis-Within-Gibbs algorithm from Roberts & Rosenthal (2009), which re-scales the proposal standard deviation after a fixed number of MCMC iterations have elapsed. The goal of the algorithm is to achieve a target acceptance rate for the Metropolis step. After the nth batch of MCMC iterations the log of the proposal standard deviation, log(s), is increased/decreased by δ(n). log(s) is increased by δ(n) if the observed acceptance rate is more than the target acceptance rate, or decreased by δ(n) if the observed acceptance rate is less than the target acceptance rate. `Amwg` keeps track of the the acceptance rate by comparing the previously sampled value from `f` to the next value. If the two values are equal, the proposal is considered to be rejected, whereas if the two values are different the proposal is considered accepted. `Amwg` will optionally stop adapting the proposal standard deviation after `stop.after` iterations. Setting `stop.after` can be used, for example, to stop adapting the proposal standard deviation after some burn-in period. If `stop.after=NA` (the default), `Amwg` will continue to modify the proposal standard deviation throughout the entire MCMC.

`DeltaN` is set to δ(n) = min(0.01, n^{-1/2}) unless re-specified in the function call. Some care should be taken if re-specifying `DeltaN`, as the ergodicity of the chain may not be preserved if certain conditions aren't met. See Roberts & Rosenthal (2009) in the references for details.

The proposal standard deviation `s` can be either a vector or a scalar. If the initial value of `s` is a scalar, f will be treated as a sampler for a scalar, a random vector, or a joint parameter update. Alternatively, if the dimension of s is equal to the dimension of the parameters returned by f, the individual elements s will be treated as individual proposal standard deviations for the elements returned by f. This functionality can be used, for example, if f samples each of its returned elements individually, updating each element using a Metropolis step. See the examples for an illustration of this use case. In such settings, f should be constructed to receive s as a vector argument.

## Value

Adaptive Metropolis sampler function of the form g(...).

## References

Gareth O. Roberts & Jeffrey S. Rosenthal (2009) Examples of Adaptive MCMC, Journal of Computational and Graphical Statistics, 18:2, 349-367, doi: 10.1198/jcgs.2009.06134

## Examples

 ``` 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121``` ```# Sample from N(1, 2^2) --------------------------------------------------- LogP <- function(x) dnorm(x, 1, 2, log=TRUE) # Target distribution f <- function(x, s) { # Non-adaptive Metropolis sampler x.prop <- x + rnorm(1, 0, s) if(AcceptProp(LogP(x), LogP(x.prop))) { x <- x.prop } return(x) } s.start <- 0.1 g <- Amwg(f, s.start, batch.size=25) n.save <- 10000 Mcmc <- InitMcmc(n.save) y <- 0 x <- 0 samples <- Mcmc({ y <- f(y, s.start) # Non-adaptive x <- g(x) # Adaptive }) plot(1:n.save, samples\$x, ylim=c(-10, 10), main="Traceplots", xlab="Iteration", ylab="Value", type='l') lines(1:n.save, samples\$y, col="red") legend("bottomleft", legend=c("Adaptive", "Non-adaptive"), col=c("black", "red"), lty=1, cex=0.8) # Sample from Gamma(10, 5) ------------------------------------------------ LogP <- function(x) dgamma(x, 10, 5, log=TRUE) # Target distribution f <- function(x, s) { # Non-adaptive Metropolis sampler x.prop <- x + rnorm(1, 0, s) if(AcceptProp(LogP(x), LogP(x.prop))) { x <- x.prop } return(x) } s.start <- 10 stop.after <- 5000 # Stop changing the proposal SD after 5000 iterations g <- Amwg(f, s.start, batch.size=25, stop.after=stop.after) n.save <- 10000 Mcmc <- InitMcmc(n.save) x <- 1 samples <- Mcmc({ x <- g(x) }) hist(samples\$x[stop.after:n.save,], xlab="x", main="Gamma(10, 5)", freq=FALSE) curve(dgamma(x, 10, 5), from=0, to=max(samples\$x), add=TRUE, col="blue") # Overdispersed Poisson --------------------------------------------------- ## Likelihood: ## y_i|theta_i ~ Pois(theta_i), i=1,...,n ## Prior: ## theta_i ~ Log-Normal(mu, sigma^2) ## mu ~ Normal(m, v^2), m and v^2 fixed ## sigma^2 ~ InverseGamma(a, b), a and b fixed SampleSigma2 <- function(theta.vec, mu, a, b, n.obs) { 1/rgamma(1, a + n.obs/2, b + (1/2)*sum((log(theta.vec) - mu)^2)) } SampleMu <- function(theta.vec, sigma.2, m, v.2, n.obs) { mu.var <- (1/v.2 + n.obs/sigma.2)^(-1) mu.mean <- (m/v.2 + sum(log(theta.vec))/sigma.2) * mu.var return(rnorm(1, mu.mean, sqrt(mu.var))) } LogDTheta <- function(theta, mu, sigma.2, y) { dlnorm(theta, mu, sqrt(sigma.2), log=TRUE) + dpois(y, theta, log=TRUE) } # Non-adaptive Metropolis sampler SampleTheta <- function(theta.vec, mu, sigma.2, y.vec, n.obs, s) { theta.prop <- exp(log(theta.vec) + rnorm(n.obs, 0, s)) # Jacobians, because proposals are made on the log scale j.curr <- log(theta.vec) j.prop <- log(theta.prop) log.curr <- LogDTheta(theta.vec, mu, sigma.2, y.vec) + j.curr log.prop <- LogDTheta(theta.prop, mu, sigma.2, y.vec) + j.prop theta.vec <- ifelse(AcceptProp(log.curr, log.prop), theta.prop, theta.vec) return(theta.vec) } ## Data y.vec <- warpbreaks\$breaks n.obs <- length(y.vec) ## Setup adaptive Metropolis sampler s <- rep(1, n.obs) # s is a vector, so the acceptance rate of each component will be tracked # individually in the adaptive Metropolis sampler SampleThetaAdapative <- Amwg(SampleTheta, s) ## Set prior v.2 <- 0.05 m <- log(30) - v.2/2 a <- 1 b <- 2 ## Initialize parameters theta.vec <- y.vec mu <- m ## MCMC Mcmc <- InitMcmc(10000) samples <- Mcmc({ sigma.2 <- SampleSigma2(theta.vec, mu, a, b, n.obs) mu <- SampleMu(theta.vec, sigma.2, m, v.2, n.obs) theta.vec <- SampleThetaAdapative(theta.vec, mu, sigma.2, y.vec, n.obs) }) ```

kurtis-s/overature documentation built on Aug. 10, 2019, 10:28 p.m.