knitr::opts_chunk$set(echo = TRUE)

Setting

This script applies the proposed ABC with Wasserstein distance, to approximate the posterior distribution in a Normal location model.

The model specifies $Y \sim \mathcal{N}(\mu, \sigma^2)$, the parameters are $(\mu, \sigma)$, the prior is standard Normal on $\mu$, and Gamma$(2,1)$ on $\sigma$. The data are generated from a Gamma$(10,5)$ distribution, with $n= 1000$ observations.

We begin by loading the package, registering multiple cores, setting the random number generator, etc.

# load package
library(winference)
# register parallel cores
registerDoParallel(cores = detectCores())
# remove all
rm(list = ls())
# apply preferences for ggplotting
require(gridExtra)
theme_set(theme_bw())
# set RNG seed
set.seed(11)

Data and model

We generate some data.

# number of observations
nobservations <- 1000
# observations from a Gamma model, with mean 2
obs <- rgamma(nobservations, shape = 10, rate = 5)

We define a model.

# function to generate from prior distribution
# first argument is number of desired samples
# second argument contains hyper-parameters
rprior <- function(N, parameters){
  particles <- matrix(nrow = N, ncol = 2)
  particles[,1] <- rnorm(N, mean = parameters$mu_0, sd = 1/sqrt(parameters$nu))
  particles[,2] <- rgamma(N, shape = parameters$alpha, rate = parameters$beta)
  return(particles)
}
# function to evaluate prior log-density
# first argument is a matrix of parameters (one per row)
# second argument contains hyper-parameters
dprior <- function(thetas, parameters){
  logdensities <- dnorm(thetas[,1], mean = parameters$mu_0, sd = 1/sqrt(parameters$nu), log = TRUE)
  logdensities <- logdensities + dgamma(thetas[,2], shape = parameters$alpha, rate = parameters$beta, log = TRUE)
  return(logdensities)
}
# we specify a data-generating mechanism, given a parameter theta
simulate <- function(theta){
  observations <- theta[1] + rnorm(nobservations) * theta[2]
  return(observations)
}
# we collect everything in a list; the "parameters" list contains the hyper-parameters
target <- list(simulate = simulate, rprior = rprior, dprior = dprior,
               parameter_names = c("mu", "sigma"),
               parameters = list(mu_0=0, nu=1, alpha=2, beta=1),
               thetadim = 2, ydim = 1)

Distance calculation and Monte Carlo algorithm

We define a way of calculating a distance between fake data and the observed data. Here we use the 1-Wasserstein distance, which is the distance between sorted samples.

y_obs_sorted <- sort(obs)
# function to compute 1-Wasserstein distance between observed data and fake data given as argument
compute_distance <- function(y_fake){
  y_fake <- sort(y_fake)
  return(mean(abs(y_obs_sorted - y_fake)))
}

We now specify algorithmic parameters in a list.

# algorithmic parameters: number of particles, number of moves per rejuvenation step,
# proposal distribution, the number of steps to perform in total, the diversity parameter
# used in the threshold adaptation, the number of hits to use in the r-hit kernel,
# and the maximum number of trials to use in the r-hit kernel before rejecting.
param_algo <- list(nthetas = 1024, nmoves = 1, proposal = mixture_rmixmod(),
                   minimum_diversity = 0.5, R = 2, maxtrials = 1e5)

We now run the algorithm.

# now run the algorithm
wsmcresults <- wsmc(compute_distance, target, param_algo, maxtime = 10)

Now we can look at the ouput, with various plots.

# we have access to all the generated particles, distances, and thresholds
names(wsmcresults)
# latest_y contains the latest generated data
# and distances_history stores all the calculated distances, so
tail(wsmcresults$distances_history, n = 1)[[1]][1] == compute_distance(wsmcresults$latest_y[[1]])
# let's plot some of the output, for instance the evolution of the distance thresholds
# and the number of simulations per step
grid.arrange(plot_threshold(wsmcresults), plot_ncomputed(wsmcresults), ncol=2)

We can look at the distributions of parameters.

# and let's look at the parameters themselves
plot_bivariate_polygon(wsmcresults, i1 = 1, i2 = 2)
grid.arrange(plot_marginal(wsmcresults, i = 1), plot_marginal(wsmcresults, i = 2), ncol=2)

Finally, we can proceed to more steps of the algorithm, and plot the resulting output.

# let's do 10 more steps
wsmcresults_continued <- wsmc_continue(wsmcresults, maxtime = 10)
#
plot_bivariate_polygon(wsmcresults_continued, i1 = 1, i2 = 2)
grid.arrange(plot_marginal(wsmcresults_continued, i = 1), plot_marginal(wsmcresults_continued, i = 2), ncol=2)

The Wasserstein ABC posterior has concentrated around a specific region of the parameter space, around the parameter $\theta$ that minimizes the Wasserstein distance between the empirical distribution of the data and the model's distribution.



pierrejacob/winference documentation built on Feb. 17, 2020, 10:28 p.m.