knitr::opts_chunk$set(echo = TRUE)
This script applies the proposed ABC with Wasserstein distance, to approximate the posterior distribution in a Normal location model.
The model specifies $Y \sim \mathcal{N}(\mu, \sigma^2)$, the parameters are $(\mu, \sigma)$, the prior is standard Normal on $\mu$, and Gamma$(2,1)$ on $\sigma$. The data are generated from a Gamma$(10,5)$ distribution, with $n= 1000$ observations.
We begin by loading the package, registering multiple cores, setting the random number generator, etc.
# load package library(winference) # register parallel cores registerDoParallel(cores = detectCores()) # remove all rm(list = ls()) # apply preferences for ggplotting require(gridExtra) theme_set(theme_bw()) # set RNG seed set.seed(11)
We generate some data.
# number of observations nobservations <- 1000 # observations from a Gamma model, with mean 2 obs <- rgamma(nobservations, shape = 10, rate = 5)
We define a model.
# function to generate from prior distribution # first argument is number of desired samples # second argument contains hyper-parameters rprior <- function(N, parameters){ particles <- matrix(nrow = N, ncol = 2) particles[,1] <- rnorm(N, mean = parameters$mu_0, sd = 1/sqrt(parameters$nu)) particles[,2] <- rgamma(N, shape = parameters$alpha, rate = parameters$beta) return(particles) } # function to evaluate prior log-density # first argument is a matrix of parameters (one per row) # second argument contains hyper-parameters dprior <- function(thetas, parameters){ logdensities <- dnorm(thetas[,1], mean = parameters$mu_0, sd = 1/sqrt(parameters$nu), log = TRUE) logdensities <- logdensities + dgamma(thetas[,2], shape = parameters$alpha, rate = parameters$beta, log = TRUE) return(logdensities) } # we specify a data-generating mechanism, given a parameter theta simulate <- function(theta){ observations <- theta[1] + rnorm(nobservations) * theta[2] return(observations) } # we collect everything in a list; the "parameters" list contains the hyper-parameters target <- list(simulate = simulate, rprior = rprior, dprior = dprior, parameter_names = c("mu", "sigma"), parameters = list(mu_0=0, nu=1, alpha=2, beta=1), thetadim = 2, ydim = 1)
We define a way of calculating a distance between fake data and the observed data. Here we use the 1-Wasserstein distance, which is the distance between sorted samples.
y_obs_sorted <- sort(obs) # function to compute 1-Wasserstein distance between observed data and fake data given as argument compute_distance <- function(y_fake){ y_fake <- sort(y_fake) return(mean(abs(y_obs_sorted - y_fake))) }
We now specify algorithmic parameters in a list.
# algorithmic parameters: number of particles, number of moves per rejuvenation step, # proposal distribution, the number of steps to perform in total, the diversity parameter # used in the threshold adaptation, the number of hits to use in the r-hit kernel, # and the maximum number of trials to use in the r-hit kernel before rejecting. param_algo <- list(nthetas = 1024, nmoves = 1, proposal = mixture_rmixmod(), minimum_diversity = 0.5, R = 2, maxtrials = 1e5)
We now run the algorithm.
# now run the algorithm wsmcresults <- wsmc(compute_distance, target, param_algo, maxtime = 10)
Now we can look at the ouput, with various plots.
# we have access to all the generated particles, distances, and thresholds names(wsmcresults) # latest_y contains the latest generated data # and distances_history stores all the calculated distances, so tail(wsmcresults$distances_history, n = 1)[[1]][1] == compute_distance(wsmcresults$latest_y[[1]]) # let's plot some of the output, for instance the evolution of the distance thresholds # and the number of simulations per step grid.arrange(plot_threshold(wsmcresults), plot_ncomputed(wsmcresults), ncol=2)
We can look at the distributions of parameters.
# and let's look at the parameters themselves plot_bivariate_polygon(wsmcresults, i1 = 1, i2 = 2) grid.arrange(plot_marginal(wsmcresults, i = 1), plot_marginal(wsmcresults, i = 2), ncol=2)
Finally, we can proceed to more steps of the algorithm, and plot the resulting output.
# let's do 10 more steps wsmcresults_continued <- wsmc_continue(wsmcresults, maxtime = 10) # plot_bivariate_polygon(wsmcresults_continued, i1 = 1, i2 = 2) grid.arrange(plot_marginal(wsmcresults_continued, i = 1), plot_marginal(wsmcresults_continued, i = 2), ncol=2)
The Wasserstein ABC posterior has concentrated around a specific region of the parameter space, around the parameter $\theta$ that minimizes the Wasserstein distance between the empirical distribution of the data and the model's distribution.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.