knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
library(sbcrs)
library(rstan)
rstan::rstan_options("auto_write" = TRUE)
get_compiled_stan_model <- function(filename) {
  m <- NULL
  stan_file_loc <- here::here('inst', 'stan', filename)
  if (file.exists(stan_file_loc)) {
    m <- stan_model(file = stan_file_loc, save_dso = TRUE)
  }
  if (is.null(m)) {
    m <- stan_model(file = system.file('stan', filename, package = 'sbcrs'))
  }
  m
}
funnel <- get_compiled_stan_model('funnel.stan')
funnel_reparameterized <- get_compiled_stan_model('funnel_reparameterized.stan')

In this example, we will calibrate two versions of Neal's funnel.[^1]

[^1]: Both versions of the model are from https://mc-stan.org/docs/2_20/stan-users-guide/reparameterization-section.html

$$ p(y,x) = \mathsf{normal}(y|0,3) * \prod_{n=1}^9 \mathsf{normal}(x_n|0,\exp(y/2)) $$

When parameterized in terms of $x_n$ and $y$, this model is extremely difficult for Stan to sample from. Per the Stan User's Guide: "The probability contours are shaped like ten-dimensional funnels. The funnel’s neck is particularly sharp because of the exponential function applied to $y$." Ideally, simulation-based calibration should detect the sampling pathologies in this model.

The stan code for this model (without re-parameterization) is:

cat(readr::read_file(system.file('stan', 'funnel.stan', package = 'sbcrs')))

Compile the Stan model:

funnel <- stan_model(file = system.file('stan', 'funnel.stan', package = 'sbcrs'))

Create an SBC object called sbc1 with functions to: 1) generate simulated values of x and y, and 2) call rstan::sampling, passing in funnel as the model to be sampled.

sbc1 <- SBC$new(
  params = function(seed, data) {
    set.seed(seed + 10)
    y <- rnorm(1, 0, 3)
    x <- rnorm(9, 0, exp(y/2))
    list(y = y, x = x)
  }, 
  sampling = function(seed, data, params, modeled_data, iters) {
    sampling(funnel, data = data, seed = seed, chains = 1, iter = 2 * iters, warmup = iters)
  })

Calibration involves generating data and parameters, and sampling from a Stan model many times. To speed up this process, take advantage of all of your machine's cores.

doParallel::registerDoParallel(cores = parallel::detectCores())

Perform the calibration.

sbc1$calibrate(N = 256, L = 50)
sbc1$plot()
sbc1$plot('y')
sbc1$plot('x')
sbc1$summary()

The model is evidently poorly conditioned.

The reparameterized model is based on new primitives: x_raw and y_raw, which follow independent, standard normal distributions. From these newly defined primitives, it then builds up x and y based on their analytical definitions (see above). The stan code for the reparameterized model is:

cat(readr::read_file(system.file('stan', 'funnel_reparameterized.stan', package = 'sbcrs')))

Compile the (reparameterized) Stan model:

funnel_reparameterized <- stan_model(file = system.file('stan', 'funnel_reparameterized.stan', package = 'sbcrs'))

Create an SBC object called sbc2a with functions to: 1) generate simulated values of x_raw and y_raw, and 2) call rstan::sampling, passing in funnel_reparameterized as the model to be sampled.

sbc2a <- SBC$new(
  params = function(seed, data) {
    set.seed(seed + 10)
    y_raw <- rnorm(1, 0, 1)
    x_raw <- rnorm(9, 0, 1)
    list(y_raw = y_raw, x_raw = x_raw)
  }, 
  sampling = function(seed, data, params, modeled_data, iters) {
    sampling(funnel_reparameterized, data = data, seed = seed, chains = 1, iter = 2 * iters, warmup = iters)
  })

Perform the calibration.

sbc2a$calibrate(N = 256, L = 50)
sbc2a$plot()
sbc2a$plot('y_raw')
sbc2a$plot('x_raw')
sbc2a$summary()

Rather than comparing simulated and sampled values of x_raw and y_raw, we might instead wish to compare x and y. In the code that creates the object sbc2b, note that the params() function returns a list with named values x and y. Hence the calibration is performed on those parameters.

sbc2b <- SBC$new(
  params = function(seed, data) {
    set.seed(seed + 10)
    y_raw <- rnorm(1, 0, 1)
    x_raw <- rnorm(9, 0, 1)
    y <- 3.0 * y_raw;
    x <- exp(y/2) * x_raw;
    list(x = x, y = y)
  }, 
  sampling = function(seed, data, params, modeled_data, iters) {
    sampling(funnel_reparameterized, data = data, seed = seed, chains = 1, iter = 2 * iters, warmup = iters)
  })

Perform the calibration.

sbc2b$calibrate(N = 256, L = 50)
sbc2b$plot()
sbc2b$plot('y')
sbc2b$plot('x')
sbc2b$summary()


jasonmtroos/sbcrs documentation built on Nov. 4, 2019, 2:20 p.m.