knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  cache = TRUE
)
library(rstan)
library(tidyverse)
library(brms)
library(rstanarm)
options(mc.cores = parallel::detectCores())
rstan_options(auto_write = TRUE)

Motivation

Why we're talking about Stan today

\includegraphics[width=\columnwidth]{part_1_files/trace_good.png}

\includegraphics[width=\columnwidth]{part_1_files/trace_bad.png}


Roadmap

Part 1: Introduction to Hamiltonian Monte Carlo and Stan

\vfill \hrule \vfill

Part 2: Alina Ferecatu on hierarchical logit and models of bounded rationality

Part 3: Hernan Bruno on multivariate Tobit and two-stage "hurdle" models

Setup

Common approaches for Bayesian models

Hamiltonian Monte Carlo

Hamiltonian mechanics

Hamiltonian equations
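In HMC, the parameter vector $\theta$ is augmented with a momentum vector $p$, and the pair evolves under the Hamiltonian $H(\theta, p) = -\log \pi(\theta) + \tfrac{1}{2} p^\top M^{-1} p$, where $M$ is the mass matrix:

\begin{align*}
\frac{d\theta}{dt} &= \frac{\partial H}{\partial p} = M^{-1} p \\
\frac{dp}{dt} &= -\frac{\partial H}{\partial \theta} = \nabla_\theta \log \pi(\theta)
\end{align*}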


[Video: one_particle.mp4]

Idealized version of Hamiltonian MC (HMC) sampling

- Take a particle and give it a random shove (momentum $p$)
  - Let it move for a while and then stop it
  - Record its position (value of the parameter vector $\theta$)
- Give it another random shove
  - Let it move for a while and then stop it
  - Record its position again
- Repeat

[Videos: hmct1.mp4, hmct2.mp4, hmct3.mp4, hmct4.mp4]

Idealized version of HMC is just that: an ideal

Discretized version of HMC


[Video: ani.mp4]
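In practice, the continuous dynamics are approximated with the leapfrog integrator, which alternates half steps for the momentum with full steps for the position. A minimal R sketch (the function and variable names are illustrative, and the standard normal example target is an assumption, not from the slides):

# Leapfrog integration of Hamiltonian dynamics with an identity mass matrix.
# grad_U returns the gradient of the potential U(theta) = -log pi(theta).
leapfrog <- function(theta, p, eps, L, grad_U) {
  p <- p - eps / 2 * grad_U(theta)            # initial half step for momentum
  for (l in seq_len(L)) {
    theta <- theta + eps * p                  # full step for position
    if (l < L) p <- p - eps * grad_U(theta)   # full step for momentum
  }
  p <- p - eps / 2 * grad_U(theta)            # final half step for momentum
  list(theta = theta, p = -p)                 # negated momentum keeps the proposal reversible
}

# Example: standard normal target, U(theta) = theta^2 / 2
grad_U <- function(theta) theta
leapfrog(theta = 0, p = rnorm(1), eps = 0.1, L = 20, grad_U = grad_U)

A full HMC transition follows such a trajectory and then applies a Metropolis accept/reject step to correct for the discretization error.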

Costs and benefits of standard HMC

Detour: What is automatic differentiation?
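One way to see the idea is forward-mode automatic differentiation with dual numbers: every intermediate value carries its derivative along with it, and each operation updates both by the chain rule. A toy R sketch (all names here are hypothetical, for illustration only):

dual  <- function(v, d = 0) list(v = v, d = d)            # value + derivative
d_add <- function(a, b) dual(a$v + b$v, a$d + b$d)        # sum rule
d_mul <- function(a, b) dual(a$v * b$v,
                             a$d * b$v + a$v * b$d)       # product rule
d_sin <- function(a) dual(sin(a$v), cos(a$v) * a$d)       # chain rule

x  <- dual(2, 1)            # seed the derivative dx/dx = 1
fx <- d_mul(x, d_sin(x))    # f(x) = x * sin(x)
fx$v                        # f(2)  = 2 * sin(2)
fx$d                        # f'(2) = sin(2) + 2 * cos(2), exact to machine precision

Stan itself uses reverse-mode automatic differentiation in its C++ math library, but the same principle of composing exact local derivatives applies.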

Comparison of HMC and RW Metropolis

[Video: hmc.rw.mp4]
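For reference, the random-walk Metropolis baseline in this comparison can be written in a few lines of R (a generic sketch, not code from the slides):

# Random-walk Metropolis with a symmetric Gaussian proposal
rw_metropolis <- function(log_post, theta0, n_iter = 5000, step = 0.5) {
  draws <- numeric(n_iter)
  draws[1] <- theta0
  for (i in 2:n_iter) {
    prop <- draws[i - 1] + rnorm(1, 0, step)          # propose a local move
    accept <- log(runif(1)) < log_post(prop) - log_post(draws[i - 1])
    draws[i] <- if (accept) prop else draws[i - 1]    # accept or stay put
  }
  draws
}

draws <- rw_metropolis(function(th) dnorm(th, log = TRUE), theta0 = 0)

Because each proposal is a small local perturbation, successive draws are highly autocorrelated, which is exactly what HMC's long, gradient-informed trajectories avoid.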

Stan

Stan for Hamiltonian Monte Carlo

An inefficient HMC sampler

[Video: badL.mp4]

Stan's NUTS sampler

Stopping before a U-turn

\centering

\includegraphics[width=.7\columnwidth]{part_1_files/nuts.png}
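In the original NUTS formulation (Hoffman and Gelman, 2014), the trajectory stops doubling when its endpoints $\theta^-$ and $\theta^+$, with momenta $p^-$ and $p^+$, begin to move toward each other:

$$(\theta^+ - \theta^-) \cdot p^- < 0 \quad \text{or} \quad (\theta^+ - \theta^-) \cdot p^+ < 0$$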

Basics of a Stan model

parameters {
  real theta;
}
model {
  theta ~ normal(0, 1);
}
sm <- stan_model(model_code = ...)  # compile the Stan program to C++
fit <- sampling(sm)                 # run the sampler and collect posterior draws
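By default, sampling() runs 4 chains of 2,000 iterations each and discards the first 1,000 iterations per chain as warmup, leaving 4,000 posterior draws of theta.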

Output from a basic Stan model

rstan_options(auto_write = TRUE)
readr::write_file(path = here::here('vignettes/part_1_files/sm1.stan'), x = 'parameters {
  real theta;
}
model {
  theta ~ normal(0, 1);
}\n\n')
sm <- stan_model(file = here::here('vignettes/part_1_files/sm1.stan'))
fit <- sampling(sm)
stan_trace(fit)

Bayesian linear regression example

Stan model for linear regression

sm2 <- '
data {
  int<lower = 0> n;
  int<lower = 0> p;
  vector[n] y;
  matrix[n, p] X;
}
parameters {
  real alpha;
  vector[p] beta;
  real<lower = 0> sigma;
}
model {
  alpha ~ normal(0, 1);
  beta ~ normal(0, 1);
  sigma ~ exponential(1);

  y ~ normal(alpha + X * beta, sigma);
}
'
readr::write_file(x = paste0(sm2, '\n\n'), path = here::here('vignettes/part_1_files/sm2.stan'))


cat(readr::read_file(here::here('vignettes/part_1_files/sm2.stan')))

Compiling and sampling from R

sm2 <- stan_model(here::here('vignettes/part_1_files/sm2.stan'))
set.seed(2)
n <- 100                               # number of observations
p <- 4                                 # number of predictors
alpha <- 1                             # true intercept
beta <- rnorm(p, 0, 1)                 # true coefficients
sigma <- rexp(1)                       # true residual sd
X <- matrix(rnorm(n * p), ncol = p)    # simulated design matrix
y <- as.vector(alpha + X %*% beta + rnorm(n, 0, sigma))
fit <- sampling(sm2, data = list(n = n, p = p, X = X, y = y))
library(rstan)
sm <- stan_model(file = 'my_model.stan')

X <- ...
y <- ...
d <- list(n = nrow(X), p = ncol(X), 
          X = X, y = y)

fit <- sampling(sm, data = d)
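Once sampling finishes, the draws can be pulled back into R. For example (a generic usage sketch):

post <- rstan::extract(fit)   # named list of draws; post$beta is iterations x p
print(fit, pars = c('alpha', 'beta', 'sigma'))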

Trace plots

stan_trace(fit)

Sample autocorrelation

stan_ac(fit)

Posterior means and intervals

stan_plot(fit)

Parts of a Stan model

functions { ... }

data { ... }

transformed data { ... }

parameters { ... }

transformed parameters { ... }

model { ... }

generated quantities { ... }
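As an illustration, a generated quantities block could add posterior predictive replicates to the linear regression model above (this block is a sketch, not part of the original sm2 model):

generated quantities {
  vector[n] y_rep;                  // one replicated data set per draw
  for (i in 1:n)
    y_rep[i] = normal_rng(alpha + X[i] * beta, sigma);
}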
# assemble the simulated response and predictors into a data frame
# with columns y, X1, ..., X4 for use with rstanarm below
d <- tibble(y = y) %>%
  bind_cols(as_tibble(X)) %>%
  rename_at(vars(starts_with('V')), ~ str_replace(., 'V', 'X'))

Stan Inside™

What if I don't want to write my own Stan code?

library(rstanarm)
fit <- stan_glm(y ~ 1 + X1 + X2 + X3 + X4, data = d,
           prior = normal(0, 1),
           prior_intercept = normal(0, 1),
           prior_aux = exponential(1))


summary(fit)
Model Info:

 function:     stan_glm
 family:       gaussian [identity]
 formula:      y ~ 1 + X1 + X2 + X3 + X4
 algorithm:    sampling
 priors:       see help('prior_summary')
 sample:       4000 (posterior sample size)
 observations: 100
 predictors:   5

Estimates:
                mean   sd     2.5%   25%    50%    75%    97.5%
(Intercept)      1.0    0.1    0.8    0.9    1.0    1.1    1.2 
X1              -0.7    0.1   -0.9   -0.8   -0.7   -0.6   -0.4 
X2               0.2    0.1    0.0    0.2    0.2    0.3    0.4 
X3               1.5    0.1    1.3    1.4    1.5    1.6    1.7 
X4              -1.1    0.1   -1.3   -1.2   -1.1   -1.1   -0.9 
sigma            1.1    0.1    1.0    1.1    1.1    1.2    1.3 
mean_PPD         1.1    0.2    0.8    1.0    1.1    1.2    1.4 
log-posterior -161.3    1.8 -165.8 -162.2 -161.0 -160.0 -158.9 


Diagnostics:
              mcse Rhat n_eff
(Intercept)   0.0  1.0  4000 
X1            0.0  1.0  4000 
X2            0.0  1.0  4000 
X3            0.0  1.0  4000 
X4            0.0  1.0  4000 
sigma         0.0  1.0  4000 
mean_PPD      0.0  1.0  4000 
log-posterior 0.0  1.0  1724 

For each parameter, mcse is Monte Carlo standard error, n_eff is a crude 
measure of effective sample size, and Rhat is the potential scale reduction 
factor on split chains (at convergence Rhat=1).

Density overlays

fit %>% as.array() %>% bayesplot::mcmc_dens_overlay()

Posterior predictive checks

pp_check(fit)

Automatic integration with loo

loo(fit)
Computed from 4000 by 100 log-likelihood matrix

         Estimate   SE
elpd_loo   -157.0  5.4
p_loo         5.5  0.7
looic       314.1 10.9
------
Monte Carlo SE of elpd_loo is 0.0.

All Pareto k estimates are good (k < 0.5).
See help('pareto-k-diagnostic') for details.

Conclusion

Why Stan is so important

Roadmap

Part 1: Introduction to Hamiltonian Monte Carlo and Stan

\vfill \hrule \vfill

Part 2: Alina Ferecatu on hierarchical logit and models of bounded rationality

Part 3: Hernan Bruno on multivariate Tobit and two-stage "hurdle" models


