In sieste/doit: Bayesian Computation using Design of Experiments-Based Interpolation Technique

knitr::opts_chunk$set(
  fig.path = paste('figure/', knitr::current_input(), '-figs/', sep=''),
  fig.width = 7,
  fig.height = 5
)

library(doit)
library(tidyverse)
library(mvtnorm)
library(viridis)
library(lhs)

suppressPackageStartupMessages(library(doit))
suppressPackageStartupMessages(library(tidyverse))
suppressPackageStartupMessages(library(mvtnorm))
suppressPackageStartupMessages(library(viridis))
suppressPackageStartupMessages(library(lhs))

The target function

We use the DoIt method to approximate a 2-dimensional bimodal density which is proportional to a weighted sum of 2 bivariate Normal distributions:

f_target = function(x) {
  2/3 * dmvnorm(x, c(0,0), diag(c(1,1))) + 
  1/3 * dmvnorm(x, c(3,3), matrix(c(2,.6, .6, 1),2,2))
}

df_true = crossing(x1 = seq(-5, 10, .1), x2 = seq(-5,10,.1)) %>%
  mutate(f = f_target(cbind(x1,x2)))

head(df_true)

ggplot(df_true) + 
  geom_raster(aes(x=x1, y=x2, fill=f)) + 
  geom_contour(aes(x=x1, y=x2, z=f), col='black', lty=2) + 
  scale_fill_viridis()

Latin hypercube design

We sample a latin hypercube design on the interval [-5,10] in both dimensions:

set.seed(123)
design = maximinLHS(n=100, k=2) %>% 
         as_data_frame %>% 
         setNames(c('x1', 'x2')) %>%
         mutate(x1 = x1 * 15 - 5, 
                x2 = x2 * 15 - 5,
                f  = f_target(cbind(x1, x2)))
head(design)

ggplot(design) + 
  geom_point(aes(x=x1, y=x2, colour=f), cex=3) + 
  scale_colour_gradient2()

Fit DoIt approximation

We use doit_estimate_w() to estimate the optimal kernel width for our design. We then use doit_fit() to estimate the parameters for the approximation, and doit_approx() to approximate the target function at the desired parameter values:

w          = doit_estimate_w(design)
doit       = doit_fit(design, w=w)
x_eval     = df_true %>% select(x1, x2)
df_approx  = doit_approx(doit, x_eval)

We plot the approximation and compare it to the target function:

plot_df = bind_rows(df_approx %>% rename(f=dens_approx) %>% 
                                  mutate(type='approx'),
                    df_true %>% mutate(type='true'))
ggplot(plot_df, aes(x=x1, y=x2)) + 
  geom_raster(aes(fill=f)) + 
  facet_wrap(~type) +
  geom_point(data=design, pch=1, colour='white') + 
  scale_colour_viridis() + scale_fill_viridis()

Even though only very few design points are in the "active region" of the target function, the approximation looks quite accurate.

We also look at the variance of the approximation:

ggplot(df_approx, aes(x=x1, y=x2)) + 
  geom_raster(aes(fill=variance)) + 
  geom_contour(aes(z=dens_approx), col='black', lty=2) + 
  geom_point(data=design, pch=1, colour='white') + 
  scale_colour_viridis() + scale_fill_viridis()

The approximation variance is highest in between design points inside the active region. These high variance regions will be targeted when adding new design points during sequential updating.

Sequential update of the design

Here we sequentially add 25 new design points, reestimating the kernel width every 5th iteration:

n_new = 25
for (jj in 1:n_new) {
  x_n        = doit_propose_new(doit)
  design_add = x_n %>% mutate(f = f_target(x_n))
  design     = bind_rows(design, design_add)
  if (jj %% 5 != 0) {
    doit = doit_update(doit, design_add)
  } else { 
    w    = doit_estimate_w(design, w_0=doit$w)
    doit = doit_fit(design, w=w)
  }
}

We plot the updated approximation and highlight the new design points:

df_approx = doit_approx(doit, x_eval)
ggplot(mapping=aes(x=x1, y=x2)) +
  geom_raster(data=df_approx, aes(fill=dens_approx)) + 
  geom_point(data=design %>% 
                  mutate(design=rep(c('latin_hypercube','sequential'), c(n()-n_new, n_new))),
             mapping=aes(pch=design, colour=design), cex=2) + 
  scale_fill_viridis()

Lastly, we use doit_marginals() to approximate the marginal densities:

df_marg = doit_marginals(doit, x_eval)
ggplot(df_marg, aes(x=theta, y=dens_approx)) + 
  geom_point() + geom_line() + facet_wrap(~par)

sieste/doit documentation built on May 9, 2019, 4:10 p.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

sieste/doit
Bayesian Computation using Design of Experiments-Based Interpolation Technique

In sieste/doit: Bayesian Computation using Design of Experiments-Based Interpolation Technique

The target function

Latin hypercube design

Fit DoIt approximation

Sequential update of the design

R Package Documentation

Browse R Packages

We want your feedback!

sieste/doit Bayesian Computation using Design of Experiments-Based Interpolation Technique

In sieste/doit: Bayesian Computation using Design of Experiments-Based Interpolation Technique

The target function

Latin hypercube design

Fit DoIt approximation

Sequential update of the design

R Package Documentation

Browse R Packages

We want your feedback!

sieste/doit
Bayesian Computation using Design of Experiments-Based Interpolation Technique