In cbdavis33/bcgp: Bayesian Composite Gaussian Processes

library(bcgp)
knitr::opts_chunk$set(
  echo = FALSE,
  comment = NA,
  fig.align = "center",
  fig.height = 5,
  fig.width = 8
  )

Introduction - The BCGP Model

The Bayesian Composite Gaussian Process (BCGP) model is meant as a version of Gaussian Process (GP) regression. It is intended to predict functions $y(\mathbf{x}), \mathbf{x} \in \mathcal{X}$, that, when viewed as a draw from a stochastic process, exhibit behavior consistent with non-stationarity. The "Composite" part of the BCGP model comes from the decomposition of the random process, $Y(\mathbf{x})$, into a "global" process, $Y_G(\mathbf{x})$, a "local" process, $Y_L(\mathbf{x})$, and an "error" process, $\epsilon(\mathbf{x})$: \begin{equation} Y(\mathbf{x}) = Y_G(\mathbf{x}) + Y_L(\mathbf{x}) + \epsilon(\mathbf{x}) \end{equation}

The global process is meant to be relatively smooth and to capture the overall trend (similar to, say, a polynomial in linear regression). The fact that we model the trend with a GP allows us to not have to specify a global trend function. The local process is less-smooth and makes local adjustments around the global trend, and the error process accounts for measurement error or acts as a nugget for computational stability.

Conditional on model parameters, $\boldsymbol{\Lambda}$, we assume $$Y(\mathbf{x}) | \boldsymbol{\Lambda} \sim GP\left( \beta_0, C\left(\cdot, \cdot \right) \right)$$ $\beta_0$ functions as the overall mean, and $C(\cdot, \cdot)$ is the covariance function. Let $\mathbf{Y} = (y_1, y_2, \ldots, y_n)^\top$ be the observed data, and let $\mathbf{x}$ be the observed data locations in the input space $\mathcal{X}$, a $d$-dimensional finite hyper-rectangle denoted by $[\mathbf{a}, \mathbf{b}]^d = \prod_{j=1}^d[a_j, b_j]$. Let $R(\cdot|\boldsymbol{\rho})$, $\boldsymbol{\rho} = (\rho_1, \ldots, \rho_d)$, $0 \leq \rho_j \leq 1 \;\forall \;j$, be the Gaussian correlation function such that $\mathbf{R}$, an $n \times n$ correlation matrix for the observed data, has elements $$R_{ij} = \prod_{k = 1}^{d}\rho_k^{16\left( x_{ik} - x_{jk} \right)^2}$$ This parameterization is equivalent to the more typical $$R_{ij} = e^{-\sum_{i=1}^d \theta_i \left( x_{ik} - x_{jk} \right)^2}$$ but the $\boldsymbol\rho$ parameterization eases interpretation: $\rho_i$ can be interpreted as the correlation between $Y(\mathbf{x_1})$ and $Y(\mathbf{x_2})$ when $\mathbf{x_1}$ and $\mathbf{x_2}$ differ only in the $i^{th}$ by $\frac{1}{4}$ of a unit.

The Non-Stationary Composite Model

As noted above, the composite GP can be written as the sum of a global process, $Y_G(\mathbf{x})$, a local process, $Y_L(\mathbf{x})$, and an error process, $\epsilon(\mathbf{x})$. The non-stationarity is derived from the fact that the variance of this process is expressed as $\sigma^2(\mathbf{x})$ rather than $\sigma^2$. That is, the variance of the process changes across $\mathcal{X}$, rather than the typical, stationary constant variance. The model parameters are $\Lambda = (\beta_0, w, \boldsymbol{\rho_G}, \boldsymbol{\rho_L}, \sigma^2_\epsilon, \sigma^2(\mathbf{x}), \mu_V, \sigma^2_V, \boldsymbol{\rho_V})^\top$. $\beta_0$ is an overall mean, $w$ is a parameter that weights the global and the local process, $\boldsymbol{\rho_G}$ contains the correlation parameters for the global process, $\boldsymbol{\rho_L}$ contains the correlation parameters for the local process, $\sigma^2_\epsilon$ is the measurement error (or nugget) variance, and $\mu_V, \sigma^2_V$, and $\boldsymbol{\rho_V}$ are the mean, variance, and correlation parameters for $\sigma^2(\mathbf{x})$, which is itself modeled as a log GP.

An example of a draw from this process follows:

paramsNSC <- create_parameter_list()
# paramsNSC$w <- 0.9
paramsNSC$w <- 0.7
paramsNSC$rhoG <- 0.7
paramsNSC$rhoL <- 0.3
paramsNSC$muV <- -0.3
paramsNSC$sig2V <- 0.5
paramsNSC$rhoV <- 0.9
paramsNSC$sig2eps <- 0
seed <- 701687
set.seed(seed)

nsc <- simulate_from_model(stationary = FALSE, composite = TRUE,
                           parameters = paramsNSC, decomposition = TRUE,
                           seed = seed)

dataPlot <- plot(nsc, process = "y", decomposition = TRUE, print = FALSE)
sigmaPlot <- plot(nsc, process = "variance", decomposition = TRUE,
                  print = FALSE)

gridExtra::grid.arrange(dataPlot, sigmaPlot, ncol = 2)

Notice that the global process (blue) captures the general trend of the true process (black) while small-scale adjustments around the trend are made by the local process (green). The variance process $\sigma^2(x)$ in this example increases with x, as we can see in the figure on the right, and we see the amplitude of the fluctuation in the global, local, and overall processes increases accordingly.

Other Models Available in this Package

The model was originally intended to always be composite and non-stationary, and the data was intended to be deterministic output from a computer simulator. The model evolved into having the ability to model data with measurement error, and this R package can fit both stationary and non-stationary and composite and non-composite models and both with and without measurement error.

The Stationary Composite Model

In the non-stationary model, the variance of the model, $\sigma^2(\mathbf{x})$ varies with $\mathbf{x}$. In a stationary model, this variance is constant. That is, the variance does not depend on $\mathbf{x}$, so the variance is the constant, $\sigma^2$. An example of a draw from this process follows:

params <- create_parameter_list(composite = TRUE, stationary = TRUE)
# params$w <- 0.9
params$w <- 0.7
params$rhoG <- 0.7
params$rhoL <- 0.3
params$sigma2 <- 1
params$sig2eps <- 0
seed <- 485268
set.seed(seed)
# nsc <- bcgp:::simulateYGL(stationary = TRUE, parameters = params)
nsc <- simulate_from_model(stationary = TRUE, composite = TRUE,
                           parameters = params, decomposition = TRUE,
                           seed = seed)

plot(nsc, process = "y", decomposition = TRUE, print = FALSE)

Non-Composite Models

For a non-composite model, the random process is decomposed into \begin{equation} Y(\mathbf{x}) = Y_R(\mathbf{x}) + \epsilon(\mathbf{x}) \end{equation}

Non-Stationary Non-Composite

An example of a draw from this process follows:

params <- create_parameter_list(composite = FALSE, stationary = FALSE)
params$rho <- 0.7
params$muV <- -0.3
params$sig2V <- 0.5
params$rhoV <- 0.9
params$sig2eps <- 0
seed <- 511378
set.seed(seed)
nsc <- simulate_from_model(composite = FALSE, stationary = FALSE, 
                           parameters = params, seed = seed)

dataPlot <- plot(nsc, process = "y", print = FALSE)
sigmaPlot <- plot(nsc, process = "variance", print = FALSE)

gridExtra::grid.arrange(dataPlot, sigmaPlot, ncol = 2)

Stationary Non-Composite

This is the most typical model when people think of GP regression. An example of a draw from this process follows:

paramsSNC <- create_parameter_list(composite = FALSE, stationary = TRUE)
paramsSNC$rho <- 0.7
paramsSNC$sigma2 <- 1
paramsSNC$sig2eps <- 0
seed <- 485268
set.seed(seed)
nsc <- simulate_from_model(composite = FALSE, stationary = TRUE, 
                           parameters = paramsSNC, seed = seed)

plot(nsc, process = "y", decomposition = FALSE, print = FALSE)

Models with Measurement Error

All of these models can have measurement error:

seed <- 485268
set.seed(seed)
paramsSNC$sig2eps <- 0.1
snc <- simulate_from_model(composite = FALSE, stationary = TRUE, 
                           parameters = paramsSNC, seed = seed)

sncPlot <- plot(snc, process = "y", decomposition = FALSE, print = FALSE) +
  ggplot2::ggtitle("Stationary Non-Composite BCGP\nWith Measurement Error") 

seed <- 701687
set.seed(seed)
paramsNSC$sig2eps <- 0.005
nsc <- simulate_from_model(stationary = FALSE, composite = TRUE,
                           parameters = paramsNSC, seed = seed)

dataY <- data.frame(x = nsc@training$x, y = nsc@training$y)
predY <- data.frame(x = nsc@test$x, y = nsc@test$y)

nscPlot <- plot(nsc, process = "y", decomposition = FALSE, print = FALSE) +
  ggplot2::ggtitle("Non-Stationary Composite BCGP\nWith Measurement Error") 


gridExtra::grid.arrange(sncPlot, nscPlot, ncol = 2)