fit_one_layer: MCMC sampling for one layer GP

View source: R/fit.R

fit_one_layerR Documentation

MCMC sampling for one layer GP


Conducts MCMC sampling of hyperparameters for a one layer GP. Length scale parameter theta governs the strength of the correlation and nugget parameter g governs noise. In Matern covariance, v governs smoothness.


  nmcmc = 10000,
  sep = FALSE,
  verb = TRUE,
  g_0 = 0.01,
  theta_0 = 0.1,
  true_g = NULL,
  settings = NULL,
  cov = c("matern", "exp2"),
  v = 2.5,
  vecchia = FALSE,
  m = min(25, length(y) - 1),
  ordering = NULL



vector or matrix of input locations


vector of response values


number of MCMC iterations


logical indicating whether to use separable (sep = TRUE) or isotropic (sep = FALSE) lengthscales


logical indicating whether to print iteration progress


initial value for g


initial value for theta


if true nugget is known it may be specified here (set to a small value to make fit deterministic). Note - values that are too small may cause numerical issues in matrix inversions.


hyperparameters for proposals and priors (see details)


covariance kernel, either Matern or squared exponential ("exp2")


Matern smoothness parameter (only used if cov = "matern")


logical indicating whether to use Vecchia approximation


size of Vecchia conditioning sets (only used if vecchia = TRUE)


optional ordering for Vecchia approximation, must correspond to rows of x, defaults to random


Utilizes Metropolis Hastings sampling of the length scale and nugget parameters with proposals and priors controlled by settings. When true_g is set to a specific value, the nugget is not estimated. When vecchia = TRUE, all calculations leverage the Vecchia approximation with specified conditioning set size m. Vecchia approximation is only implemented for cov = "matern".

NOTE on OpenMP: The Vecchia implementation relies on OpenMP parallelization for efficient computation. This function will produce a warning message if the package was installed without OpenMP (this is the default for CRAN packages installed on Apple machines). To set up OpenMP parallelization, download the package source code and install using the gcc/g++ compiler.

Proposals for g and theta follow a uniform sliding window scheme, e.g.

g_star <- runif(1, l * g_t / u, u * g_t / l),

with defaults l = 1 and u = 2 provided in settings. To adjust these, set settings = list(l = new_l, u = new_u). Priors on g and theta follow Gamma distributions with shape parameters (alpha) and rate parameters (beta) controlled within the settings list object. Defaults are

  • settings$alpha$g <- 1.5

  • settings$beta$g <- 3.9

  • settings$alpha$theta <- 1.5

  • settings$beta$theta <- 3.9 / 1.5

These priors are designed for x scaled to [0, 1] and y scaled to have mean 0 and variance 1. These may be adjusted using the settings input.

The output object of class gp is designed for use with continue, trim, and predict.


a list of the S3 class gp or gpvec with elements:

  • x: copy of input matrix

  • y: copy of response vector

  • nmcmc: number of MCMC iterations

  • settings: copy of proposal/prior settings

  • v: copy of Matern smoothness parameter (v = 999 indicates cov = "exp2")

  • g: vector of MCMC samples for g

  • theta: vector of MCMC samples for theta

  • tau2: vector of MLE estimates for tau2 (scale parameter)

  • ll: vector of MVN log likelihood for each Gibbs iteration

  • time: computation time in seconds


Sauer, A. (2023). Deep Gaussian process surrogates for computer experiments. *Ph.D. Dissertation, Department of Statistics, Virginia Polytechnic Institute and State University.*

Sauer, A., Gramacy, R.B., & Higdon, D. (2023). Active learning for deep Gaussian process surrogates. *Technometrics, 65,* 4-18. arXiv:2012.08015

Sauer, A., Cooper, A., & Gramacy, R. B. (2023). Vecchia-approximated deep Gaussian processes for computer experiments. *Journal of Computational and Graphical Statistics,* 1-14. arXiv:2204.02904


# Additional examples including real-world computer experiments are available at: 

# G function (
f <- function(xx, a = (c(1:length(xx)) - 1) / 2) { 
    new1 <- abs(4 * xx - 2) + a
    new2 <- 1 + a
    prod <- prod(new1 / new2)
    return((prod - 1) / 0.86)

# Training data
d <- 1 
n <- 20
x <- matrix(runif(n * d), ncol = d)
y <- apply(x, 1, f)

# Testing data
n_test <- 100
xx <- matrix(runif(n_test * d), ncol = d)
yy <- apply(xx, 1, f)

plot(xx[order(xx)], yy[order(xx)], type = "l")
points(x, y, col = 2)

# Example 1: full model (nugget fixed)
fit <- fit_one_layer(x, y, nmcmc = 2000, true_g = 1e-6)
fit <- trim(fit, 1000, 2)
fit <- predict(fit, xx, cores = 1)

# Example 2: full model (nugget estimated, EI calculated)
fit <- fit_one_layer(x, y, nmcmc = 2000)
fit <- trim(fit, 1000, 2)
fit <- predict(fit, xx, cores = 1, EI = TRUE)
par(new = TRUE) # overlay EI
plot(xx[order(xx)], fit$EI[order(xx)], type = 'l', lty = 2, 
      axes = FALSE, xlab = '', ylab = '')
# Example 3: Vecchia approximated model
fit <- fit_one_layer(x, y, nmcmc = 2000, vecchia = TRUE, m = 10) 
fit <- trim(fit, 1000, 2)
fit <- predict(fit, xx, cores = 1)

deepgp documentation built on May 29, 2024, 10 a.m.