fit_three_layer  R Documentation 
Conducts MCMC sampling of hyperparameters, hidden layer
z
, and hidden layer w
for a three layer deep GP.
Separate length scale parameters theta_z
,
theta_w
, and theta_y
govern the correlation
strength of the inner layer, middle layer, and outer layer respectively.
Nugget parameter g
governs noise on the outer layer. In Matern
covariance, v
governs smoothness.
fit_three_layer(
x,
y,
nmcmc = 10000,
D = ifelse(is.matrix(x), ncol(x), 1),
verb = TRUE,
w_0 = NULL,
z_0 = NULL,
g_0 = 0.01,
theta_y_0 = 0.1,
theta_w_0 = 0.1,
theta_z_0 = 0.1,
true_g = NULL,
settings = NULL,
cov = c("matern", "exp2"),
v = 2.5,
vecchia = FALSE,
m = min(25, length(y)  1)
)
x 
vector or matrix of input locations 
y 
vector of response values 
nmcmc 
number of MCMC iterations 
D 
integer designating dimension of hidden layers, defaults to
dimension of 
verb 
logical indicating whether to print iteration progress 
w_0 
initial value for hidden layer 
z_0 
initial value for hidden layer 
g_0 
initial value for 
theta_y_0 
initial value for 
theta_w_0 
initial value for 
theta_z_0 
initial value for 
true_g 
if true nugget is known it may be specified here (set to a small value to make fit deterministic). Note  values that are too small may cause numerical issues in matrix inversions. 
settings 
hyperparameters for proposals and priors (see details) 
cov 
covariance kernel, either Matern or squared exponential
( 
v 
Matern smoothness parameter (only used if 
vecchia 
logical indicating whether to use Vecchia approximation 
m 
size of Vecchia conditioning sets (only used if

pmx = TRUE
option not yet implemented for threelayer DGP.
Maps inputs x
through hidden layer z
then hidden
layer w
to outputs y
. Conducts sampling of the hidden
layers using Elliptical Slice sampling. Utilizes Metropolis Hastings
sampling of the length scale and nugget parameters with proposals and
priors controlled by settings
. When true_g
is set to a
specific value, the nugget is not estimated. When
vecchia = TRUE
, all calculations leverage the Vecchia
approximation with specified conditioning set size m
. Vecchia
approximation is only implemented for cov = "matern"
.
NOTE on OpenMP: The Vecchia implementation relies on OpenMP parallelization for efficient computation. This function will produce a warning message if the package was installed without OpenMP (this is the default for CRAN packages installed on Apple machines). To set up OpenMP parallelization, download the package source code and install using the gcc/g++ compiler.
Proposals for g
,
theta_y
, theta_w
, and theta_z
follow a uniform
sliding window scheme, e.g.
g_star < runif(1, l * g_t / u, u * g_t / l)
,
with defaults l = 1
and u = 2
provided in settings
.
To adjust these, set settings = list(l = new_l, u = new_u)
.
Priors on g
, theta_y
, theta_w
, and theta_z
follow Gamma distributions with shape parameters (alpha
) and rate
parameters (beta
) controlled within the settings
list
object. Defaults are
settings$alpha$g < 1.5
settings$beta$g < 3.9
settings$alpha$theta_z < 1.5
settings$beta$theta_z < 3.9 / 4
settings$alpha$theta_w < 1.5
settings$beta$theta_w < 3.9 / 12
settings$alpha$theta_y < 1.5
settings$beta$theta_y < 3.9 / 6
These priors are designed for x
scaled to [0, 1] and y
scaled to have mean 0 and variance 1. These may be adjusted using the
settings
input.
In the current version, the threelayer does not have any equivalent
setting for pmx = TRUE
as in fit_two_layer
.
When w_0 = NULL
and/or z_0 = NULL
, the hidden layers are
initialized at x
(i.e. the identity mapping). The default prior
mean of the inner hidden layer z
is zero, but may be adjusted to x
using settings = list(z_prior_mean = x)
. The prior mean of the
middle hidden layer w
is set at zero is is not user adjustable.
If w_0
and/or z_0
is of dimension nrow(x)  1
by
D
, the final row is predicted using kriging. This is helpful in
sequential design when adding a new input location and starting the MCMC
at the place where the previous MCMC left off.
The output object of class dgp3
or dgp3vec
is designed for
use with continue
, trim
, and predict
.
a list of the S3 class dgp3
or dgp3vec
with elements:
x
: copy of input matrix
y
: copy of response vector
nmcmc
: number of MCMC iterations
settings
: copy of proposal/prior settings
v
: copy of Matern smoothness parameter (v = 999
indicates cov = "exp2"
)
g
: vector of MCMC samples for g
theta_y
: vector of MCMC samples for theta_y
(length
scale of outer layer)
theta_w
: matrix of MCMC samples for theta_w
(length
scale of middle layer)
theta_z
: matrix of MCMC samples for theta_z
(length
scale of inner layer)
tau2
: vector of MLE estimates for tau2
(scale
parameter of outer layer)
w
: list of MCMC samples for middle hidden layer w
z
: list of MCMC samples for inner hidden layer z
ll
: vector of MVN log likelihood of the outer layer
for reach Gibbs iteration
time
: computation time in seconds
Sauer, A. (2023). Deep Gaussian process surrogates for computer experiments.
*Ph.D. Dissertation, Department of Statistics, Virginia Polytechnic Institute and State University.*
Sauer, A., Gramacy, R.B., & Higdon, D. (2023). Active learning for deep
Gaussian process surrogates. *Technometrics, 65,* 418. arXiv:2012.08015
Sauer, A., Cooper, A., & Gramacy, R. B. (2022). Vecchiaapproximated deep
Gaussian processes for computer experiments.
*Journal of Computational and Graphical Statistics,* 114. arXiv:2204.02904
# Examples of realworld implementations are available at:
# https://bitbucket.org/gramacylab/deepgpex/
# G function (https://www.sfu.ca/~ssurjano/gfunc.html)
f < function(xx, a = (c(1:length(xx))  1) / 2) {
new1 < abs(4 * xx  2) + a
new2 < 1 + a
prod < prod(new1 / new2)
return((prod  1) / 0.86)
}
# Training data
d < 2
n < 30
x < matrix(runif(n * d), ncol = d)
y < apply(x, 1, f)
# Testing data
n_test < 100
xx < matrix(runif(n_test * d), ncol = d)
yy < apply(xx, 1, f)
i < interp::interp(xx[, 1], xx[, 2], yy)
image(i, col = heat.colors(128))
contour(i, add = TRUE)
points(x)
# Example 1: full model (nugget estimated)
fit < fit_three_layer(x, y, nmcmc = 2000)
plot(fit)
fit < trim(fit, 1000, 2)
fit < predict(fit, xx, cores = 1)
plot(fit)
# Example 2: Vecchia approximated model (nugget fixed)
# (Vecchia approximation is faster for larger data sizes)
fit < fit_three_layer(x, y, nmcmc = 2000, vecchia = TRUE,
m = 10, true_g = 1e6)
plot(fit)
fit < trim(fit, 1000, 2)
fit < predict(fit, xx, cores = 1)
plot(fit)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.