giGP: Global Inducing Point Approximate GP Regression For Many...
In liGP: Locally Induced Gaussian Process Regression

Description Usage Arguments Details Value Author(s) References Examples

Facilitates Gaussian process inference and prediction at a set of predictive locations through the implementation of an inducing point design. Optimizes hyperparameters and returns the moments of the predictive equations.

1 2	giGP(XX, X = NULL, Y = NULL, Xm, g = 1e-6, theta = NULL, nu = NULL, epsK = sqrt(.Machine$double.eps), epsQ = 1e-5, tol = .01, reps = FALSE)

`XX`	a `matrix` of out-of-sample predictive locations with `ncol(XX)=ncol(X)`
`X`	a `matrix` containing the full (large) design matrix of all input locations. If using a list for `reps`, this entry is not used.
`Y`	a vector of all responses/dependent values with `length(Y)=nrow(X)`. If using a list for `reps`, this entry is not used.
`Xm`	a `matrix` containing the inducing points design with `ncol(Xm)=ncol(X)`.
`g`	an initial setting or fixed value for the nugget parameter. In order to optimize the nugget, a list can be provided that includes: `start` – starting value to initialize the nugget `min` – minimum value in the allowable range for the nugget `max` – maximum value in the allowable range for the nugget `ab` – shape and rate parameters specifying a Gamma prior for the nugget If `ab` is not provided, a prior is not placed with the likelihood for optimization. If `min` and `max` aren't provided, the nugget is not optimized. If a single positive scalar is provided, the nugget is fixed for all predictions. If `NULL`, an initial setting is based on `garg` in the `laGP` package.
`theta`	an initial setting or fixed value for the lengthscale parameter. A (default) `NULL` value generates an initial setting based on `darg` in the `laGP` package. Similarly, a list can be provided that includes: `start` – starting value to initialize the lengthscale `min` – minimum value in the allowable range for the lengthscale `max` – maximum value in the allowable range for the lengthscale `ab` – shape and rate parameters specifying a Gamma prior for the lengthscale If `ab` is not provided, a prior is not placed with the likelihood for optimization. If `min` and `max` aren't provided, the lengthscale is not optimized. If a single positive scalar is provided, the lengthscale is fixed for all predictions.
`nu`	a positive number used to set the scale parameter; default (`NULL`) calculates the maximum likelihood estimator
`epsK`	a small positive number added to the diagonal of the correlation `matrix` of inducing points for numerically stability for inversion. The value is automatically increased if needed.
`epsQ`	a small positive number added to the diagonal of the Q `matrix` (see Cole (2021)) for numerically stability for inversion. The value is automatically increased if needed.
`tol`	a positive number to serve as the tolerance level for covergence of the log-likelihood when optimizing the hyperparameter(s) theta, g
`reps`	a notification of replicate design locations in the data set. If `TRUE`, the unique design locations are used for the calculations along with the average response for each unique design location. Alternatively, `reps` can be a list from `find_reps` in the `hetGP` package. In this case, `X` and `Y` are not used.

The function uses the likelihood and predictive equations derived in Snelson and Ghahramani (2006) to fit a induced Gaussian Process for predictions. All the data {X,Y} and inducing points Xm are used for each prediction.

The output is a list with the following components.

`mean`	a vector of predictive means of length `nrow(XX)`
`var`	a vector of predictive variances of length `nrow(XX)`
`nu`	a vector of values of the scale parameter of length `nrow(XX)`
`g`	a full version of the `g` argument
`theta`	a full version of the `theta` argument
`mle`	if `g` and/or `theta` is optimized, a `matrix` containing the values found for these parameters and the number of required iterations, for each predictive location in `XX`
`eps`	a vector of the jitter values used on the correlation matrix and Q matrix
`time`	a scalar giving the passage of wall-clock time elapsed for (substantive parts of) the calculation

D. Austin Cole austin.cole8@vt.edu

D.A. Cole, R.B. Christianson, and R.B. Gramacy (2021). Locally Induced Gaussian Processes for Large-Scale Simulation Experiments Statistics and Computing, 31(3), 1-21; preprint on arXiv:2008.12857; https://arxiv.org/abs/2008.12857

E. Snelson Z. and Ghahramani. (2006). Sparse Gaussian Processes using Pseudo-inputs Advances in Neural Information Processing Systems 18 , 1257-1264.

## "1D Toy Problem"
## Test function from Forrester et al (2008);
library(hetGP); library(lhs)
X <- matrix(seq(0, 1, length=1000))
Y <- f1d(X)
XX <- matrix(seq(0, 1, length=100))
YY <- f1d(XX)
Xm <- randomLHS(10,1)

out <- giGP(XX=XX, X=X, Y=Y, Xm=Xm, theta=.1)
par(mfrow=c(1,2))
plot(X, Y, type='l', lwd=4, ylim=c(-8, 16))
lines(XX, out$mean, lwd=3, lty=2, col=2)
points(Xm, rep(-8, 10), lwd=2, pch=3, col=3)
legend('topleft', legend=c('Test Function', 'Predicted mean', 'Inducing Points'),
       lty=c(1, 2, NA), col=1:3, pch=c(NA, NA, 3), lwd=2)

plot(XX, YY - out$mean, ylab='Error', type = 'l')

##---------------------------------------------------------------------------##
## a "computer experiment"

## Simple 2-d Herbie's Tooth function used in Cole et al (2020);
## thanks to Lee, Gramacy, Taddy, and others who have used it before
library(lhs)
## Build up a design with N=~40K locations
x <- seq(-2, 2, by=0.05)
X <- as.matrix(expand.grid(x, x))
Y <- herbtooth(X)

## Build a inducing point template centered at origin
Xm <- 4*randomLHS(30, 2) - 2

## Predictive grid with N'=400 locations,
xx <- seq(-1.975, 1.975, length=20)
XX <- as.matrix(expand.grid(xx, xx))
YY <- herbtooth(XX)


## Get the predictive equations, first with fixed lengthscale and nugget
out <- giGP(XX=XX, X=X, Y=Y, Xm=Xm, theta=.1)
## RMSE
sqrt(mean((out$mean - YY)^2))


## Refine with optimizing the lengthscale
theta_list <- list(start = .1, min = .05, max = 5)
out2 <- giGP(XX=XX, X=X, Y=Y, Xm=Xm, theta=theta_list)
## RMSE
sqrt(mean((out2$mean - YY)^2))

## Visualize the results
orig_par <- par()
par(mfrow=c(1,3))
image(xx, xx, matrix(out2$mean, nrow=length(xx)), col=heat.colors(128),
      xlab="x1", ylab="x2", main="Predicted mean")
image(xx, xx, matrix(out2$mean-YY, nrow=length(xx)), col=heat.colors(128),
      xlab="x1", ylab="x2", main="Bias")
image(xx, xx, sqrt(out2$nu) *matrix(sqrt(out$var), nrow=length(xx)),
      col=heat.colors(128), xlab="x1", ylab="x2", main="Stand. Dev.")
points(Xm, pch=3, col=3, lwd=2)
par(orig_par)