alcGP: Improvement statistics for sequential or local design
In laGP: Local Approximate Gaussian Process Regression

alcGP

R Documentation

Improvement statistics for sequential or local design

Description

Calculate the active learning Cohn (ALC) statistic, mean-squared predictive error (MSPE) or expected Fisher information (fish) for a Gaussian process (GP) predictor relative to a set of reference locations, towards sequential design or local search for Gaussian process regression

Usage

alcGP(gpi, Xcand, Xref = Xcand, parallel = c("none", "omp", "gpu"), 
      verb = 0)
alcGPsep(gpsepi, Xcand, Xref = Xcand, parallel = c("none", "omp", "gpu"), 
      verb = 0)
alcrayGP(gpi, Xref, Xstart, Xend, verb = 0)
alcrayGPsep(gpsepi, Xref, Xstart, Xend, verb = 0)
ieciGP(gpi, Xcand, fmin, Xref = Xcand, w = NULL, nonug = FALSE, verb = 0)
ieciGPsep(gpsepi, Xcand, fmin, Xref = Xcand, w = NULL, nonug = FALSE, verb = 0)
mspeGP(gpi, Xcand, Xref = Xcand, fi = TRUE, verb = 0)
fishGP(gpi, Xcand)
alcoptGP(gpi, Xref, start, lower, upper, maxit = 100, verb = 0)
alcoptGPsep(gpsepi, Xref, start, lower, upper, maxit = 100, verb = 0)
dalcGP(gpi, Xcand, Xref = Xcand, verb = 0)
dalcGPsep(gpsepi, Xcand, Xref = Xcand, verb = 0)

Arguments

`gpi`	a C-side GP object identifier (positive integer); e.g., as returned by `newGP`
`gpsepi`	a C-side separable GP object identifier (positive integer); e.g., as returned by `newGPsep`
`Xcand`	a `matrix` or `data.frame` containing a design of candidate predictive locations at which the ALC (or other) criteria is (are) evaluated. In the context of `laGP`, these are the possible locations for adding into the current local design
`fmin`	for `ieci*` only: a scalar value indicating the value of the best minimum found so far. This is usually set to the minimum of the `Z`-values stored in the `gpi` or `gpsepi` reference (for deterministic/low nugget settings), or otherwise the predicted mean value at the `X` locations
`Xref`	a `matrix` or `data.frame` containing a design of reference locations for ALC or MSPE. I.e., these are the locations at which the reduction in variance, or mean squared predictive error, are calculated. In the context of `laGP`, this is the single location, or set of reference locations, around which a local design (for accurate prediction) is sought. For `alcrayGP` and `alcrayGPsep` the `matrix` may only have one row, i.e., one reference location
`parallel`	a switch indicating if any parallel calculation of the criteria (`method`) is desired. For `parallel = "omp"`, the package must be compiled with OpenMP flags; for `parallel = "gpu"`, the package must be compiled with CUDA flags (only the ALC criteria is supported on the GPU); see README/INSTALL in the package source for more details
`Xstart`	a `1`-by-`ncol(Xref)` starting location for a search along a ray between `Xstart` and `Xend`
`Xend`	a `1`-by-`ncol(Xref)` ending location for a search along a ray between `Xstart` and `Xend`
`fi`	a scalar logical indicating if the expected Fisher information portion of the expression (MSPE is essentially `ALC + c(x)*EFI`) should be calculated (`TRUE`) or set to zero (`FALSE`). This flag is mostly for error checking against the other functions, `alcGP` and `fishGP`, since the constituent parts are separately available via those functions
`w`	weights on the reference locations `Xref` for IECI calculations; IECI, which stands for Integrated Expected Conditional Improvement, is not fully documented at this time. See Gramacy & Lee (2010) for more details.
`nonug`	a scalar logical indicating if a (nonzero) nugget should be used in the predictive equations behind IECI calculations; this allows the user to toggle improvement via predictive mean uncertainty versus full predictive uncertainty. The latter (default `nonug = FALSE`) is the standard approach, but the former may work better (citation forthcoming)
`verb`	a non-negative integer specifying the verbosity level; `verb = 0` is quiet, and larger values cause more progress information to be printed to the screen
`start`	initial values to the derivative-based search via `"L-BFGS-B"` within `alcoptGP` and `alcoptGPsep`; a nearest neighbor often represents a sensible initialization
`lower, upper`	bounds on the derivative-based search via `"L-BFGS-B"` within `alcoptGP` and `alcoptGPsep`
`maxit`	the maximum number of iterations (default `maxit=100`) in `"L-BFGS-B"` search within `alcoptGP` and `alcoptGPsep`

Details

The best way to see how these functions are used in the context of local approximation is to inspect the code in the laGP.R function.

Otherwise they are pretty self-explanatory. They evaluate the ALC, MSPE, and EFI quantities outlined in Gramacy & Apley (2015). ALC is originally due to Seo, et al. (2000). The ray-based search is described by Gramacy & Haaland (2015).

MSPE and EFI calculations are not supported for separable GP models, i.e., there are no mspeGPsep or fishGPsep functions.

alcrayGP and alcrayGPsep allow only one reference location (nrow(Xref) = 1). alcoptGP and alcoptGPsep allow multiple reference locations. These optimize a continuous ALC analog in its natural logarithm using the starting locations, bounding boxes and (stored) GP provided by gpi or gpisep, and finally snaps the solution back to the candidate grid. For details, see Sun, et al. (2017).

Note that ieciGP and ieciGPsep, which are for optimization via integrated expected conditional improvement (Gramacy & Lee, 2011) are “alpha” functionality and are not fully documented at this time.

Value

Except for alcoptGP, alcoptGPsep, dalcGP, and dalcGPsep, a vector of length nrow(Xcand) is returned filled with values corresponding to the desired statistic

`par`	the best set of parameters/input configuration found on optimization
`value`	the optimized objective value corresponding to output `par`
`its`	a two-element integer vector giving the number of calls to the object function and the gradient respectively.
`msg`	a character string giving any additional information returned by the optimizer, or `NULL`
`convergence`	An integer code. `0` indicates successful completion. For the other error codes, see the documentation for `optim`
`alcs`	reduced predictive variance averaged over the reference locations
`dalcs`	the derivative of `alcs` with respect to the new location

Author(s)

Robert B. Gramacy rbg@vt.edu and Furong Sun furongs@vt.edu

References

Gramacy, R. B. (2020) Surrogates: Gaussian Process Modeling, Design and Optimization for the Applied Sciences. Boca Raton, Florida: Chapman Hall/CRC. (See Chapter 9.) https://bobby.gramacy.com/surrogates/

F. Sun, R.B. Gramacy, B. Haaland, E. Lawrence, and A. Walker (2019). Emulating satellite drag from large simulation experiments, SIAM/ASA Journal on Uncertainty Quantification, 7(2), pp. 720-759; preprint on arXiv:1712.00182; https://arxiv.org/abs/1712.00182

R.B. Gramacy (2016). laGP: Large-Scale Spatial Modeling via Local Approximate Gaussian Processes in R, Journal of Statistical Software, 72(1), 1-46; \Sexpr[results=rd]{tools:::Rd_expr_doi("10.18637/jss.v072.i01")} or see vignette("laGP")

R.B. Gramacy and B. Haaland (2016). Speeding up neighborhood search in local Gaussian process prediction, Technometrics, 58(3), pp. 294-303; preprint on arXiv:1409.0074; https://arxiv.org/abs/1409.0074

R.B. Gramacy and D.W. Apley (2015). Local Gaussian process approximation for large computer experiments, Journal of Computational and Graphical Statistics, 24(2), pp. 561-678; preprint on arXiv:1303.0383; https://arxiv.org/abs/1303.0383

R.B. Gramacy, J. Niemi, R.M. Weiss (2014). Massively parallel approximate Gaussian process regression, SIAM/ASA Journal on Uncertainty Quantification, 2(1), pp. 568-584; preprint on arXiv:1310.5182; https://arxiv.org/abs/1310.5182

R.B. Gramacy, H.K.H. Lee (2011). Optimization under unknown constraints, Valencia discussion paper, in Bayesian Statistics 9. Oxford University Press; preprint on arXiv:1004.4027; https://arxiv.org/abs/1004.4027

S. Seo, M., Wallat, T. Graepel, K. Obermayer (2000). Gaussian Process Regression: Active Data Selection and Test Point Rejection, In Proceedings of the International Joint Conference on Neural Networks, vol. III, 241-246. IEEE

Examples

## this follows the example in predGP, but only evaluates 
## information statistics documented here

## Simple 2-d test function used in Gramacy & Apley (2015);
## thanks to Lee, Gramacy, Taddy, and others who have used it before
f2d <- function(x, y=NULL)
  {
    if(is.null(y)) {
      if(!is.matrix(x) && !is.data.frame(x)) x <- matrix(x, ncol=2)
      y <- x[,2]; x <- x[,1]
    }
    g <- function(z)
      return(exp(-(z-1)^2) + exp(-0.8*(z+1)^2) - 0.05*sin(8*(z+0.1)))
    z <- -g(x)*g(y)
  }

## design with N=441
x <- seq(-2, 2, length=11)
X <- expand.grid(x, x)
Z <- f2d(X)

## fit a GP
gpi <- newGP(X, Z, d=0.35, g=1/1000, dK=TRUE)

## predictive grid with NN=400
xx <- seq(-1.9, 1.9, length=20)
XX <- expand.grid(xx, xx)

## predict
alc <- alcGP(gpi, XX)
mspe <- mspeGP(gpi, XX)
fish <- fishGP(gpi, XX)

## visualize the result
par(mfrow=c(1,3))
image(xx, xx, matrix(sqrt(alc), nrow=length(xx)), col=heat.colors(128),
      xlab="x1", ylab="x2", main="sqrt ALC")
image(xx, xx, matrix(sqrt(mspe), nrow=length(xx)), col=heat.colors(128),
      xlab="x1", ylab="x2", main="sqrt MSPE")
image(xx, xx, matrix(log(fish), nrow=length(xx)), col=heat.colors(128),
      xlab="x1", ylab="x2", main="log fish")

## clean up
deleteGP(gpi)


## 
## Illustrating some of the other functions in a sequential design context, 
## using X and XX above
## 

## new, much bigger design
x <- seq(-2, 2, by=0.02)
X <- expand.grid(x, x)
Z <- f2d(X)

## first build a local design of size 25, see laGP documentation
out <- laGP.R(XX, start=6, end=25, X, Z, method="alc", close=10000)

## extract that design and fit GP
XC <- X[out$Xi,] ## inputs
ZC <- Z[out$Xi]  ## outputs
gpi <- newGP(XC, ZC, d=out$mle$d, g=out$g$start)

## calculate the ideal "next" location via continuous ALC optimization
alco <- alcoptGP(gpi=gpi, Xref=XX, start=c(0,0), lower=range(x)[1], upper=range(x)[2])

## alco$par is the "new" location; calculate distances between candidates (remaining
## unchosen X locations) and this solution
Xcan <- X[-out$Xi,]
D <- distance(Xcan, matrix(alco$par, ncol=ncol(Xcan))) 

## snap the new location back to the candidate set
lab <- which.min(D) 
xnew <- Xcan[lab,] 
## add xnew to the local design, remove it from Xcan, and repeat

## evaluate the derivative at this new location
dalc <- dalcGP(gpi=gpi, Xcand=matrix(xnew, nrow=1), Xref=XX)

## clean up
deleteGP(gpi)

laGP documentation built on March 31, 2023, 9:46 p.m.