alcGP: Improvement statistics for sequential or local design In laGP: Local Approximate Gaussian Process Regression

Description

Calculate the active learning Cohn (ALC) statistic, mean-squared predictive error (MSPE) or expected Fisher information (fish) for a Gaussian process (GP) predictor relative to a set of reference locations, towards sequential design or local search for Gaussian process regression

Usage

 1 2 3 4 5 6 7 8 9 10 11 12 13 14 alcGP(gpi, Xcand, Xref = Xcand, parallel = c("none", "omp", "gpu"), verb = 0) alcGPsep(gpsepi, Xcand, Xref = Xcand, parallel = c("none", "omp", "gpu"), verb = 0) alcrayGP(gpi, Xref, Xstart, Xend, verb = 0) alcrayGPsep(gpsepi, Xref, Xstart, Xend, verb = 0) ieciGP(gpi, Xcand, fmin, Xref = Xcand, w = NULL, nonug = FALSE, verb = 0) ieciGPsep(gpsepi, Xcand, fmin, Xref = Xcand, w = NULL, nonug = FALSE, verb = 0) mspeGP(gpi, Xcand, Xref = Xcand, fi = TRUE, verb = 0) fishGP(gpi, Xcand) alcoptGP(gpi, Xref, start, lower, upper, maxit = 100, verb = 0) alcoptGPsep(gpsepi, Xref, start, lower, upper, maxit = 100, verb = 0) dalcGP(gpi, Xcand, Xref = Xcand, verb = 0) dalcGPsep(gpsepi, Xcand, Xref = Xcand, verb = 0)

Arguments

 gpi a C-side GP object identifier (positive integer); e.g., as returned by newGP gpsepi a C-side separable GP object identifier (positive integer); e.g., as returned by newGPsep Xcand a matrix or data.frame containing a design of candidate predictive locations at which the ALC (or other) criteria is (are) evaluated. In the context of laGP, these are the possible locations for adding into the current local design fmin for ieci* only: a scalar value indicating the value of the best minimum found so far. This is usually set to the minimum of the Z-values stored in the gpi or gpsepi reference (for deterministic/low nugget settings), or otherwise the predicted mean value at the X locations Xref a matrix or data.frame containing a design of reference locations for ALC or MSPE. I.e., these are the locations at which the reduction in variance, or mean squared predictive error, are calculated. In the context of laGP, this is the single location, or set of reference locations, around which a local design (for accurate prediction) is sought. For alcrayGP and alcrayGPsep the matrix may only have one row, i.e., one reference location parallel a switch indicating if any parallel calculation of the criteria (method) is desired. For parallel = "omp", the package must be compiled with OpenMP flags; for parallel = "gpu", the package must be compiled with CUDA flags (only the ALC criteria is supported on the GPU); see README/INSTALL in the package source for more details Xstart a 1-by-ncol(Xref) starting location for a search along a ray between Xstart and Xend Xend a 1-by-ncol(Xref) ending location for a search along a ray between Xstart and Xend fi a scalar logical indicating if the expected Fisher information portion of the expression (MSPE is essentially ALC + c(x)*EFI) should be calculated (TRUE) or set to zero (FALSE). This flag is mostly for error checking against the other functions, alcGP and fishGP, since the constituent parts are separately available via those functions w weights on the reference locations Xref for IECI calculations; IECI, which stands for Integrated Expected Conditional Improvement, is not fully documented at this time. See Gramacy & Lee (2010) for more details. nonug a scalar logical indicating if a (nonzero) nugget should be used in the predictive equations behind IECI calculations; this allows the user to toggle improvement via predictive mean uncertainty versus full predictive uncertainty. The latter (default nonug = FALSE) is the standard approach, but the former may work better (citation forthcoming) verb a non-negative integer specifying the verbosity level; verb = 0 is quiet, and larger values cause more progress information to be printed to the screen start initial values to the derivative-based search via "L-BFGS-B" within alcoptGP and alcoptGPsep; a nearest neighbor often represents a sensible initialization lower, upper bounds on the derivative-based search via "L-BFGS-B" within alcoptGP and alcoptGPsep maxit the maximum number of iterations (default maxit=100) in "L-BFGS-B" search within alcoptGP and alcoptGPsep

Details

The best way to see how these functions are used in the context of local approximation is to inspect the code in the laGP.R function.

Otherwise they are pretty self-explanatory. They evaluate the ALC, MSPE, and EFI quantities outlined in Gramacy & Apley (2015). ALC is originally due to Seo, et al. (2000). The ray-based search is described by Gramacy & Haaland (2015).

MSPE and EFI calculations are not supported for separable GP models, i.e., there are no mspeGPsep or fishGPsep functions.

alcrayGP and alcrayGPsep allow only one reference location (nrow(Xref) = 1). alcoptGP and alcoptGPsep allow multiple reference locations. These optimize a continuous ALC analog in its natural logarithm using the starting locations, bounding boxes and (stored) GP provided by gpi or gpisep, and finally snaps the solution back to the candidate grid. For details, see Sun, et al. (2017).

Note that ieciGP and ieciGPsep, which are for optimization via integrated expected conditional improvement (Gramacy & Lee, 2011) are “alpha” functionality and are not fully documented at this time.

Value

Except for alcoptGP, alcoptGPsep, dalcGP, and dalcGPsep, a vector of length nrow(Xcand) is returned filled with values corresponding to the desired statistic

 par the best set of parameters/input configuration found on optimization value the optimized objective value corresponding to output par its a two-element integer vector giving the number of calls to the object function and the gradient respectively. msg a character string giving any additional information returned by the optimizer, or NULL convergence An integer code. 0 indicates successful completion. For the other error codes, see the documentation for optim alcs reduced predictive variance averaged over the reference locations dalcs the derivative of alcs with respect to the new location

Author(s)

Robert B. Gramacy rbg@vt.edu and Furong Sun furongs@vt.edu

References

F. Sun, R.B. Gramacy, B. Haaland, E. Lawrence, and A. Walker (2019). Emulating satellite drag from large simulation experiments, SIAM/ASA Journal on Uncertainty Quantification, 7(2), pp. 720-759; preprint on arXiv:1712.00182; http://arxiv.org/abs/1712.00182

R.B. Gramacy (2016). laGP: Large-Scale Spatial Modeling via Local Approximate Gaussian Processes in R, Journal of Statistical Software, 72(1), 1-46; or see vignette("laGP")

R.B. Gramacy and B. Haaland (2016). Speeding up neighborhood search in local Gaussian process prediction, Technometrics, 58(3), pp. 294-303; preprint on arXiv:1409.0074; http://arxiv.org/abs/1409.0074

R.B. Gramacy and D.W. Apley (2015). Local Gaussian process approximation for large computer experiments, Journal of Computational and Graphical Statistics, 24(2), pp. 561-678; preprint on arXiv:1303.0383; http://arxiv.org/abs/1303.0383

R.B. Gramacy, J. Niemi, R.M. Weiss (2014). Massively parallel approximate Gaussian process regression, SIAM/ASA Journal on Uncertainty Quantification, 2(1), pp. 568-584; preprint on arXiv:1310.5182; http://arxiv.org/abs/1310.5182

R.B. Gramacy, H.K.H. Lee (2011). Optimization under unknown constraints, Valencia discussion paper, in Bayesian Statistics 9. Oxford University Press; preprint on arXiv:1004.4027; http://arxiv.org/abs/1004.4027

S. Seo, M., Wallat, T. Graepel, K. Obermayer (2000). Gaussian Process Regression: Active Data Selection and Test Point Rejection, In Proceedings of the International Joint Conference on Neural Networks, vol. III, 241-246. IEEE