discrep.est: Estimate Discrepancy in Calibration Model In laGP: Local Approximate Gaussian Process Regression

Description

Estimates the Gaussian process discrepancy/bias and/or noise term in a modularized calibration of a computer model (emulator) to field data, and returns the log likelihood or posterior probability.

Usage

discrep.est(X, Y, Yhat, d, g, bias = TRUE, clean = TRUE)

Arguments

X
a matrix or data.frame containing a design matrix of input locations for field data sites. Any columns of X without at least three unique input settings are dropped in a pre-processing step.

Y
a vector of values with length(Y) = nrow(X) containing the response from field data observations at X. A Y-vector with length(Y) = k*nrow(X), for positive integer k, can be supplied, in which case the multiple Y-values will be treated as replicates at the X-values.

Yhat
a vector with length(Yhat) = length(Y) containing predictions at X from an emulator of a computer simulation.

d
a prior or initial setting for the (single/isotropic) lengthscale parameter in a Gaussian correlation function; a (default) NULL value triggers a sensible regularization (prior) and initial setting to be generated via darg; a scalar specifies an initial value, causing darg to only generate the prior; otherwise, a list or partial list matching the output of darg can be used to specify a custom prior. In the case of a partial list, only the missing entries will be generated. Note that a default/generated list specifies MLE/MAP inference for this parameter. When specifying initial values, a vector of length nrow(XX) can be provided, giving a different initial value for each predictive location.

g
a prior or initial setting for the nugget parameter; a NULL value causes a sensible regularization (prior) and initial setting to be generated via garg; a scalar (default g = 1/1000) specifies an initial value, causing garg to only generate the prior; otherwise, a list or partial list matching the output of garg can be used to specify a custom prior. In the case of a partial list, only the missing entries will be generated. Note that a default/generated list specifies no inference for this parameter; i.e., it is fixed at its starting value, which may be appropriate for emulating deterministic computer code output.

bias
a scalar logical indicating if an (isotropic) GP discrepancy should be estimated (TRUE) or a Gaussian noise term only (FALSE).

clean
a scalar logical indicating if the C-side GP object should be freed before returning.
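
As a rough illustration of how the d and g arguments might be prepared, the sketch below builds a default lengthscale prior with darg and a nugget prior with MLE/MAP inference switched on via a partial list passed to garg. The toy design X, the responses Y, and the stand-in emulator predictions Yhat are hypothetical and exist only to make the example run.

```r
library(laGP)

set.seed(1)
## Hypothetical field design: 30 sites in 2 input dimensions
X <- matrix(runif(60), ncol = 2)
## Hypothetical emulator predictions and field responses at X
Yhat <- sin(2 * pi * X[, 1])
Y <- Yhat + 0.3 * X[, 2] + rnorm(30, sd = 0.05)

## NULL: darg generates both the prior and the starting value from X
d <- darg(NULL, X)

## Partial list: garg fills in the missing entries; mle = TRUE requests
## inference for the nugget instead of fixing it at its starting value
g <- garg(list(mle = TRUE), Y)

str(d)  # entries include start, min, max, mle and ab
str(g)
```

Passing a scalar instead, e.g. darg(0.1, X), should keep that value as the starting point while still generating the prior.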

Details

Estimates a Gaussian process (GP) discrepancy term, with an isotropic Gaussian correlation function, for the difference between a computer model output (Yhat) and field data observations (Y) at locations X. The computer model predictions would typically come from a GP emulator fit to simulation data, possibly via aGP if the computer experiment is large.

This function is used primarily as a subroutine by fcalib, which defines an objective function for optimization in order to solve the calibration problem via the method described by Gramacy et al. (2015), designed for large computer experiments. However, once calibration is performed, this function can be useful for making comparisons to other methods. Examples are provided in the fcalib documentation.

When bias = FALSE no discrepancy is estimated; only a zero-mean Gaussian error distribution is assumed.
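
To make the noise-only case concrete, the base R sketch below evaluates a zero-mean Gaussian log likelihood for the residuals Y - Yhat. The data are hypothetical, and the plug-in variance estimate (the mean squared residual) is an assumption made for illustration; it is not necessarily the exact calculation performed inside discrep.est.

```r
set.seed(2)
n <- 25
## Hypothetical emulator predictions and noisy field observations
Yhat <- rep(1, n)
Y <- Yhat + rnorm(n, sd = 0.2)

## Residuals that the zero-mean Gaussian error model must explain
resid <- Y - Yhat

## Assumed variance estimate: MLE under a zero-mean Gaussian model
s2 <- mean(resid^2)

## Log likelihood of the residuals under N(0, s2)
ll <- sum(dnorm(resid, mean = 0, sd = sqrt(s2), log = TRUE))

## With the plug-in MLE the log likelihood has a closed form
ll.closed <- -n / 2 * (log(2 * pi * s2) + 1)
all.equal(ll, ll.closed)
```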

Value

The output object consists of the output of jmleGP, applied to a GP object built with responses Y - Yhat. That object is augmented with a log likelihood, in $ll, and with a GP index, in $gpi, when clean = FALSE. When bias = FALSE the output object retains the same form as above, except with dummy zero-values, since calling jmleGP is not required.
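
A minimal sketch of a call and of the returned fields follows. The toy design, the stand-in emulator predictions Yhat, and the predictive sites XX are hypothetical. Using predGP on the retained GP index to predict the discrepancy at new sites is one possible follow-up, not something this function does itself.

```r
library(laGP)

set.seed(3)
## 40 hypothetical field sites in 2 input dimensions
X <- matrix(runif(80), ncol = 2)
## Stand-in for emulator predictions, and field data with a smooth bias
Yhat <- sin(2 * pi * X[, 1])
Y <- Yhat + 0.5 * X[, 2]^2 + rnorm(40, sd = 0.05)

## Priors and starting values for the lengthscale and nugget
d <- darg(NULL, X)
g <- garg(list(mle = TRUE), Y)

## clean = FALSE keeps the C-side GP object so it can be reused below
fit <- discrep.est(X, Y, Yhat, d, g, bias = TRUE, clean = FALSE)
fit$ll   # log likelihood (or log posterior) of the fitted discrepancy
fit$gpi  # index of the retained C-side GP object

## Predict the discrepancy at a few hypothetical new sites
XX <- matrix(runif(10), ncol = 2)
p <- predGP(fit$gpi, XX, lite = TRUE)
bias.hat <- p$mean

## Free the C-side object once it is no longer needed
deleteGP(fit$gpi)
```

Because clean = FALSE leaves the GP object allocated, the caller is responsible for the final deleteGP.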

Note

Note that in principle a separable correlation function could be used (e.g., via newGPsep and mleGPsep); however, this is not implemented at this time.
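
For readers who want to experiment with the separable alternative by hand, the sketch below fits a separable GP directly to the residuals Y - Yhat with newGPsep and mleGPsep. This is a workaround outside discrep.est, not a feature of it; the toy data are hypothetical, and fixing the nugget at 1/1000 is an arbitrary choice made for illustration.

```r
library(laGP)

set.seed(4)
## Hypothetical field design, emulator predictions and field data
X <- matrix(runif(80), ncol = 2)
Yhat <- sin(2 * pi * X[, 1])
Y <- Yhat + 0.5 * X[, 2]^2 + rnorm(40, sd = 0.05)
resid <- Y - Yhat

## Reuse the isotropic defaults as a starting point for each input
d <- darg(NULL, X)

## Separable GP on the residuals: one lengthscale per column of X
gpsepi <- newGPsep(X, resid, d = rep(d$start, ncol(X)), g = 1/1000, dK = TRUE)

## Estimate the lengthscales only; the nugget stays at its fixed value
mle <- mleGPsep(gpsepi, param = "d", tmin = d$min, tmax = d$max, ab = d$ab)
mle$d  # one estimated lengthscale per input dimension

## Free the C-side object
deleteGPsep(gpsepi)
```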

Author(s)

Robert B. Gramacy rbg@vt.edu

References

R.B. Gramacy (2016). laGP: Large-Scale Spatial Modeling via Local Approximate Gaussian Processes in R. Journal of Statistical Software, 72(1), 1-46; or see vignette("laGP")

R.B. Gramacy, D. Bingham, J.P. Holloway, M.J. Grosskopf, C.C. Kuranz, E. Rutter, M. Trantham, P.R. Drake (2015). Calibrating a large computer experiment simulating radiative shock hydrodynamics. Annals of Applied Statistics, 9(3), 1141-1168; preprint on arXiv:1410.3293 http://arxiv.org/abs/1410.3293

F. Liu, M. Bayarri and J. Berger (2009). Modularization in Bayesian analysis, with emphasis on analysis of computer models. Bayesian Analysis, 4(1), 119-150.