gekm: Fitting (Gradient-Enhanced) Kriging Models

View source: R/gekm.R

gekmR Documentation

Fitting (Gradient-Enhanced) Kriging Models

Description

Estimation of a Kriging model with or without derivatives.

Usage

gekm(formula, data, deriv, covtype = c("matern5_2", "matern3_2", "gaussian"),
	theta = NULL, tol = NULL, optimizer = c("NMKB", "L-BFGS-B"), 
	lower = NULL, upper = NULL, start = NULL, ncalls = 20, control = NULL,
	model = TRUE, x = FALSE, y = FALSE, dx = FALSE, dy = FALSE, ...)
	
## S3 method for class 'gekm'
print(x, digits = 4L, ...)

Arguments

formula

a formula that defines the regression functions. Note that only formula expressions for which the derivations are contained in the derivatives table of deriv are supported in the gradient-enhanced Kriging model. In addition, formulas containing I also work for the gradient-enhanced Kriging model, although this function is not included in the derivatives table of deriv. See derivModelMatrix for some examples of the trend specification.

data

a data.frame with named columns of n training points of dimension d. Note, all variables contained in data are used for the construction of the correlation matrix without and with derivatives.

deriv

an optional data.frame with the derivatives, whose columns should be named like those of data. If not specified, a Kriging model without derivatives is estimated.

covtype

a character to specify the covariance structure to be used. One of matern5_2, matern3_2 or gaussian. Default is matern5_2.

theta

a numeric vector of length d for the hyperparameters (optional). If not given, hyperparameters will be estimated via maximum likelihood.

tol

a tolerance for the conditional number of the correlation matrix, see blockChol for details. Default is NULL, i.e. no regularization is applied.

optimizer

an optional character that characterizes the optimization algorithms to be used for maximum likelihood estimation. See ‘Details’.

lower

an optional lower bound for the optimization of the correlation parameters.

upper

an optional upper bound for the optimization of the correlation parameters.

start

an optional vector of inital values for the optimization of the correlation parameters.

ncalls

an optional integer that defines the number of randomly selected initial values for the optimization.

control

a list of control parameters for the optimization routine. See optim or nmkb.

model

logical. Should the model frame be returned? Default is TRUE.

x

logical. Should the model matrix be returned? Default is FALSE.

y

logical. Should the response vector be returned? Default is FALSE.

dx

logical. Should the derivative of the model matrix be returned? Default is FALSE.

dy

logical. Should the derivatives of the response be returned? Default is FALSE.

digits

number of digits to be used for the print method.

...

further arguments, currently not used.

Details

Parameter estimation is performed via maximum likelihood. The optimizer argument can be used to select one of the optimization algorithms "NMBK" or "L-BFGS-B". In the case of the "L-BFGS-B", analytical gradients of the “concentrated” log likelihood are used. For one-dimensional problems, optimize is called and the algorithm selected via optimizer is ignored.

Value

gekm returns an object of class "gekm" whose underlying structure is a list containing at least the following components:

coefficients

the estimated regression coefficients.

sigma

the estimated process standard deviation.

theta

the (estimated) correlation parameters.

covtype

the name of the correlation function.

chol

(the components of) the upper triangular matrix of the Cholesky decomposition of the correlation matrix.

optimizer

the optimization algorithm.

convergence

the convergence code.

message

information from the optimizer.

logLik

the value of the negative “concentrated” log likelihood at the estimated parameters.

derivatives

TRUE if a gradient-enhanced Kriging model was adapted, otherwise FALSE.

data

the data.frame that was specified via the data argument.

deriv

if derivatives = TRUE, the data.frame with the derivatives that was specified via the deriv argument.

call

the matched call.

terms

the terms object used.

model

if requested (the default), the model frame used.

x

if requested, the model matrix.

y

if requested, the response vector.

dx

if requested, the derivatives of the model matrix.

dx

if requested, the vector of derivatives of the response.

Author(s)

Carmen van Meegen

References

Cressie, N. A. C. (1993). Statistics for Spartial Data. John Wiley & Sons. \Sexpr[results=rd]{tools:::Rd_expr_doi("10.1002/9781119115151")}.

Koehler, J. and Owen, A. (1996). Computer Experiments. In Ghosh, S. and Rao, C. (eds.), Design and Analysis of Experiments, volume 13 of Handbook of Statistics, pp. 261–308. Elsevier Science. \Sexpr[results=rd]{tools:::Rd_expr_doi("10.1016/S0169-7161(96)13011-X")}.

Krige, D. G. (1951). A Statistical Approach to Some Basic Mine Valuation Problems on the Witwatersrand. Journal of the Southern African Institute of Mining and Metallurgy, 52(6):199–139.

Laurent, L., Le Riche, R., Soulier, B., and Boucard, PA. (2019). An Overview of Gradient-Enhanced Metamodels with Applications. Archives of Computational Methods in Engineering, 26(1):61–106. \Sexpr[results=rd]{tools:::Rd_expr_doi("10.1007/s11831-017-9226-3")}.

Martin, J. D. and Simpson, T. W. (2005). Use of Kriging Models to Approximate Deterministic Computer Models. AIAA Journal, 43(4):853–863. \Sexpr[results=rd]{tools:::Rd_expr_doi("10.2514/1.8650")}.

Morris, M., Mitchell, T., and Ylvisaker, D. (1993). Bayesian Design and Analysis of Computer Experiments: Use of Derivatives in Surface Prediction. Technometrics, 35(3):243–255. \Sexpr[results=rd]{tools:::Rd_expr_doi("10.1080/00401706.1993.10485320")}.

Oakley, J. and O'Hagan, A. (2002). Bayesian Inference for the Uncertainty Distribution of Computer Model Outputs. Biometrika, 89(4):769–784. \Sexpr[results=rd]{tools:::Rd_expr_doi("10.1093/biomet/89.4.769")}.

O'Hagan, A., Kennedy, M. C., and Oakley, J. E. (1999). Uncertainty Analysis and Other Inference Tools for Complex Computer Codes. In Bayesian Statistics 6, Ed. J. M. Bernardo, J. O. Berger, A. P. Dawid and A .F. M. Smith, 503–524. Oxford University Press.

O'Hagan, A. (2006). Bayesian Analysis of Computer Code Outputs: A Tutorial. Reliability Engineering & System Safet, 91(10):1290–1300. \Sexpr[results=rd]{tools:::Rd_expr_doi("10.1016/j.ress.2005.11.025")}.

Park, J.-S. and Beak, J. (2001). Efficient Computation of Maximum Likelihood Estimators in a Spatial Linear Model with Power Exponential Covariogram. Computers & Geosciences, 27(1):1–7. \Sexpr[results=rd]{tools:::Rd_expr_doi("10.1016/S0098-3004(00)00016-9")}.

Ranjan, P., Haynes, R. and Karsten, R. (2011). A Computationally Stable Approach to Gaussian Process Interpolation of Deterministic Computer Simulation Data. Technometrics, 53:366–378. \Sexpr[results=rd]{tools:::Rd_expr_doi("10.1198/TECH.2011.09141")}.

Rasmussen, C. E. and Williams, C. K. I. (2006). Gaussian Processes for Machine Learning. The MIT Press. https://gaussianprocess.org/gpml/.

Ripley, B. D. (1981). Spatial Statistics. John Wiley & Sons. \Sexpr[results=rd]{tools:::Rd_expr_doi("10.1002/0471725218")}.

Sacks, J., Welch, W. J., Mitchell, T. J., and Wynn, H. P. (1989). Design and Analysis of Computer Experiments. Statistical Science, 4(4):409–423. \Sexpr[results=rd]{tools:::Rd_expr_doi("10.1214/ss/1177012413")}.

Santner, T. J., Williams, B. J., and Notz, W. I. (2018). The Design and Analysis of Computer Experiments. 2nd edition. Springer-Verlag.

Stein, M. L. (1999). Interpolation of Spatial Data: Some Theory for Kriging. Springer Series in Statistics. Springer-Verlag. \Sexpr[results=rd]{tools:::Rd_expr_doi("10.1007/978-1-4612-1494-6")}.

See Also

predict.gekm for prediction at new data points based on a model of class "gekm".

simulate.gekm for simulation of process paths conditional on a model of class "gekm".

Examples

## 1-dimensional example: Oakley and O’Hagan (2002)

# Define test function and its gradient
f <- function(x) 5 + x + cos(x)
fGrad <- function(x) 1 - sin(x)

# Generate coordinates and calculate slopes
x <- seq(-5, 5, length = 5)
y <- f(x)
dy <- fGrad(x)
dat <- data.frame(x, y)
deri <- data.frame(x = dy)

# Fit Kriging model
km.1d <- gekm(y ~ x, data = dat, covtype = "gaussian", theta = 1)
km.1d

# Fit Gradient-Enhanced Kriging model
gekm.1d <- gekm(y ~ x, data = dat, deriv = deri, covtype = "gaussian", theta = 1)
gekm.1d


## 2-dimensional example: Morris et al. (1993)

# List of inputs with their distributions and their respective ranges
inputs <- list("r_w" = list(dist = "norm", mean =  0.1, sd = 0.0161812, min = 0.05, max = 0.15),
	"r" = list(dist = "lnorm", meanlog = 7.71, sdlog = 1.0056, min = 100, max = 50000),
	"T_u" = list(dist = "unif", min = 63070, max = 115600),
	"H_u" = list(dist = "unif", min = 990, max = 1110),
	"T_l" = list(dist = "unif", min = 63.1, max = 116),
	"H_l" = list(dist = "unif", min = 700, max = 820),
	"L" = list(dist = "unif", min = 1120, max = 1680),
	# for a more nonlinear, nonadditive function, see Morris et al. (1993)
	"K_w" = list(dist = "unif", min = 1500, max = 15000))

# Generate design
design <- data.frame("r_w" = c(0, 0.268, 1),
	"r" = rep(0, 3),
	"T_u" = rep(0, 3),
	"H_u" = rep(0, 3),
	"T_l" = rep(0, 3),
	"H_l" = rep(0, 3),
	"L" = rep(0, 3),
	"K_w" = c(0, 1, 0.268))

# Function to transform design onto input range
transform <- function(x, data){
	for(p in names(data)){
		data[ , p] <- (x[[p]]$max - x[[p]]$min) * data[ , p] + x[[p]]$min
	}
	data
}

# Function to transform derivatives
deriv.transform <- function(x, data){
	for(p in colnames(data)){
		data[ , p] <- data[ , p] * (x[[p]]$max - x[[p]]$min)  
	}
	data
}

# Generate outcome and derivatives
design.trans <- transform(inputs, design)
design$y <- borehole(design.trans)
deri.trans <- boreholeGrad(design.trans)
deri <- data.frame(deriv.transform(inputs, deri.trans))

# Design and data
cbind(design[ , c("r_w", "K_w", "y")], deri[ , c("r_w", "K_w")])

# Fit gradient-enhanced Kriging model with Gaussian correlation function
mod <- gekm(y ~ 1, data = design[ , c("r_w", "K_w", "y")], 
	deriv = deri[ , c("r_w", "K_w")], covtype = "gaussian")
mod

## Compare results with Morris et al. (1993):

# Estimated hyperparameters
# in Morris et al. (1993): 0.429 and 0.467
1 / (2 * mod$theta^2)
# Estimated intercept
# in Morris et al. (1993): 69.15
coef(mod)
# Estimated standard deviation
# in Morris et al. (1993): 135.47
mod$sigma
# Posterior mean and standard deviation at (0.5, 0.5)
# in Morris et al. (1993): 69.4 and 2.7
predict(mod, data.frame("r_w" = 0.5, "K_w" = 0.5))
# Posterior mean and standard deviation at (1, 1)
# in Morris et al. (1993): 230.0 and 19.2
predict(mod, data.frame("r_w" = 1, "K_w" = 1))

## Graphical comparison: 

# Generate a 21 x 21 grid for prediction
n_grid <- 21
x <- seq(0, 1, length.out = n_grid)
grid <- expand.grid("r_w" = x, "K_w" = x)
pred <- predict(mod, grid)

# Compute ground truth on (transformed) grid
newdata <- data.frame("r_w" = grid[ , "r_w"], 
	"r" = 0, "T_u" = 0, "H_u" = 0, 
	"T_l" = 0, "H_l" = 0, "L" = 0,
	"K_w" = grid[ , "K_w"])
newdata <- transform(inputs, newdata)
truth <- borehole(newdata)

# Contour plots of predicted and actual output
par(mfrow = c(1, 2), oma = c(3.5, 3.5, 0, 0.2), mar = c(0, 0, 1.5, 0))
contour(x, x, matrix(pred$fit, nrow = n_grid, ncol = n_grid, byrow = TRUE),
	levels = c(seq(10, 50, 10), seq(100, 250, 50)),
	main = "Predicted output")
points(design[ , c("K_w", "r_w")], pch = 16)
contour(x, x, matrix(truth, nrow = n_grid, ncol = n_grid, byrow = TRUE), 
	levels = c(seq(10, 50, 10), seq(100, 250, 50)),
	yaxt = "n", main = "Ground truth")
points(design[ , c("K_w", "r_w")], pch = 16)
mtext(side = 1, outer = TRUE, line = 2.5, "Normalized hydraulic conductivity of borehole")
mtext(side = 2, outer = TRUE, line = 2.5, "Normalized radius of borehole")

gek documentation built on April 4, 2025, 12:35 a.m.

Related to gekm in gek...