predict.km: Predict values and confidence intervals at newdata for a km...

Description Usage Arguments Value Warning Author(s) References See Also Examples

View source: R/kmStruct.R

Description

Predicted values and (marginal of joint) conditional variances based on a km model. 95 % confidence intervals are given, based on strong assumptions: Gaussian process assumption, specific prior distribution on the trend parameters, known covariance parameters. This might be abusive in particular in the case where estimated covariance parameters are plugged in.

Usage

1
2
3
4
## S4 method for signature 'km'
predict(object, newdata, type, se.compute = TRUE, 
         cov.compute = FALSE, light.return = FALSE,
         bias.correct = FALSE, checkNames = TRUE, ...)

Arguments

object

an object of class km.

newdata

a vector, matrix or data frame containing the points where to perform predictions.

type

a character string corresponding to the kriging family, to be chosen between simple kriging ("SK"), or universal kriging ("UK").

se.compute

an optional boolean. If FALSE, only the kriging mean is computed. If TRUE, the kriging variance (actually, the corresponding standard deviation) and confidence intervals are computed too.

cov.compute

an optional boolean. If TRUE, the conditional covariance matrix is computed.

light.return

an optional boolean. If TRUE, c and Tinv.c are not returned. This should be reserved to expert users who want to save memory and know that they will not miss these values.

bias.correct

an optional boolean to correct bias in the UK variance and covariances. Default is FALSE. See Section Warning below.

checkNames

an optional boolean. If TRUE (default), a consistency test is performed between the names of newdata and the names of the experimental design (contained in object@X), see Section Warning below.

...

no other argument for this method.

Value

mean

kriging mean (including the trend) computed at newdata.

sd

kriging standard deviation computed at newdata. Not computed if se.compute=FALSE.

trend

the trend computed at newdata.

cov

kriging conditional covariance matrix. Not computed if cov.compute=FALSE (default).

lower95,
upper95

bounds of the 95 % confidence interval computed at newdata (to be interpreted with special care when parameters are estimated, see description above). Not computed if se.compute=FALSE.

c

an auxiliary matrix, containing all the covariances between newdata and the initial design points. Not returned if light.return=TRUE.

Tinv.c

an auxiliary vector, equal to inv(t(T))*c. Not returned if light.return=TRUE.

Warning

1. Contrarily to DiceKriging<=1.3.2, the estimated (UK) variance and covariances are NOT multiplied by n/(n-p) by default (n and p denoting the number of rows and columns of the design matrix F). Recall that this correction would contribute to limit bias: it would totally remove it if the correlation parameters were known (which is not the case here). However, this correction is often useless in the context of computer experiments, especially in adaptive strategies. It can be activated by turning bias.correct to TRUE, when type="UK".

2. The columns of newdata should correspond to the input variables, and only the input variables (nor the response is not admitted, neither external variables). If newdata contains variable names, and if checkNames is TRUE (default), then checkNames performs a complete consistency test with the names of the experimental design. Otherwise, it is assumed that its columns correspond to the same variables than the experimental design and in the same order.

Author(s)

O. Roustant, D. Ginsbourger, Ecole des Mines de St-Etienne.

References

N.A.C. Cressie (1993), Statistics for spatial data, Wiley series in probability and mathematical statistics.

A.G. Journel and C.J. Huijbregts (1978), Mining Geostatistics, Academic Press, London.

D.G. Krige (1951), A statistical approach to some basic mine valuation problems on the witwatersrand, J. of the Chem., Metal. and Mining Soc. of South Africa, 52 no. 6, 119-139.

J.D. Martin and T.W. Simpson (2005), Use of kriging models to approximate deterministic computer models, AIAA Journal, 43 no. 4, 853-863.

G. Matheron (1963), Principles of geostatistics, Economic Geology, 58, 1246-1266.

G. Matheron (1969), Le krigeage universel, Les Cahiers du Centre de Morphologie Mathematique de Fontainebleau, 1.

J.-S. Park and J. Baek (2001), Efficient computation of maximum likelihood estimators in a spatial linear model with power exponential covariogram, Computer Geosciences, 27 no. 1, 1-7.

C.E. Rasmussen and C.K.I. Williams (2006), Gaussian Processes for Machine Learning, the MIT Press, http://www.gaussianprocess.org/gpml/

J. Sacks, W.J. Welch, T.J. Mitchell, and H.P. Wynn (1989), Design and analysis of computer experiments, Statistical Science, 4, 409-435.

See Also

km, plot,km-method

Examples

  1
  2
  3
  4
  5
  6
  7
  8
  9
 10
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20
 21
 22
 23
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33
 34
 35
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
# ------------
# a 1D example
# ------------

x <- c(0, 0.4, 0.6, 0.8, 1)
y <- c(-0.3, 0, 0, 0.5, 0.9)

formula <- y~x   # try also   y~1  and  y~x+I(x^2)

model <- km(formula=formula, design=data.frame(x=x), response=data.frame(y=y), 
            covtype="matern5_2")

tmin <- -0.5; tmax <- 2.5
t <- seq(from=tmin, to=tmax, by=0.005)
color <- list(SK="black", UK="blue")

# Results with Universal Kriging formulae (mean and and 95% intervals)
p.UK <- predict(model, newdata=data.frame(x=t), type="UK")
plot(t, p.UK$mean, type="l", ylim=c(min(p.UK$lower95),max(p.UK$upper95)), 
                xlab="x", ylab="y")
lines(t, p.UK$trend, col="violet", lty=2)
lines(t, p.UK$lower95, col=color$UK, lty=2)
lines(t, p.UK$upper95, col=color$UK, lty=2)
points(x, y, col="red", pch=19)
abline(h=0)

# Results with Simple Kriging (SK) formula. The difference between the width of
# SK and UK intervals are due to the estimation error of the trend parameters 
# (but not to the range parameters, not taken into account in the UK formulae).
p.SK <- predict(model, newdata=data.frame(x=t), type="SK")
lines(t, p.SK$mean, type="l", ylim=c(-7,7), xlab="x", ylab="y")
lines(t, p.SK$lower95, col=color$SK, lty=2)
lines(t, p.SK$upper95, col=color$SK, lty=2)
points(x, y, col="red", pch=19)
abline(h=0)

legend.text <- c("Universal Kriging (UK)", "Simple Kriging (SK)")
legend(x=tmin, y=max(p.UK$upper), legend=legend.text, 
       text.col=c(color$UK, color$SK), col=c(color$UK, color$SK), 
       lty=3, bg="white")


# ---------------------------------------------------------------------------------
# a 1D example (following)- COMPARISON with the PREDICTION INTERVALS for REGRESSION
# ---------------------------------------------------------------------------------
# There are two interesting cases: 
# *  When the range parameter is near 0 ; Then the intervals should be nearly 
#    the same for universal kriging (when bias.correct=TRUE, see above) as for regression. 
#    This is because the uncertainty around the range parameter is not taken into account 
#    in the Universal Kriging formula.
# *  Where the predicted sites are "far" (relatively to the spatial correlation) 
#    from the design points ; in this case, the kriging intervals are not equal 
#    but nearly proportional to the regression ones, since the variance estimate 
#    for regression is not the same than for kriging (that depends on the 
#    range estimate)

x <- c(0, 0.4, 0.6, 0.8, 1)
y <- c(-0.3, 0, 0, 0.5, 0.9)

formula <- y~x   # try also   y~1  and  y~x+I(x^2)
upper <- 0.05    # this is to get something near to the regression case. 
                 # Try also upper=1 (or larger) to get usual results.

model <- km(formula=formula, design=data.frame(x=x), response=data.frame(y=y), 
               covtype="matern5_2", upper=upper)

tmin <- -0.5; tmax <- 2.5
t <- seq(from=tmin, to=tmax, by=0.005)
color <- list(SK="black", UK="blue", REG="red")

# Results with Universal Kriging formulae (mean and and 95% intervals)
p.UK <- predict(model, newdata=data.frame(x=t), type="UK", bias.correct=TRUE)
plot(t, p.UK$mean, type="l", ylim=c(min(p.UK$lower95),max(p.UK$upper95)), 
                   xlab="x", ylab="y")
lines(t, p.UK$trend, col="violet", lty=2)
lines(t, p.UK$lower95, col=color$UK, lty=2)
lines(t, p.UK$upper95, col=color$UK, lty=2)
points(x, y, col="red", pch=19)
abline(h=0)

# Results with Simple Kriging (SK) formula. The difference between the width of
# SK and UK intervals are due to the estimation error of the trend parameters 
# (but not to the range parameters, not taken into account in the UK formulae).
p.SK <- predict(model, newdata=data.frame(x=t), type="SK")
lines(t, p.SK$mean, type="l", ylim=c(-7,7), xlab="x", ylab="y")
lines(t, p.SK$lower95, col=color$SK, lty=2)
lines(t, p.SK$upper95, col=color$SK, lty=2)
points(x, y, col="red", pch=19)
abline(h=0)

# results with regression given by lm (package stats)
m.REG <- lm(formula)
p.REG <- predict(m.REG, newdata=data.frame(x=t), interval="prediction")
lines(t, p.REG[,1], col=color$REG)
lines(t, p.REG[,2], col=color$REG, lty=2)
lines(t, p.REG[,3], col=color$REG, lty=2)

legend.text <- c("UK with bias.correct=TRUE", "SK", "Regression")
legend(x=tmin, y=max(p.UK$upper), legend=legend.text, 
       text.col=c(color$UK, color$SK, color$REG), 
       col=c(color$UK, color$SK, color$REG), lty=3, bg="white")


# ----------------------------------
# A 2D example - Branin-Hoo function
# ----------------------------------

# a 16-points factorial design, and the corresponding response
d <- 2; n <- 16
fact.design <- expand.grid(x1=seq(0,1,length=4), x2=seq(0,1,length=4))
branin.resp <- apply(fact.design, 1, branin)

# kriging model 1 : gaussian covariance structure, no trend, 
#                   no nugget effect
m1 <- km(~1, design=fact.design, response=branin.resp, covtype="gauss")

# predicting at testdata points
testdata <- expand.grid(x1=s <- seq(0,1, length=15), x2=s)
predicted.values.model1 <- predict(m1, testdata, "UK")

Example output

optimisation start
------------------
* estimation method   : MLE 
* optimisation method : BFGS 
* analytical gradient : used
* trend model : ~x
* covariance model : 
  - type :  matern5_2 
  - nugget : NO
  - parameters lower bounds :  1e-10 
  - parameters upper bounds :  2 
  - best initial criterion value(s) :  2.335706 

N = 1, M = 5 machine precision = 2.22045e-16
At X0, 0 variables are exactly at the bounds
At iterate     0  f=      -2.3357  |proj g|=       1.2436
At iterate     1  f =      -2.3508  |proj g|=       0.14773
At iterate     2  f =      -2.3561  |proj g|=       0.23464
At iterate     3  f =      -2.3566  |proj g|=      0.020084
At iterate     4  f =      -2.3566  |proj g|=    0.00064363
At iterate     5  f =      -2.3566  |proj g|=    1.5847e-06

iterations 5
function evaluations 7
segments explored during Cauchy searches 5
BFGS updates skipped 0
active bounds at final generalized Cauchy point 0
norm of the final projected gradient 1.58468e-06
final function value -2.3566

F = -2.3566
final  value -2.356598 
converged

optimisation start
------------------
* estimation method   : MLE 
* optimisation method : BFGS 
* analytical gradient : used
* trend model : ~x
* covariance model : 
  - type :  matern5_2 
  - nugget : NO
  - parameters lower bounds :  1e-10 
  - parameters upper bounds :  0.05 
  - best initial criterion value(s) :  2.27932 

N = 1, M = 5 machine precision = 2.22045e-16
At X0, 0 variables are exactly at the bounds
At iterate     0  f=      -2.2793  |proj g|=    0.0016952
At iterate     1  f =      -2.2797  |proj g|=             0

iterations 1
function evaluations 2
segments explored during Cauchy searches 1
BFGS updates skipped 0
active bounds at final generalized Cauchy point 1
norm of the final projected gradient 0
final function value -2.27974

F = -2.27974
final  value -2.279738 
converged

optimisation start
------------------
* estimation method   : MLE 
* optimisation method : BFGS 
* analytical gradient : used
* trend model : ~1
* covariance model : 
  - type :  gauss 
  - nugget : NO
  - parameters lower bounds :  1e-10 1e-10 
  - parameters upper bounds :  2 2 
  - best initial criterion value(s) :  -79.31904 

N = 2, M = 5 machine precision = 2.22045e-16
At X0, 0 variables are exactly at the bounds
At iterate     0  f=       79.319  |proj g|=      0.76831
At iterate     1  f =       78.317  |proj g|=       0.66779
At iterate     2  f =       77.475  |proj g|=        1.4253
At iterate     3  f =       76.744  |proj g|=         1.398
At iterate     4  f =       76.286  |proj g|=       0.73566
At iterate     5  f =       76.271  |proj g|=       0.20969
At iterate     6  f =        76.27  |proj g|=      0.012133
At iterate     7  f =        76.27  |proj g|=    0.00012433
At iterate     8  f =        76.27  |proj g|=    3.6924e-08

iterations 8
function evaluations 10
segments explored during Cauchy searches 9
BFGS updates skipped 0
active bounds at final generalized Cauchy point 1
norm of the final projected gradient 3.69244e-08
final function value 76.2701

F = 76.2701
final  value 76.270143 
converged

DiceKriging documentation built on Feb. 24, 2021, 1:07 a.m.