| mle-methods | R Documentation |
Maximum Likelihood estimation of Gaussian Process models which covariance structure is described in a covariance kernel object.
## S4 method for signature 'covAll'
mle(object,
y, X, F = NULL, beta = NULL,
parCovIni = coef(object),
parCovLower = coefLower(object),
parCovUpper = coefUpper(object),
noise = TRUE, varNoiseIni = var(y) / 10,
varNoiseLower = 0, varNoiseUpper = Inf,
compGrad = hasGrad(object),
doOptim = TRUE,
optimFun = c("nloptr::nloptr", "stats::optim"),
optimMethod = ifelse(compGrad, "NLOPT_LD_LBFGS", "NLOPT_LN_COBYLA"),
optimCode = NULL,
multistart = 1,
parTrack = FALSE, trace = 0, checkNames = TRUE,
...)
object |
An object representing a covariance kernel. |
y |
Response vector. |
X |
Spatial (or input) design matrix. |
F |
Trend matrix. |
beta |
Vector of trend coefficients if known/fixed. |
parCovIni |
Vector with named elements or matrix giving the initial values for the
parameters. See Examples. When this argument is omitted, the
vector of covariance parameters given in |
parCovLower |
Lower bounds for the parameters. When this argument is omitted, the
vector of parameters lower bounds in the covariance given in
|
parCovUpper |
Upper bounds for the parameters. When this argument is omitted, the
vector of parameters lower bounds in the covariance given in
|
noise |
Logical. Should a noise be added to the error term? |
varNoiseIni |
Initial value for the noise variance. |
varNoiseLower |
Lower bound for the noise variance. Should be |
varNoiseUpper |
Upper bound for the noise variance. Should be |
compGrad |
Logical: compute and use the analytical gradient in optimization?
This is only possible when |
doOptim |
Logical. If |
optimFun |
Function used for optimization. The two pre-defined choices are
|
optimMethod |
Name of the optimization method or algorithm. This is passed as the
|
optimCode |
An object with class |
multistart |
Number of optimizations to perform from different starting points
(see |
parTrack |
If |
trace |
Integer level of verbosity. |
checkNames |
if |
... |
Further arguments to be passed to the optimization function,
|
The choice of optimization method is as follows.
When optimFun is nloptr:nloptr, it is assumed
that we are minimizing the negative log-likelihood - \log
L. Note that both predefined methods
"NLOPT_LD_LBFGS" and "NLOPT_LN_COBYLA" can cope with a
non-finite value of the objective, except for the initial value of
the parameter. Non-finite values of - \log L are
often met in practice during optimization steps. The method
"NLOPT_LD_LBFGS" used when compGrad is TRUE
requires that the gradient is provided by/with the covariance
object. You can try other values of optimMethod corresponding
to the possible choice of the "algorithm" element in the
opts argument of nloptr:nloptr. It may be useful to
give other options in order to control the optimization and its
stopping rule.
When optimFun is "stats:optim", it is assumed
that we are maximizing the log-likelihood \log L. We
suggest to use one of the methods "L-BFGS-B" or
"BFGS". Notice that control can be provided in
..., but control$fnscale is forced to be - 1,
corresponding to maximization. Note that "L-BFGS-B" uses box
constraints, but the optimization stops as soon as the
log-likelihood is non-finite or NA. The method "BFGS"
does not use constraints but allows the log-likelihood to be
non-finite or NA. Both methods can be used without gradient
or with gradient if object allows this.
The vectors parCovIni, parCovLower, parCovUpper
must have elements corresponding to those of the vector of kernel
parameters given by coef(object). These vectors should have
suitably named elements.
A list with elements hopefully having understandable names.
opt |
List of optimization results if it was successful, or an error object otherwise. |
coef.kernel |
The vector of 'kernel' coefficients. This will include one or several variance parameters. |
coef.trend |
Estimate of the vector |
parTracked |
A matrix with rows giving the successive iterates during
optimization if the |
The checks concerning the parameter names, dimensions of provided objects, ... are not fully implemented yet.
Using the optimCode possibility requires a bit of programming
effort, although a typical code only contains a few lines.
Y. Deville, O. Roustant
gp for various examples, optimMethods
to see the possible values of the argument optimMethod.
set.seed(29770)
##=======================================================================
## Example. A 4-dimensional "covMan" kernel
##=======================================================================
d <- 4
myCovMan <-
covMan(
kernel = function(x1, x2, par) {
htilde <- (x1 - x2) / par[1]
SS2 <- sum(htilde^2)
d2 <- exp(-SS2)
kern <- par[2] * d2
d1 <- 2 * kern * SS2 / par[1]
attr(kern, "gradient") <- c(theta = d1, sigma2 = d2)
return(kern)
},
label = "myGauss",
hasGrad = TRUE,
d = 4,
parLower = c(theta = 0.0, sigma2 = 0.0),
parUpper = c(theta = +Inf, sigma2 = +Inf),
parNames = c("theta", "sigma2"),
par = c(NA, NA)
)
kernCode <- "
SEXP kern, dkern;
int nprotect = 0, d;
double SS2 = 0.0, d2, z, *rkern, *rdkern;
d = LENGTH(x1);
PROTECT(kern = allocVector(REALSXP, 1)); nprotect++;
PROTECT(dkern = allocVector(REALSXP, 2)); nprotect++;
rkern = REAL(kern);
rdkern = REAL(dkern);
for (int i = 0; i < d; i++) {
z = ( REAL(x1)[i] - REAL(x2)[i] ) / REAL(par)[0];
SS2 += z * z;
}
d2 = exp(-SS2);
rkern[0] = REAL(par)[1] * d2;
rdkern[1] = d2;
rdkern[0] = 2 * rkern[0] * SS2 / REAL(par)[0];
SET_ATTR(kern, install(\"gradient\"), dkern);
UNPROTECT(nprotect);
return kern;
"
## inline the C function into an R function: MUCH MORE EFFICIENT!!!
## Not run:
require(inline)
kernC <- cfunction(sig = signature(x1 = "numeric", x2 = "numeric",
par = "numeric"),
body = kernCode)
myCovMan <- covMan(kernel = kernC, hasGrad = TRUE, label = "myGauss", d = 4,
parNames = c("theta", "sigma2"),
parLower = c("theta" = 0.0, "sigma2" = 0.0),
parUpper = c("theta" = Inf, "sigma2" = Inf))
## End(Not run)
##=======================================================================
## Example (continued). Simulate data for covMan and trend
##=======================================================================
n <- 100;
X <- matrix(runif(n * d), nrow = n)
colnames(X) <- inputNames(myCovMan)
coef(myCovMan) <- myPar <- c(theta = 0.5, sigma2 = 2)
C <- covMat(object = myCovMan, X = X,
compGrad = FALSE, index = 1L)
library(MASS)
set.seed(456)
y <- mvrnorm(mu = rep(0, n), Sigma = C)
p <- rpois(1, lambda = 4)
if (p > 0) {
F <- matrix(runif(n * p), nrow = n, ncol = p)
beta <- rnorm(p)
y <- F %*% beta + y
} else F <- NULL
par <- parCovIni <- c("theta" = 0.6, "sigma2" = 4)
##=======================================================================
## Example (continued). ML estimation. Note the 'partrack' argument
##=======================================================================
est <- mle(object = myCovMan,
parCovIni = parCovIni,
y = y, X = X, F = F,
parCovLower = c(0.05, 0.05), parCovUpper = c(10, 100),
parTrack = TRUE, noise = FALSE, checkNames = FALSE)
est$opt$value
## change the (constrained) optimization method
## Not run:
est1 <- mle(object = myCovMan,
parCovIni = parCovIni,
optimFun = "stats::optim",
optimMethod = "L-BFGS-B",
y = y, X = X, F = F,
parCovLower = c(0.05, 0.05), parCovUpper = c(10, 100),
parTrack = TRUE, noise = FALSE, checkNames = FALSE)
est1$opt$value
## End(Not run)
##=======================================================================
## Example (continued). Grid for graphical analysis
##=======================================================================
## Not run:
theta.grid <- seq(from = 0.1, to = 0.7, by = 0.2)
sigma2.grid <- seq(from = 0.3, to = 6, by = 0.4)
par.grid <- expand.grid(theta = theta.grid, sigma2 = sigma2.grid)
ll <- apply(as.matrix(par.grid), 1, est$logLikFun)
llmat <- matrix(ll, nrow = length(theta.grid),
ncol = length(sigma2.grid))
## End(Not run)
##=======================================================================
## Example (continued). Explore the surface ?
##=======================================================================
## Not run:
require(rgl)
persp3d(x = theta.grid, y = sigma2.grid, z = ll,
xlab = "theta", ylab = "sigma2", zlab = "logLik",
col = "SpringGreen3", alpha = 0.6)
## End(Not run)
##=======================================================================
## Example (continued). Draw a contour plot for the log-lik
## and show iterates
##=======================================================================
## Not run:
contour(x = theta.grid, y = sigma2.grid, z = llmat,
col = "SpringGreen3", xlab = "theta", ylab = "sigma2",
main = "log-likelihood contours and iterates",
xlim = range(theta.grid, est$parTracked[ , 1], na.rm = TRUE),
ylim = range(sigma2.grid, est$parTracked[ , 2], na.rm = TRUE))
abline(v = est$coef.kernel[1], h = est$coef.kernel[2], lty = "dotted")
niter <- nrow(est$parTracked)
points(est$parTracked[1:niter-1, ],
col = "orangered", bg = "yellow", pch = 21, lwd = 2, type = "o")
points(est$parTracked[niter, , drop = FALSE],
col = "blue", bg = "blue", pch = 21, lwd = 2, type = "o", cex = 1.5)
ann <- seq(from = 1, to = niter, by = 5)
text(x = est$parTracked[ann, 1], y = est$parTracked[ann, 2],
labels = ann - 1L, pos = 4, cex = 0.8, col = "orangered")
points(x = myPar["theta"], y = myPar["sigma2"],
bg = "Chartreuse3", col = "ForestGreen",
pch = 22, lwd = 2, cex = 1.4)
legend("topright", legend = c("optim", "optim (last)", "true"),
pch = c(21, 21, 22), lwd = c(2, 2, 2), lty = c(1, 1, NA),
col = c("orangered", "blue", "ForestGreen"),
pt.bg = c("yellow", "blue", "Chartreuse3"))
## End(Not run)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.