MLE-methods: Maximum Likelihood Estimation of Gaussian Process Model...
In kergp: Gaussian Process Laboratory

mle-methods

R Documentation

Maximum Likelihood Estimation of Gaussian Process Model Parameters

Description

Maximum Likelihood estimation of Gaussian Process models which covariance structure is described in a covariance kernel object.

Usage


## S4 method for signature 'covAll'
mle(object, 
    y, X, F = NULL, beta = NULL,
    parCovIni = coef(object),
    parCovLower = coefLower(object), 
    parCovUpper = coefUpper(object),
    noise = TRUE, varNoiseIni = var(y) / 10,
    varNoiseLower = 0, varNoiseUpper = Inf,
    compGrad = hasGrad(object),
    doOptim = TRUE,
    optimFun = c("nloptr::nloptr", "stats::optim"),
    optimMethod = ifelse(compGrad, "NLOPT_LD_LBFGS", "NLOPT_LN_COBYLA"),
    optimCode = NULL,
    multistart = 1,
    parTrack = FALSE, trace  = 0, checkNames = TRUE,
    ...)

Arguments

`object`	An object representing a covariance kernel.
`y`	Response vector.
`X`	Spatial (or input) design matrix.
`F`	Trend matrix.
`beta`	Vector of trend coefficients if known/fixed.
`parCovIni`	Vector with named elements or matrix giving the initial values for the parameters. See Examples. When this argument is omitted, the vector of covariance parameters given in `object` is used if `multistart == 1`; If `multistart > 1`, a matrix of parameters is simulated by using `simulPar`. Remind that you can use the `coef` and `coef<-` methods to get and set this slot of the covariance object.
`parCovLower`	Lower bounds for the parameters. When this argument is omitted, the vector of parameters lower bounds in the covariance given in `object` is used. You can use `coefLower` and `coefLower<-` methods to get and set this slot of the covariance object.
`parCovUpper`	Upper bounds for the parameters. When this argument is omitted, the vector of parameters lower bounds in the covariance given in `object` is used. You can use `coefUpper` and `coefUpper<-` methods to get and set this slot of the covariance object.
`noise`	Logical. Should a noise be added to the error term?
`varNoiseIni`	Initial value for the noise variance.
`varNoiseLower`	Lower bound for the noise variance. Should be `<= varNoiseIni`.
`varNoiseUpper`	Upper bound for the noise variance. Should be `>= varNoiseIni`.
`compGrad`	Logical: compute and use the analytical gradient in optimization? This is only possible when `object` provides the analytical gradient.
`doOptim`	Logical. If `FALSE` no optimization is done.
`optimFun`	Function used for optimization. The two pre-defined choices are `nloptr::nloptr` (default) and `stats::optim`, both in combination with a few specific optimization methods. Ignored if `optimCode` is provided.
`optimMethod`	Name of the optimization method or algorithm. This is passed as the `"algorithm"` element of the `opts` argument when `nloptr::nloptr` is used (default), or to the `method` argument when `stats::optim` is used. When another value of `optimFun` is given, the value of `optimMethod` is ignored. Ignored if `optimCode` is provided. Use `optimMethods` to obtain the list of usable values.
`optimCode`	An object with class `"expression"` or `"character"` representing a user-written R code to be parsed and performing the log-likelihood maximization. Notice that this argument will bypass `optimFun` and `optimMethod`. The expression must define an object named `"opt"`, which is either a list containing optimization results, either an object inheriting from `"try-error"` to cope with the case where a problem occurred during the optimization.
`multistart`	Number of optimizations to perform from different starting points (see `parCovIni`). Parallel backend is encouraged.
`parTrack`	If `TRUE`, the parameter vectors used during the optimization are tracked and returned as a matrix.
`trace`	Integer level of verbosity.
`checkNames`	if `TRUE` (default), check the compatibility of `X` with `object`, see `checkX`.
`...`	Further arguments to be passed to the optimization function, `nloptr` or `optim`.

Details

The choice of optimization method is as follows.

When optimFun is nloptr:nloptr, it is assumed that we are minimizing the negative log-likelihood - \log L. Note that both predefined methods "NLOPT_LD_LBFGS" and "NLOPT_LN_COBYLA" can cope with a non-finite value of the objective, except for the initial value of the parameter. Non-finite values of - \log L are often met in practice during optimization steps. The method "NLOPT_LD_LBFGS" used when compGrad is TRUE requires that the gradient is provided by/with the covariance object. You can try other values of optimMethod corresponding to the possible choice of the "algorithm" element in the opts argument of nloptr:nloptr. It may be useful to give other options in order to control the optimization and its stopping rule.
When optimFun is "stats:optim", it is assumed that we are maximizing the log-likelihood \log L. We suggest to use one of the methods "L-BFGS-B" or "BFGS". Notice that control can be provided in ..., but control$fnscale is forced to be - 1, corresponding to maximization. Note that "L-BFGS-B" uses box constraints, but the optimization stops as soon as the log-likelihood is non-finite or NA. The method "BFGS" does not use constraints but allows the log-likelihood to be non-finite or NA. Both methods can be used without gradient or with gradient if object allows this.

The vectors parCovIni, parCovLower, parCovUpper must have elements corresponding to those of the vector of kernel parameters given by coef(object). These vectors should have suitably named elements.

Value

A list with elements hopefully having understandable names.

`opt`	List of optimization results if it was successful, or an error object otherwise.
`coef.kernel`	The vector of 'kernel' coefficients. This will include one or several variance parameters.
`coef.trend`	Estimate of the vector `\boldsymbol{\beta}` of the trend coefficients.
`parTracked`	A matrix with rows giving the successive iterates during optimization if the `parTrack` argument was set to `TRUE`.

Note

The checks concerning the parameter names, dimensions of provided objects, ... are not fully implemented yet.

Using the optimCode possibility requires a bit of programming effort, although a typical code only contains a few lines.

Author(s)

Y. Deville, O. Roustant

Examples


set.seed(29770)

##=======================================================================
## Example. A 4-dimensional "covMan" kernel
##=======================================================================
d <- 4
myCovMan <- 
      covMan(
         kernel = function(x1, x2, par) { 
         htilde <- (x1 - x2) / par[1]
         SS2 <- sum(htilde^2)
         d2 <- exp(-SS2)
         kern <- par[2] * d2
         d1 <- 2 * kern * SS2 / par[1]            
         attr(kern, "gradient") <- c(theta = d1,  sigma2 = d2)
         return(kern)
      },
      label = "myGauss",
      hasGrad = TRUE,
      d = 4,    
      parLower = c(theta = 0.0, sigma2 = 0.0),
      parUpper = c(theta = +Inf, sigma2 = +Inf),
      parNames = c("theta", "sigma2"),
      par = c(NA, NA)
      )
kernCode <- "
       SEXP kern, dkern;
       int nprotect = 0, d;
       double SS2 = 0.0, d2, z, *rkern, *rdkern;

       d = LENGTH(x1);
       PROTECT(kern = allocVector(REALSXP, 1)); nprotect++;
       PROTECT(dkern = allocVector(REALSXP, 2)); nprotect++;
       rkern = REAL(kern);
       rdkern = REAL(dkern);

       for (int i = 0; i < d; i++) {
         z = ( REAL(x1)[i] - REAL(x2)[i] ) / REAL(par)[0];
         SS2 += z * z; 
       }

       d2 = exp(-SS2);
       rkern[0] = REAL(par)[1] * d2;
       rdkern[1] =  d2; 
       rdkern[0] =  2 * rkern[0] * SS2 / REAL(par)[0];

       SET_ATTR(kern, install(\"gradient\"), dkern);
       UNPROTECT(nprotect);
       return kern;
     "

## inline the C function into an R function: MUCH MORE EFFICIENT!!!
## Not run: 
require(inline)
kernC <- cfunction(sig = signature(x1 = "numeric", x2 = "numeric",
                                   par = "numeric"),
                    body = kernCode)
myCovMan <- covMan(kernel = kernC, hasGrad = TRUE, label = "myGauss", d = 4,
                   parNames = c("theta", "sigma2"),
                   parLower = c("theta" = 0.0, "sigma2" = 0.0),
                   parUpper = c("theta" = Inf, "sigma2" = Inf))

## End(Not run)

##=======================================================================
## Example (continued). Simulate data for covMan and trend
##=======================================================================
n <- 100; 
X <- matrix(runif(n * d), nrow = n)
colnames(X) <- inputNames(myCovMan)

coef(myCovMan) <- myPar <- c(theta = 0.5, sigma2 = 2)
C <- covMat(object = myCovMan, X = X,
            compGrad = FALSE,  index = 1L)

library(MASS)
set.seed(456)
y <- mvrnorm(mu = rep(0, n), Sigma = C)
p <- rpois(1, lambda = 4)
if (p > 0) {
  F <- matrix(runif(n * p), nrow = n, ncol = p)
  beta <- rnorm(p)
  y <- F %*% beta + y
} else F <- NULL
par <- parCovIni <- c("theta" = 0.6, "sigma2" = 4)

##=======================================================================
## Example (continued). ML estimation. Note the 'partrack' argument
##=======================================================================           
est <- mle(object = myCovMan,
           parCovIni = parCovIni,
           y = y, X = X, F = F,
           parCovLower = c(0.05, 0.05), parCovUpper = c(10, 100),
           parTrack = TRUE, noise = FALSE, checkNames = FALSE)
est$opt$value

## change the (constrained) optimization  method
## Not run: 
est1 <- mle(object = myCovMan,
            parCovIni = parCovIni,
            optimFun = "stats::optim",
            optimMethod = "L-BFGS-B",
            y = y, X = X, F = F,
            parCovLower = c(0.05, 0.05), parCovUpper = c(10, 100),
            parTrack = TRUE, noise = FALSE, checkNames = FALSE)
est1$opt$value

## End(Not run)

##=======================================================================
## Example (continued). Grid for graphical analysis
##=======================================================================
## Not run: 
    theta.grid <- seq(from = 0.1, to = 0.7, by = 0.2)
    sigma2.grid <- seq(from = 0.3, to = 6, by = 0.4)
    par.grid <- expand.grid(theta = theta.grid, sigma2 = sigma2.grid)
    ll <- apply(as.matrix(par.grid), 1, est$logLikFun)
    llmat <- matrix(ll, nrow = length(theta.grid),
                    ncol = length(sigma2.grid))

## End(Not run)                

##=======================================================================
## Example (continued). Explore the surface ?
##=======================================================================
## Not run: 
   require(rgl)
   persp3d(x = theta.grid, y = sigma2.grid, z = ll,
           xlab = "theta", ylab = "sigma2", zlab = "logLik",
           col = "SpringGreen3", alpha = 0.6)

## End(Not run)

##=======================================================================
## Example (continued). Draw a contour plot for the log-lik 
##                        and show iterates
##=======================================================================
## Not run: 
    contour(x = theta.grid, y = sigma2.grid, z = llmat,
            col = "SpringGreen3", xlab = "theta", ylab = "sigma2",
            main = "log-likelihood contours and iterates",
            xlim = range(theta.grid, est$parTracked[ , 1], na.rm = TRUE),
            ylim = range(sigma2.grid, est$parTracked[ , 2], na.rm = TRUE))
    abline(v = est$coef.kernel[1], h = est$coef.kernel[2], lty = "dotted")
    niter <- nrow(est$parTracked)
    points(est$parTracked[1:niter-1, ],
           col = "orangered", bg = "yellow", pch = 21, lwd = 2, type = "o")
    points(est$parTracked[niter, , drop = FALSE],
           col = "blue", bg = "blue", pch = 21, lwd = 2, type = "o", cex = 1.5)
    ann <- seq(from = 1, to = niter, by = 5)
    text(x = est$parTracked[ann, 1], y = est$parTracked[ann, 2],
         labels = ann - 1L, pos = 4, cex = 0.8, col = "orangered")
    points(x = myPar["theta"], y = myPar["sigma2"],
           bg = "Chartreuse3", col = "ForestGreen",
           pch = 22, lwd = 2, cex = 1.4)

    legend("topright", legend = c("optim", "optim (last)", "true"),
           pch = c(21, 21, 22), lwd = c(2, 2, 2), lty = c(1, 1, NA),
           col = c("orangered", "blue", "ForestGreen"),
           pt.bg = c("yellow", "blue", "Chartreuse3"))
 

## End(Not run)

kergp documentation built on April 4, 2025, 2:29 a.m.