gpcm: Fit and/or create a Gaussian Process Classification (GPC)...
In GPCsign: Gaussian Process Classification as Described in Bachoc et al. (2020)

gpcm	R Documentation

Fit and/or create a Gaussian Process Classification (GPC) model

Description

gpcm is used to fit GPC models. When parameters are known, the function creates a model using given parameters. Otherwise, they are estimated by Maximum Likelihood. In both cases, the result is a gpcm object.

Usage

gpcm(f, Xf, covtype = "matern5_2", noise.var = 1e-6,
     coef.cov = NULL, coef.m = NULL, multistart = 1,
     seed = NULL, lower = NULL, upper = NULL, nsimu = 100,
     normalize = TRUE, X.mean = NULL, X.std = NULL, MeanTransform = NULL)

Arguments

`f`	a vector containing the binary observations (+/-1) corresponding to the class labels.
`Xf`	a matrix representing the design of experiments.
`covtype`	a character string specifying the covariance structure for the latent GP. Default is `matern5_2`.
`noise.var`	variance value standing for the homogeneous nugget effect. Default is 1e-6.
`coef.cov`	an optional vector containing the values for covariance parameters for the latent GP. (See below).
`coef.m`	an optional scalar corresponding to the mean value of the latent GP. If both `coef.cov` and `coef.m` are provided, no covariance parameter estimation is performed. If at least one of them is missing, both are estimated.
`multistart`	an optional integer indicating the number of initial points from which running the `BFGS` for covariance parameter optimization. The multiple optimizations will be performed in parallel provided that a parallel backend is registered (see package future)
`seed`	to fix the seed, default is `NULL`.
`lower`	(see below).
`upper`	`lower`, `upper`: bounds for the covariance parameters (scalars or vectors), if `NULL` they are set to 0.2 and 3, respectively.
`nsimu`	the number of samples of the latent process at observation points `Xf` to generate. Must be a non-null integer.
`normalize`	a logical parameter indicating whether to normalize the input matrix `Xf`. If `TRUE`, the matrix will be normalized using `X.mean` and `X.std` values if given; otherwise, the mean and standard deviation are computed and used for normalization.
`X.mean`	(see below).
`X.std`	optional vectors containing mean and standard deviation values for each column of the input matrix. If they are not provided, they are computed from the input matrix `Xf`.
`MeanTransform`	optional character string specifying a transform of the latent process mean coef.m. If `positive` (resp. negative), coef.m is constrained to be positive (resp. negative) by an exponential transform.

Details

The generation of the matrix of samples of the latent process Z_obs is done using Gibbs sampling. See rtmvnorm function in tmvtnorm package.

Value

An object of class gpcm. See gpcm-class.

Author(s)

Morgane MENZ, Céline HELBERT, Victor PICHENY, François BACHOC. Contributors: Naoual SERRAJI.

References

Bachoc, F., Helbert, C. & Picheny, V. Gaussian process optimization with failures: classification and convergence proof. J Glob Optim 78, 483–506 (2020). \Sexpr[results=rd]{tools:::Rd_expr_doi("10.1007/s10898-020-00920-0")}.

Kotecha, J. H., Djuric, P. M. (1999). Gibbs Sampling Approach For Generation of Truncated Multivariate Gaussian Random Variables. IEEE Computer Society, 1757–1760.

Wilhelm, S. tmvtnorm: Truncated Multivariate Normal and Student t Distribution. R package version 1.6. https://CRAN.R-project.org/package=tmvtnorm.

Roustant, O., Ginsbourger, D. & Deville, Y. Contributors: Chevalier, C. , Richet, Y. DiceKriging: Kriging Methods for Computer Experiments. R package version 1.6.0. https://CRAN.R-project.org/package=DiceKriging.

Byrd, R. H., Lu, P., Nocedal, J. and Zhu, C. (1995). A limited memory algorithm for bound constrained optimization. SIAM Journal on Scientific Computing, 16, 1190–1208. \Sexpr[results=rd]{tools:::Rd_expr_doi("10.1137/0916069")}.

Examples

# ----------------------------------
# A 1D example - sinusoidal function
# ----------------------------------
sinusoidal_function <- function(x) {
  sin(4 * pi * x)
}

# Design of Experiments Xf and the corresponding signs f
Xf <- as.matrix(c(0.07, 0.19, 0.42, 0.56, 0.81, 0.90))
f <- rep(1, length(Xf))
f[(sinusoidal_function(Xf) < 0)] <- -1

# GPC model
GPCmodel <- gpcm(f, Xf, multistart = 3)

# Graphics of predictions
x <- as.matrix(seq(0, 1, length.out = 101))
result <- predict(object = GPCmodel, newdata = x)
probabilities <- result$prob
index <- match(Xf, x)
plot(x, probabilities, pch = "-")
points(Xf[f == 1], probabilities[index[f == 1]], pch = 20, col = "blue")
points(Xf[f == -1], probabilities[index[f == -1]], pch = 20, col = "red")
abline(h = 0.5, lty = 2)
legend("topright",title = "DoE Xf",title.cex = 0.7, legend = c("+", "-"), 
     col = c("blue", "red"), pch = 20)


# ----------------------------------
# A 2D example - Branin-Hoo function
# ----------------------------------

# 30-points DoE, and the corresponding response
d <- 2
nb_PX <- 30
require(DiceDesign)
X <- lhsDesign(nb_PX, d, seed = 123)$design
Xopt <- maximinSA_LHS(X, T0 = 10, c = 0.99, it = 10000)
x <- Xopt$design
require(DiceKriging)
fx <- apply(x, 1, branin)
f <- ifelse(fx < 14, -1, 1)
Xf <- as.matrix(x)

# Fit and create a GPC model without parallelisation
t0 <- proc.time()
GPCmodel <- gpcm(f, Xf, multistart = 3, seed = 123)
t1 = proc.time() - t0
cat(" time elapsed : ",t1[3])
print(GPCmodel)

# Graphics - Predict probabilities
ngrid <- 50
x.grid <- seq(0, 1., length.out = ngrid)
grid <- as.matrix(expand.grid(x.grid, x.grid))
probabilities <- predict(GPCmodel, newdata = grid, light.return = TRUE)
filled.contour(x.grid, x.grid, matrix(probabilities, ngrid, ngrid),
               color.palette = function(n) hcl.colors(n, "RdYlBu", rev = FALSE),
               main = "probabilities map",
               plot.axes = {
                 axis(1)
                 axis(2)
                 points(Xf[f == 1, 1], Xf[f == 1, 2], col = "blue", pch = 21, bg = "blue")
                 points(Xf[f == -1, 1], Xf[f == -1, 2], col = "red", pch = 21, bg = "red")
               }
)

# Fit and create a GPC model with parallelisation
## Use multisession futures
require(future)
plan(multisession)
t0 = proc.time()
GPCmodel2 <-  gpcm(f,Xf, multistart = 3, seed = 123 )
t1 = proc.time() - t0
cat(" time elapsed : ",t1[3])
print(GPCmodel2)
## Explicitly close multisession workers by switching plan
plan(sequential)

GPCsign documentation built on April 4, 2025, 1:55 a.m.