optimize_gp: Train a GP via type-II maximum likelihood.

View source: R/optimize_gp.R

Description

Train a full or sparse GP via type-II maximum likelihood.

Usage

optimize_gp(
  y,
  xy,
  cov_fun = "sqexp",
  cov_par_start,
  mu,
  family = "gaussian",
  nugget = TRUE,
  sparse = FALSE,
  xu_opt = NULL,
  xu = NULL,
  muu = NULL,
  vi = FALSE,
  opt = NULL,
  verbose = FALSE,
  file_path = NA,
  ...
)

Arguments

y

Vector of the observed response values.

xy

Matrix of the observed input/covariate values. Rows correspond to elements of y.

cov_fun

Character string specifying the GP covariance function. Currently supported values are "sqexp" for squared exponential and "ard" for automatic relevance determination.

cov_par_start

Named list of initial values for the covariance parameters. Always requires "sigma" and "tau" which are the standard deviations for the latent GP and the noise, respectively. In the case of the squared exponential covariance function, "l" must also be specified. In the case of the ARD covariance function, "l1", "l2", ..., "ld" must be specified where d is the dimension of the input space/number of columns of xy.
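
As a minimal sketch of the naming convention described above, the following builds starting values for both covariance functions. The toy xy matrix and all numeric values are arbitrary assumptions for illustration, not recommended defaults.

```r
# Toy input matrix with d = 3 columns (hypothetical data for illustration).
xy <- matrix(seq(0, 1, length.out = 30), ncol = 3)
d <- ncol(xy)

# Squared exponential: "sigma", "l", and "tau" are required.
cov_par_sqexp <- list(sigma = 1, l = 0.5, tau = 0.1)

# ARD: one length-scale per input dimension, named "l1", ..., "ld".
length_scales <- setNames(as.list(rep(0.5, d)), paste0("l", seq_len(d)))
cov_par_ard <- c(list(sigma = 1), length_scales, list(tau = 0.1))

# Inspect the resulting names.
names(cov_par_ard)
```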

mu

Vector of the same length as y specifying the mean of the GP.

family

Character string, one of "gaussian", "bernoulli", or "poisson", specifying the conditional distribution of y given the latent function.

nugget

Logical value indicating whether to estimate the nugget alongside other covariance parameters.

sparse

Logical value indicating whether to fit a sparse or full GP.

xu_opt

Character string denoting how to select knots in the case of a sparse model. "fixed" fixes the knots to initial values. "random" uses the OAT knot selection algorithm with a best of random subset proposal. "oat" uses Bayesian optimization to propose a knot. "simultaneous" simultaneously optimizes knots alongside covariance parameters.

xu

Matrix of initial knots. Number of columns should match that of xy.

muu

Vector of the marginal mean of the GP at the knots.
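
The shapes expected for xu and muu can be sketched as follows. Drawing the initial knots as a random subset of the observed inputs and using a zero marginal mean are assumptions made for this example only; the data are simulated.

```r
set.seed(1)
n <- 50
xy <- matrix(runif(n * 2), ncol = 2)   # hypothetical 2-D inputs

k <- 5                                  # number of initial knots (arbitrary)
knot_rows <- sample(n, k)               # random subset of observed rows
xu <- xy[knot_rows, , drop = FALSE]     # stays a matrix with ncol(xy) columns
muu <- rep(0, nrow(xu))                 # one marginal mean value per knot

dim(xu)
```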

vi

Logical value indicating whether to fit the sparse model using variational inference. This only works for Gaussian data.

opt

A named list of additional options to control the optimization algorithm and the knot selection procedure. The only arguments that will likely need to be set are "delta", "maxknot", and potentially "TTmax". All others should typically work well with default values.

decay: Controls how quickly previous gradients are forgotten. A value of zero recovers gradient ascent, i.e. previous gradients are forgotten entirely. Values close to 1 indicate that the current gradient has little influence over the step direction. Defaults to 0.95.

epsilon: Small, positive value partially controlling the step size. Defaults to 1e-6.

eta: Additional parameter >= 1 added to Adadelta that shrinks step sizes when gradients oscillate between negative and positive values. If equal to 1, recovers Adadelta. Defaults to 1e3.

maxit: Maximum number of gradient ascent iterations. Defaults to 1000.

obj_tol: Objective function tolerance. Convergence declared when change in the objective function falls below this value. Defaults to 1e-3.

grad_tol: Gradient tolerance. Additional tolerance parameter controlling convergence of gradient ascent. Convergence declared if objective tolerance is met AND all absolute gradients fall below a threshold. Defaults to Inf so that only obj_tol controls convergence.

maxit_nr: Maximum number of Newton-Raphson iterations if data is non-Gaussian.

delta: Small, positive quantity added to the marginal GP variances to ensure well-conditioned matrices. Defaults to 1e-6. Numerical problems can occur when sigma^2 / (tau^2 + delta) > 1e6.

tol_nr: Objective function and gradient tolerance level for the Newton-Raphson algorithm. Defaults to 1e-5. Small tolerance levels here are recommended, but can sometimes slow down training.

maxknot: Maximum number of allowable knots.

cov_diff: Logical value indicating whether to treat the covariance function as if it is not differentiable, so that each added knot is not optimized continuously.

chooseK: Logical value indicating whether to select the total number of knots or continue adding knots until maxknot is reached.

TTmax: Maximum number of candidate knot proposals to test in the OAT algorithm.

TTmin: Minimum number of candidate knot proposals to test in the OAT algorithm.

ego_cov_par: Named list of initial covariance parameter values for the meta GP in the case that Bayesian optimization is used for knot proposals. Values should not need to be changed from the default.
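
A minimal sketch of how an opt list might be assembled, together with the conditioning ratio mentioned under delta and the convergence rule described under obj_tol and grad_tol. The helper functions variance_ratio and converged are hypothetical names written for this example and are not part of the package. Reading "change in the objective function" as an absolute difference is also an assumption.

```r
# Hypothetical starting values; sigma and tau are standard deviations.
cov_par_start <- list(sigma = 10, l = 1, tau = 0.001)

# The options the text says are most likely to need setting (values arbitrary).
opt <- list(delta = 1e-6, maxknot = 30, TTmax = 20)

# Ratio that the delta entry warns about: sigma^2 / (tau^2 + delta).
variance_ratio <- function(cov_par, delta) {
  cov_par$sigma^2 / (cov_par$tau^2 + delta)
}

ratio <- variance_ratio(cov_par_start, opt$delta)
if (ratio > 1e6) {
  message("sigma^2 / (tau^2 + delta) exceeds 1e6; numerical problems are possible.")
}

# A larger delta lowers the ratio for the same sigma and tau.
ratio_larger_delta <- variance_ratio(cov_par_start, 1e-3)

# Convergence rule as described: objective tolerance met AND all absolute
# gradients below grad_tol. With grad_tol = Inf only obj_tol matters.
converged <- function(obj_old, obj_new, grad, obj_tol = 1e-3, grad_tol = Inf) {
  abs(obj_new - obj_old) < obj_tol && all(abs(grad) < grad_tol)
}
```

With these helpers, converged(-10, -10.0005, c(5, -3)) is TRUE under the default grad_tol, while the same call with grad_tol = 1 is FALSE because the gradients are still large.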

verbose

Logical value indicating whether to print the iteration and the current covariance parameter/knot gradient value.

file_path

Character string denoting the path to the file where you would like to save the trained model.

...

Additional, model-dependent arguments. Currently the only use for this is with Poisson response values. In this case, the user should specify the variable m, a vector of the same length as y. It scales the Poisson mean: E(Y) = m*exp(f(x)), where f(x) is the value of the latent GP at x.
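
To illustrate the Poisson mean structure E(Y) = m*exp(f(x)), here is a small simulation sketch. The sine curve standing in for the latent GP and the constant value of m are assumptions chosen for illustration.

```r
set.seed(42)
n <- 100
x <- seq(0, 1, length.out = n)

f <- sin(2 * pi * x)     # stand-in for the latent GP values f(x)
m <- rep(10, n)          # hypothetical multiplier, same length as y
lambda <- m * exp(f)     # E(Y) = m * exp(f(x))

y <- rpois(n, lambda)    # simulated Poisson counts
```

When fitting such data, m would be passed by name through the ... argument alongside family = "poisson".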

Value

Returns a list of values corresponding to the fitted GP. These include:

sparse: Logical value indicating whether the model is sparse or not.

family: Character string indicating the conditional distribution of the data given the latent function

delta: A number corresponding to the value of delta used in the fitted model to stabilize relevant matrix operations.

xu_init: A matrix of initial knot values in the case that a sparse model was fit.

results: A list with the following elements:

results$cov_par: A named list with the fitted covariance parameter values

results$cov_fun: Character string indicating the covariance function.

results$xu: The fitted, final knot locations

results$obj_fun: A vector of objective function values for every gradient ascent step if OAT is not used, or the optimized objective function after adding each knot if OAT is used.

results$u_mean: The posterior mean of the latent function at the knots.

results$u_var: The posterior variance-covariance matrix of the latent function values at the knots.

results$muu: The marginal mean of the latent function at the knots.

results$mu: The marginal mean of the latent function at the observed data locations

results$cov_par_history: Either the covariance parameter values at every gradient ascent step if OAT is not used, or the optimized covariance parameter values after adding each knot if OAT is used.

results$ga_steps: A vector giving the number of gradient ascent steps, potentially for every added knot if OAT is used.

results$u_post: A list where each element is a list showing the posterior mean and variance-covariance matrix of the latent function at the knots after adding each knot.
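
The following is an untested sketch of a full (non-sparse) Gaussian fit and of reading the returned list, written only from the signature and return values documented above. The simulated data and starting values are assumptions, and the call is skipped when the sparseRGPs package is not installed.

```r
set.seed(123)
n <- 40
xy <- matrix(seq(0, 5, length.out = n), ncol = 1)   # one input column
y <- sin(xy[, 1]) + rnorm(n, sd = 0.2)              # noisy toy response

if (requireNamespace("sparseRGPs", quietly = TRUE)) {
  fit <- sparseRGPs::optimize_gp(
    y = y,
    xy = xy,
    cov_fun = "sqexp",
    cov_par_start = list(sigma = 1, l = 1, tau = 0.5),  # arbitrary start
    mu = rep(0, n),                                      # zero mean (assumed)
    family = "gaussian",
    nugget = TRUE,
    sparse = FALSE
  )

  # Elements named in the Value section.
  print(fit$results$cov_par)             # fitted covariance parameters
  print(tail(fit$results$obj_fun, 1))    # last recorded objective value
  print(fit$results$ga_steps)            # number of gradient ascent steps
}
```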


nategarton13/sparseRGPs documentation built on May 27, 2020, 9:46 a.m.