method: Initialize method or type of the model

method R Documentation

Initialize method or type of the model

Description

Functions for initializing the method or type of the model, which can then be passed to gp_init. The supported methods are:

method_full

Full GP: the exact covariance function is used, so the inference is over the n latent function values and the fitting time scales cubically in n.

method_fitc

Fully independent training (and test) conditional (FITC) approximation (see Quiñonero-Candela and Rasmussen, 2005; Snelson and Ghahramani, 2006). The fitting time scales as O(n*m^2), where n is the number of data points and m is the number of inducing points (num_inducing). Unless given by the user, the inducing point locations are chosen using the k-means algorithm.

method_rf

Random features, that is, a linearized GP. The covariance function is approximated with random features (or basis functions), so the inference time scales cubically in the number of approximating basis functions num_basis. For stationary covariance functions, random Fourier features (Rahimi and Recht, 2008) are used; for non-stationary kernels, a case-specific method is used when possible (for example, drawing the hidden layer parameters randomly for cf_nn). For cf_const and cf_lin this means using the standard linear model, with the inference performed in the weight space rather than in the function space. Thus, if the model is linear (only cf_const and cf_lin are used), this can give a large speed-up when the number of features is considerably smaller than the number of data points.
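The random-features idea described above can be illustrated with a minimal base R sketch. It is not gplite's internal implementation: the names omega, phase, lscale and sigma2 are assumptions made for this illustration, and the kernel is a plain squared exponential with unit magnitude. The sketch builds random Fourier features, then shows that weight-space inference (an m x m system) gives the same posterior mean as function-space inference with the same approximate kernel (an n x n system).

```r
# Sketch only: base R, not gplite's internal implementation.
set.seed(1)
n <- 200          # number of data points
d <- 3            # input dimension
m <- 50           # number of random basis functions (cf. num_basis)
lscale <- 1.0     # assumed length-scale of the squared exponential kernel
sigma2 <- 0.25    # assumed Gaussian noise variance

x <- matrix(rnorm(n * d), nrow = n)
y <- sin(x[, 1]) + 0.5 * x[, 2]^2 + x[, 3] + sqrt(sigma2) * rnorm(n)

# Random Fourier features: frequencies are drawn from the spectral density
# of the squared exponential kernel, which is Gaussian with sd 1 / lscale.
omega <- matrix(rnorm(d * m, sd = 1 / lscale), nrow = d)
phase <- runif(m, 0, 2 * pi)
z <- sqrt(2 / m) * cos(sweep(x %*% omega, 2, phase, "+"))  # n x m features

# z %*% t(z) is a Monte Carlo approximation of the exact kernel matrix
k_exact <- exp(-0.5 * as.matrix(dist(x))^2 / lscale^2)
k_approx <- tcrossprod(z)
max_abs_err <- max(abs(k_exact - k_approx))

# Weight-space inference: solve an m x m system, cost O(n * m^2 + m^3)
w_mean <- solve(crossprod(z) + sigma2 * diag(m), crossprod(z, y))
f_weight <- drop(z %*% w_mean)

# Function-space inference with the same approximate kernel:
# solve an n x n system, cost O(n^3)
f_function <- drop(k_approx %*% solve(k_approx + sigma2 * diag(n), y))

# Both routes give the same posterior mean up to numerical error
max(abs(f_weight - f_function))
```

When m is much smaller than n, the weight-space route is the cheaper of the two, which is the source of the speed-up mentioned above.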

Usage

method_full()

method_fitc(
  inducing = NULL,
  num_inducing = 100,
  bin_along = NULL,
  bin_count = 10,
  seed = 12345
)

method_rf(num_basis = 400, seed = 12345)

Arguments

inducing

Inducing points to use. If not given, then num_inducing points will be placed in the input space using a clustering algorithm.

num_inducing

Number of inducing points for the approximation. Will be ignored if the inducing points are given by the user.

bin_along

Either an index or a name of the input variable along which to bin the values before placing the inducing inputs. For example, if bin_along=3, then the input data is divided into bin_count bins along the 3rd input variable, and each bin will have the same number of inducing points (or as close as possible). This can sometimes be useful to ensure that inducing points are spaced evenly with respect to some particular variable, for example time in spatio-temporal models.

bin_count

Number of bins to use. Has an effect only if bin_along is given.

seed

Random seed for reproducible results.

num_basis

Number of basis functions for the approximation.
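To make the interplay of num_inducing, bin_along and bin_count more concrete, the following base R sketch shows one possible way to place inducing points with k-means, optionally within bins along one input variable. The helper place_inducing() is hypothetical and written only for this illustration; it is not part of gplite. In particular, the choice of equal-count bins is an assumption, since this page does not specify how the bins are formed.

```r
# Sketch only: one possible placement scheme, not gplite's code.
# place_inducing() is a hypothetical helper introduced for illustration.
place_inducing <- function(x, num_inducing, bin_along = NULL, bin_count = 10,
                           seed = 12345) {
  set.seed(seed)  # note: this resets the global RNG state
  x <- as.matrix(x)
  if (is.null(bin_along)) {
    # Plain k-means over the whole input space
    return(kmeans(x, centers = num_inducing, iter.max = 50)$centers)
  }
  # Assumption: bins hold roughly equal numbers of points along one input
  bin <- cut(rank(x[, bin_along], ties.method = "first"),
             breaks = bin_count, labels = FALSE)
  # Spread the inducing points over the bins as evenly as possible
  per_bin <- rep(num_inducing %/% bin_count, bin_count)
  extra <- num_inducing %% bin_count
  per_bin[seq_len(extra)] <- per_bin[seq_len(extra)] + 1
  centers <- lapply(seq_len(bin_count), function(b) {
    if (per_bin[b] == 0) return(NULL)
    kmeans(x[bin == b, , drop = FALSE], centers = per_bin[b],
           iter.max = 50)$centers
  })
  do.call(rbind, centers)
}

# Usage on toy data: 20 inducing points, with and without binning
set.seed(1242)
x <- matrix(rnorm(300 * 3), ncol = 3)
z_plain <- place_inducing(x, num_inducing = 20)
z_binned <- place_inducing(x, num_inducing = 20, bin_along = 3, bin_count = 5)
dim(z_binned)
```

Whether such a matrix of locations can be passed directly as the inducing argument depends on the format gplite expects, which is not stated here.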

Value

The method object.

References

Rahimi, A. and Recht, B. (2008). Random features for large-scale kernel machines. In Advances in Neural Information Processing Systems 20.

Quiñonero-Candela, J. and Rasmussen, C. E. (2005). A unifying view of sparse approximate Gaussian process regression. Journal of Machine Learning Research 6:1939-1959.

Snelson, E. and Ghahramani, Z. (2006). Sparse Gaussian processes using pseudo-inputs. In Advances in Neural Information Processing Systems 18.

Examples



library(gplite)

# Generate some toy data
# NOTE: this dataset is so small that in reality there would be no point in
# using a sparse approximation here; we use it only to make this example
# run fast
set.seed(1242)
n <- 50
x <- matrix(rnorm(n * 3), nrow = n)
f <- sin(x[, 1]) + 0.5 * x[, 2]^2 + x[, 3]
y <- f + 0.5 * rnorm(n)
x <- data.frame(x1 = x[, 1], x2 = x[, 2], x3 = x[, 3])

# Full exact GP with Gaussian likelihood
gp <- gp_init(cf_sexp())
gp <- gp_optim(gp, x, y)

# Approximate solution using random features (here we use a very small 
# number of random features only to make this example run fast)
gp <- gp_init(cf_sexp(), method = method_rf(num_basis = 30))
gp <- gp_optim(gp, x, y)

# Approximate solution using FITC (here we use a very small 
# number of inducing points only to make this example run fast)
gp <- gp_init(cf_sexp(), method = method_fitc(num_inducing = 10))
gp <- gp_optim(gp, x, y)


gplite documentation built on Aug. 24, 2022, 9:07 a.m.