GPCLA: Gaussian Process Classification Model Using the Laplace Approximation


Description

Approximate inference for GP classification using the Laplace approximation; this inference method can be used with a likelihood that accepts only values -1 and 1, such as LikLogis.

Details

In Gaussian process (GP) classification, we consider the likelihood of outcomes y \in \{-1, 1\} to be

∏_i σ(y_i f(x_i)),

where σ is a function mapping the reals to [0, 1]. In many dichotomous models in R, the dichotomous values are constrained to be in {0, 1}; we follow the standard practice in GP models of instead using {-1, 1} as this is more convenient and efficient for likelihood-related computation. If y is provided with some other set of dichotomous values, the outcomes are transformed to {-1, 1}, with a warning (see the formula argument of the new() method for details).
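
For illustration, here is the documented recoding rule applied by hand to a 0/1 outcome; this is a manual sketch of the rule, not the package's code:

y01 <- c(0, 1, 1, 0, 1)
ifelse(y01 == max(y01), 1, -1)   # highest value -> 1, lowest -> -1
#> [1] -1  1  1 -1  1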

We place a GP prior over the values of f(X), which says that before observing y, we believe the values of f(X) are distributed

f(X) ~ N(m(X), K(X, X)),

where m(X) is a mean function and K(X, X) is a covariance function. In other words, the prior mean over function outputs is a function of the predictors, as is the prior covariance over function outputs.

Unlike with GP regression, we cannot simply apply Bayes' rule and derive an analytical posterior. However, we can use Laplace's method to approximate the posterior with a normal distribution:

f | y, X ~ N(μ + K \nabla log p(y | \hat{f}), (K^{-1} + W)^{-1} ),

where K denotes the prior covariance function (and we have suppressed in the notation that here it is evaluated at X), μ is the prior mean of f, and W = -\nabla \nabla log p(y | \hat{f}). For new test cases X*, f* is distributed

f* | y, X, X* ~ N(μ* + K(X*, X) \nabla log p(y | \hat{f}), K(X*, X*) - K(X*, X) (K + W^{-1})^{-1} K(X, X*)),

where μ* is the prior mean function evaluated at X*. (See Rasmussen and Williams (2006), Section 3.4 for details).
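
To make the approximation concrete, here is a minimal, self-contained R sketch of the Laplace approximation for a logistic likelihood with a zero prior mean; the simulated data, the unit-scale squared exponential kernel, and the jitter term are illustrative assumptions, and this is not the package's internal implementation (which follows Algorithm 3.1 with successive over-relaxation; see the train() method below).

## Hypothetical data with outcomes coded {-1, 1}
set.seed(42)
n <- 50
x <- matrix(runif(n, -2, 2), ncol = 1)
y <- ifelse(runif(n) < plogis(3 * x[, 1]), 1, -1)

## Prior covariance K(X, X): squared exponential kernel plus a small jitter
K <- exp(-0.5 * as.matrix(dist(x))^2) + diag(1e-6, n)

## Newton iterations for the posterior mode f_hat
## (cf. Rasmussen and Williams 2006, Section 3.4)
f <- rep(0, n)
for (iter in 1:100) {
    p     <- plogis(f)                 # sigma(f_i) = Pr(y_i = 1 | f_i)
    grad  <- (y + 1) / 2 - p           # gradient of log p(y | f)
    W     <- diag(p * (1 - p))         # W = -Hessian of log p(y | f)
    f_new <- drop(solve(diag(n) + K %*% W, K %*% (W %*% f + grad)))
    if (max(abs(f_new - f)) < 1e-8) { f <- f_new; break }
    f <- f_new
}

## Laplace approximation: f | y, X ~ N(f_hat, (K^{-1} + W)^{-1})
p <- plogis(f)
W <- diag(p * (1 - p))                 # W evaluated at the mode
post_mean <- f
post_cov  <- solve(solve(K) + W)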

Super class

gpmss::GPModel -> GPCLA

Public fields

y

The outcomes; should be a numeric vector. The outcome variable should only have -1 and 1 values. If the outcome has only two values that are not 1 and -1, the highest value is reassigned the value 1 and the lowest value is reassigned the value -1, with a warning. This field is usually generated automatically during object construction and does not generally need to be interacted with directly by the user.

X

The predictors; should be a numeric matrix. This field is usually generated automatically during object construction and does not generally need to be interacted with directly by the user.

terms

The terms object related to the formula provided in model construction (this is useful for later prediction).

force_all_levs

A logical vector of length one recording the user's preference for dropping unused factor levels

meanfun

The prior mean function; must be an object of class MeanFunction

covfun

The prior covariance function; must be an object of class CovarianceFunction

likfun

The likelihood function; must be an object of class LikelihoodFunction

L

A numeric matrix such that L L^T = I + W^{1/2} K W^{1/2}

alpha

A numeric vector equal to \nabla log p(y | \hat{f})

sW

A numeric matrix equal to W^{1/2}

post_mean

A numeric vector giving the posterior mean at the training data

post_cov

A numeric matrix giving the posterior covariance at the training data

prior_mean

A numeric vector giving the prior mean at the training data

marginal_effects

A list with an element for each predictor in the model for which marginal effects have been requested (via the margins() method). Each element is itself a list. If marginal effects on the latent function scale have been requested, this is a list of length two, with an element "mean", each entry i of which gives the mean of the distribution of the marginal effect of the predictor on the ith observation, and an element "covariance", giving the covariance matrix for the distribution of marginal effects of that predictor. If marginal effects on the probability scale have been requested, this is a list of length one with element "draws", giving the draws of the requested marginal effects (see the Details section of the margins() method).

average_marginal_effects

A data frame with a row for each predictor in the model for which marginal effects have been requested (via the margins() method) and columns: "Variable", giving the name of the predictor; "Mean", giving the mean of the average marginal effect of that predictor; "LB", giving the lower bound on the requested confidence interval (see the margins() method); and "UB", giving the upper bound on the CI.
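
As a brief, hypothetical sketch of inspecting these fields on a fitted GPCLA object mod after margins() has been called for a predictor x1 (indexing the marginal_effects list by variable name is an assumption here):

eff <- mod$marginal_effects[["x1"]]
eff$mean                        # pointwise means of the marginal effect (link scale)
eff$covariance                  # covariance matrix of those pointwise effects
mod$average_marginal_effects    # data frame with columns Variable, Mean, LB, UB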

Methods

Public methods


Method new()

Create a new GPModel object

Usage
GPCLA$new(
  formula,
  data,
  likfun = LikLogis,
  meanfun = MeanZero,
  covfun = CovSEard,
  optimize = FALSE,
  force_all_levs = FALSE,
  ...
)
Arguments
formula

A formula object giving the variable name of the outcomes on the left-hand side and the predictors on the right-hand side, a la lm. Note the outcome variable should only have -1 and 1 values. If the outcome has only two values that are not 1 and -1, the highest value is reassigned the value 1 and the lowest value is reassigned the value -1, with a warning.

data

An optional data frame where the variables in formula are to be found. If not found there, we search for the variables elsewhere, generally the calling environment.

likfun

An object inheriting from class LikelihoodFunction, or a generator for such a class. This is the likelihood function for the GP model. The default is LikLogis. If a generator is provided rather than an object, an object will be created with the default value for the hyperparameters.

meanfun

An object inheriting from class MeanFunction, or a generator for such a class. This is the mean function for the GP prior (see Details). The default is MeanZero. If a generator is provided rather than an object, an object will be created with the default value for the hyperparameters.

covfun

An object inheriting from class CovarianceFunction, or a generator for such a class. This is the covariance function for the GP prior (see Details). The default is CovSEard. If a generator is provided rather than an object, an object will be created with the default value for the hyperparameters.

optimize

A logical vector of length one; if TRUE, the hyperparameters of the mean, covariance, and likelihood functions are optimized automatically as part of the model construction process (see Details). The default is FALSE, meaning the hyperparameters given at the time of creating those objects are used for inference.

force_all_levs

A logical vector of length one; if TRUE, unused factor levels in right-hand side variables are not dropped. The default is FALSE.

...

Other arguments specific to a particular model; unused.


Method train()

Train the GP model, providing a characterization of the posterior of the function of interest at the input values given in the training data.

Usage
GPCLA$train(...)
Arguments
...

Additional arguments affecting the inference calculations (unused).

Details

The implementation here follows very closely Algorithm 3.1 in Rasmussen and Williams (2006). The main difference is that, like the MATLAB/Octave software GPML, we use successive over-relaxation in the Newton steps, with an adaptive relaxation factor, though the implementation differs somewhat from GPML.
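
As a hedged sketch of typical use of the constructor and this method (the data frame df and its variables are hypothetical, and train() is called explicitly here in case construction alone does not run it):

library(gpmss)

## Hypothetical data; the 0/1 outcome is recoded to {-1, 1} with a warning
set.seed(1)
df <- data.frame(y  = rbinom(100, 1, 0.5),
                 x1 = rnorm(100),
                 x2 = rnorm(100))

## Defaults documented above: logistic likelihood, zero prior mean, and
## ARD squared exponential covariance; optimize = TRUE would also tune
## the hyperparameters during construction
mod <- GPCLA$new(y ~ x1 + x2, data = df, optimize = FALSE)
mod$train()   # Laplace approximation at the training inputs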


Method predict()

Characterize the posterior predictive distribution of the function of interest at new test points.

Usage
GPCLA$predict(newdata, ...)
Arguments
newdata

A data frame containing the data for the new test points

...

Additional arguments affecting the predictions produced

Details

The implementation here follows very closely Algorithm 3.2 in Rasmussen and Williams (2006).
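
Continuing the hypothetical example above (the prediction grid in newdf is illustrative):

## Posterior of f at new test points, holding x2 at zero
newdf <- data.frame(x1 = seq(-2, 2, length.out = 25), x2 = 0)
pred  <- mod$predict(newdata = newdf)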


Method nlml()

Calculate the negative log marginal likelihood of the GP model.

Usage
GPCLA$nlml(...)
Arguments
...

Additional arguments affecting the calculation


Method dnlml()

Calculate the gradient of the negative log marginal likelihood of the GP model with respect to the hyperparameters of the mean, covariance, and likelihood functions.

Usage
GPCLA$dnlml(...)
Arguments
...

Additional arguments affecting the calculation

Details

The implementation here follows very closely Algorithm 5.1 in Rasmussen and Williams (2006).


Method margins()

Calculate marginal effects of predictors.

Usage
GPCLA$margins(
  variables = NULL,
  base_categories = NULL,
  differences = NULL,
  indices = NULL,
  ci = 0.95,
  force = FALSE,
  type = "link",
  M = 1000,
  ...
)
Arguments
variables

A vector specifying the variables marginal effects are desired for; can be an integer vector, giving the column indices of X to get effects for, or a character vector; if NULL (the default), effects are derived for all variables.

base_categories

If X contains contrasts, the marginal effects will be the differences between the levels and a base level. By default, the base level is the lowest factor level of the contrast, but you can pass a named list to change the base level for some or all of the variables assigned to a contrast scheme.

differences

A named list of length-2 numeric vectors may be provided, giving the low (first element of each vector) and high (second element) values at which to calculate the effect for continuous variables. Any elements giving values for binary or categorical variables are ignored (this is meaningless for binary variables, as there are only two values to choose from, and categorical variables should be controlled with the option base_categories). If NULL (the default), derivatives are used for the marginal effects of continuous predictors.

indices

A numeric vector of indices over which to average marginal effects. If NULL (the default), all observations are used.

ci

A numeric vector of length one such that 0 < ci < 1 giving the width of the confidence interval for the average marginal effects. The default is 0.95, corresponding to a 95% confidence interval.

force

A logical vector of length one; should marginal effects be re-calculated for variables that have already had effects calculated? The default is FALSE. (This is useful in case the user has requested that effects for continuous variables be calculated as differences but would now prefer derivatives, or vice versa.)

type

A character vector of length one; should be one of "link" or "response". If "link", marginal effects are calculated on the latent function scale (i.e. change in f). If "response", marginal effects are calculated on the probability scale (i.e. change in probability of positive response). The default is "link".

M

The number of marginal effect draws to take if type == "response". The default is 1000.

...

Other arguments affecting the calculations (unused)

Details

The derivative of a GP is also a GP, giving us easy access to the distribution of marginal effects of predictors. The first time this method is called, it calculates and stores the distribution of pointwise partial derivatives of f for each specified predictor in the model (if no predictors are specified, marginal effects are calculated for all predictors).

A user can request marginal effects on the probability scale, i.e. change in probability of positive outcome, by specifying type = "response", or marginal effects on the latent function scale, i.e. change in f, by specifying type = "link". The latter are sometimes called "partial effects." The partial effects can be calculated directly, but the marginal effects must be simulated.

If a predictor is binary, the distribution of the difference in f between the variable's highest value and its lowest value is calculated instead (or, if type = "response", the distribution of the difference in the probability of a positive response is simulated). If a predictor is categorical, a similar calculation is made, but comparing each of the labels except one to a baseline label. The user may specify a similar calculation for two user-specified values of a continuous variable rather than using partial derivatives (in some cases this may be more easily interpretable). Additional calls to margins() may calculate marginal effects for predictors whose marginal effects had not yet been requested, but marginal effects for variables that have already been calculated are not re-calculated unless force = TRUE.

Every time the method is called, it stores a data frame of average marginal effects; it calculates the mean, lower bound, and upper bound (the width of the confidence interval is specifiable) of the marginal effect of each predictor over all observations, or over the specified indices of observations if provided.
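
A hedged sketch of these options, continuing the hypothetical example above (the variable names and values are illustrative):

## Derivative-based effects on the latent function scale for all predictors
mod$margins()

## For x1 only: a difference from -1 to 1 rather than a derivative, on the
## probability scale (simulated with M draws), with a 90% interval; force = TRUE
## re-calculates the effect already stored for x1
mod$margins(variables = "x1",
            differences = list(x1 = c(-1, 1)),
            type = "response", M = 1000, ci = 0.90, force = TRUE)

mod$average_marginal_effects   # Variable, Mean, LB, UB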


Method clone()

The objects of this class are cloneable with this method.

Usage
GPCLA$clone(deep = FALSE)
Arguments
deep

Whether to make a deep clone.

