gp: Gaussian process models


Description

Fitting, summarizing and predicting from Gaussian process models using a range of inference methods.

Usage

gp(formula, data, family = gaussian, weights = NULL, mean_function = NULL,
  inducing_data = NULL, inference = c("default", "full", "FITC", "Laplace"),
  hyperinference = c("default", "BFGS", "BFGSrestarts", "none"),
  verbose = FALSE)

## S3 method for class 'gp'
print(x, ...)

Arguments

formula

an object of class formula giving a symbolic description of the model to be fitted - the right hand side must be a valid kernel object or the commands to construct one.

data

a data frame containing the covariates against which to model the response variable. This must have the same number of rows as response and contain named variables matching those referred to by kernel.

family

a family object giving the likelihood and link function to be used to fit the model. Currently only gaussian, poisson and binomial (only Bernoulli) are supported.

weights

an optional vector of 'prior weights' to be used in the fitting process. Should be NULL or a numeric vector.

mean_function

an optional function specifying the prior over the mean of the gp, in other words a 'first guess' at what the true function is. This must act on a dataframe with named variables matching some of those in data and return a vector giving a single value for each row in the dataframe. Note that this function must return a prediction on the scale of the link, rather than the response. If NULL, the prior mean is assumed to be 0 for all observations.

inducing_data

an optional dataframe containing the locations of inducing points to be used when carrying out sparse inference (e.g. FITC). This must contain variables with names matching those referenced by kernel and mean_function. This should have fewer rows than data and response.

inference

a string specifying the inference method to be used to estimate the values of the latent parameters. If 'default' an appropriate method is picked for the likelihood specified. See details section for the list of default inference methods.

hyperinference

the method to be used for inference on the hyperparameters (parameters of the kernel). BFGS carries out straightforward optimisation starting from the current kernel parameters using gradient-free BFGS. Because the likelihood surface is rarely convex, this is not advised for general use. BFGSrestarts runs gradient-free BFGS 5 times, each starting from a different randomly chosen set of parameters, which makes it less dependent on any single starting point. none does no inference on the hyperparameters. default currently switches to none in all cases, though in future it may depend on other arguments.

verbose

whether to return non-critical information to the user during model fitting.

x

an object of class gp, constructed by the function gp, giving a fitted Gaussian process model.

...

additional arguments for compatibility with generic functions
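
The restart strategy described for hyperinference = 'BFGSrestarts' can be illustrated with a minimal sketch in base R. This is not gpe's internal code: toy_objective and optimise_with_restarts are hypothetical names, the objective is an arbitrary non-convex function standing in for a likelihood surface, and it is an assumption that 'gradient-free' corresponds to calling optim() without an analytic gradient.

```r
# Sketch only: random-restart BFGS on a toy objective using base R's optim().
# toy_objective and optimise_with_restarts are hypothetical, not gpe functions.
set.seed(1)

# a non-convex toy objective with several local minima
toy_objective <- function(par) sum(sin(3 * par) + 0.1 * par ^ 2)

optimise_with_restarts <- function(fn, n_par, n_restarts = 5,
                                   lower = -5, upper = 5) {
  fits <- lapply(seq_len(n_restarts), function(i) {
    # a different random starting point for each restart
    start <- runif(n_par, lower, upper)
    # no gradient function is supplied, so optim() approximates the
    # gradient by finite differences
    optim(start, fn, method = "BFGS")
  })
  # keep the restart that reached the lowest objective value
  values <- vapply(fits, function(fit) fit$value, numeric(1))
  fits[[which.min(values)]]
}

best <- optimise_with_restarts(toy_objective, n_par = 2)
best$par
best$value
```

A single BFGS run returns whichever local optimum is nearest its starting point. Taking the best of several runs only costs more evaluations, which is the trade-off the argument description alludes to.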

Details

The default inference method for a model with the family gaussian(link = 'identity') is full direct inference ('full'). For binomial(link = 'logit') and binomial(link = 'probit') the default is full Laplace inference ('Laplace'), though note that only Bernoulli data is handled at the moment. Sparse inference can be carried out by specifying inference = 'FITC'; this is currently only available for a model with a Gaussian likelihood.
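
The defaults stated above can be summarised as a small lookup. The helper below is a hypothetical illustration written for this page, not a function exported by gpe. It covers only the family and link combinations named in this section and returns NA for anything else.

```r
# Hypothetical helper (not part of gpe): maps a family object to the default
# inference method named in the Details text.
default_inference <- function(family) {
  # accept either a family function (gaussian) or a family object (gaussian())
  if (is.function(family)) family <- family()
  if (family$family == "gaussian" && family$link == "identity") {
    return("full")
  }
  if (family$family == "binomial" && family$link %in% c("logit", "probit")) {
    return("Laplace")
  }
  # no default is stated in the Details text for other combinations
  NA_character_
}

default_inference(gaussian)
default_inference(binomial(link = "probit"))
```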

Value

A fitted gp object. Apart from the print method, there are not yet any associated functions, though more are planned.

Examples

# make some fake data
n <- 100  # observations
m <- 10  # inducing points

# dataframes
df <- data.frame(x = sort(runif(n, -5, 5)))
inducing_df <- data.frame(x = sort(runif(m, -5, 5)))
prediction_df <- data.frame(x = seq(min(df$x), max(df$x), len = 500))

# fake Gaussian response data
f <- sin(df$x)
y <- rnorm(n, f, 1)

# fit a full (non-sparse) GP model (without updating the hyperparameters) 
# as this is the default. Notice we add the observation error to the kernel.
m1 <- gp(y ~ rbf('x') + iid(), df, gaussian)

# fit another with FITC sparsity, using the m inducing points defined above
m2 <- gp(y ~ rbf('x') + iid(), df, gaussian, inference = 'FITC',
         inducing_data = inducing_df)

# summary stats, other associated functions still to come

# construct a poisson response variable
y2 <- rpois(n, exp(f))

# fit a GP model by Laplace approximation
# (note no observation error in this model)
m3 <- gp(y2 ~ rbf('x'), df, poisson)


print(m3)
m3
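
As a further sketch, the mean_function argument expects a function that takes a dataframe and returns one value per row on the scale of the link. The function below is a hypothetical example: the name linear_mean and its intercept and slope are arbitrary choices for illustration. The gp call is left commented out so that the block runs without the package installed.

```r
# Sketch only: a prior mean function suitable for the mean_function argument.
# linear_mean is a hypothetical name; the coefficients are arbitrary.
linear_mean <- function(data) {
  # returns a single value for each row, on the scale of the link
  0.5 + 0.1 * data$x
}

# check it on a small dataframe with the same variable name as the model data
toy_df <- data.frame(x = c(-5, 0, 5))
linear_mean(toy_df)

# it could then be supplied when fitting, for example:
# m4 <- gp(y ~ rbf('x') + iid(), df, gaussian, mean_function = linear_mean)
```

For a non-identity link, such as the log link of poisson, the returned values would be read on the link scale (here, the log of the expected count) rather than as expected responses.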

goldingn/gpe documentation built on May 17, 2019, 7:41 a.m.