emulator_from_data: Generate Prior Emulators from Data


View source: R/modelbuilding.R

Description

Given data from a simulation, generates a set of Emulator objects based on fitted values.

Usage

emulator_from_data(
  input_data,
  output_names,
  ranges,
  input_names = names(ranges),
  beta,
  u,
  c_lengths,
  funcs,
  bucov,
  deltas,
  ev,
  quadratic = TRUE,
  beta.var = FALSE,
  lik.method = "my"
)

Arguments

input_data

Required. A data.frame containing the input parameters and output values from a set of simulator runs.

output_names

Required. The list of outputs to emulate from input_data.

ranges

A named list of parameter ranges.

input_names

A list of input names, used if ranges is not provided.

beta

Optional: specifications for the regression coefficients, given as a list of lists list(mu, sigma), as in the Emulator specification.

u

Optional: the correlation structure for each output, given as a list of lists list(mu, sigma, corr).

c_lengths

Optional: a set of correlation lengths.

funcs

Optional: basis functions for the regression surface.

bucov

Optional: a list of functions giving the covariance between each of the beta parameters and u(x).

deltas

Optional: the nugget terms to include in u(x).

ev

Optional: used for determining the nugget terms in the absence of deltas.

quadratic

Optional: should the regression surface be linear or quadratic? Default: TRUE (quadratic).

beta.var

Optional: should the beta coefficient be assumed to be known or should model variance be included?

lik.method

Optional: method used to determine hyperparameters sigma and theta.

Details

Many of the parameters that can be passed to this function are optional; the bare minimum is input_data, output_names, and one of ranges or input_names. If ranges is specified, then the input names are taken from that; if only input_names is specified, then it is assumed that all input values in input_data are already scaled to [-1, 1].
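As an illustration of the scaling convention mentioned above, the following is a minimal sketch of mapping each input from its range onto [-1, 1] by hand. The helper scale_to_unit and the small runs data frame are hypothetical and are not part of the package; the package's internal scaling may differ in detail.

```r
# Hypothetical helper (not part of the package): rescale each input column
# from its stated range [lo, hi] onto [-1, 1]. Output columns are untouched.
scale_to_unit <- function(input_data, ranges) {
  scaled <- input_data
  for (nm in names(ranges)) {
    lo <- ranges[[nm]][1]
    hi <- ranges[[nm]][2]
    scaled[[nm]] <- 2 * (input_data[[nm]] - lo) / (hi - lo) - 1
  }
  scaled
}

# Made-up runs, for illustration only
ranges <- list(aSI = c(0.1, 0.8), aIR = c(0, 0.5))
runs <- data.frame(
  aSI = c(0.1, 0.45, 0.8),
  aIR = c(0, 0.25, 0.5),
  nS  = c(900, 500, 100)
)
scaled_runs <- scale_to_unit(runs, ranges)
scaled_runs
```

Data scaled in this way could then be passed with input_names alone, since the function assumes such inputs already lie in [-1, 1].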

If the minimum information is provided, then a model is fitted as follows.

The basis functions and regression coefficients are generated using the lm function, with either only linear terms or terms up to quadratic order (depending on the value of quadratic), performing stepwise addition or deletion as appropriate; in either case, the AIC criterion is used to select the terms. The regression parameters thus derived are assumed to be known if beta.var=FALSE, so that beta$sigma = diag(0). Otherwise, the covariance matrix for the parameters is taken from vcov(model).
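The regression step described above can be sketched with base R alone. This is an assumption-laden illustration rather than the package's actual code: the synthetic data, the choice of starting from an intercept-only model, and direction = "both" are all choices made for the example.

```r
set.seed(1)

# Synthetic simulator output on inputs already scaled to [-1, 1]
n <- 60
dat <- data.frame(x1 = runif(n, -1, 1), x2 = runif(n, -1, 1))
dat$y <- 2 + 3 * dat$x1 - 1.5 * dat$x1^2 + rnorm(n, sd = 0.1)

# Start from the intercept-only model and let AIC decide which terms,
# up to quadratic order, are worth including
null_model <- lm(y ~ 1, data = dat)
upper <- y ~ x1 + x2 + I(x1^2) + I(x2^2) + x1:x2
fit <- step(null_model,
            scope = list(lower = y ~ 1, upper = upper),
            direction = "both", trace = 0)

beta_mu <- coef(fit)

# beta.var = FALSE: coefficients treated as known, so zero covariance
beta_sigma_known <- diag(0, length(beta_mu))

# beta.var = TRUE: covariance taken from the fitted model
beta_sigma_est <- vcov(fit)
```

Setting quadratic = FALSE would correspond to restricting the upper scope to the linear terms x1 + x2.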

The correlation function c(x,x') is taken to be exp_sq. The correlation length is chosen using the Durham heuristic, which states that the correlation length should lie within [1/(n+1), 2/(n+1)], where n is the degree of the fitted surface (and the range of each parameter is [-1,1]). Maximum likelihood estimation is then applied over this range to find an acceptable correlation length, and the corresponding standard error is used as an estimate for the variance of the correlation structure. The expectation E[u(x)] is assumed to be 0.
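To make the heuristic concrete, the sketch below computes the suggested interval for linear and quadratic surfaces and evaluates a squared-exponential correlation at a length inside it. The function names theta_interval and exp_sq_corr are hypothetical, and the form exp(-|x - x'|^2 / theta^2) is assumed here as the usual squared-exponential definition; the maximum likelihood step is not reproduced.

```r
# Interval suggested by the heuristic for a fitted surface of a given degree
theta_interval <- function(degree) {
  c(1 / (degree + 1), 2 / (degree + 1))
}

# Squared-exponential correlation between two points (assumed form)
exp_sq_corr <- function(x, xp, theta) {
  exp(-sum((x - xp)^2) / theta^2)
}

theta_interval(1)  # linear surface: lengths between 1/2 and 1
theta_interval(2)  # quadratic surface: lengths between 1/3 and 2/3

# Pick the midpoint of the quadratic interval purely for illustration
theta <- mean(theta_interval(2))

corr_same <- exp_sq_corr(c(0, 0), c(0, 0), theta)    # identical points
corr_near <- exp_sq_corr(c(0, 0), c(0.5, 0), theta)  # points 0.5 apart
```

A higher-degree regression surface leads to a shorter suggested correlation length, so the correlation between nearby points decays more quickly with distance.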

If delta terms are provided, then the nugget terms for each emulator are defined using these. If they are not provided but a list of variabilities for each output is supplied in ev, then a rough estimate of the nugget terms is made and assigned to the emulators. If neither is provided, the nugget terms are assumed to be identically zero for each emulator.

The covariance between beta and u(x) is assumed to vanish.

Value

A list of objects of class Emulator.

Examples

# Use the GillespieSIR dataset
ranges <- list(aSI = c(0.1, 0.8), aIR = c(0, 0.5), aSR = c(0, 0.05))
out_vars <- c('nS', 'nI', 'nR')
ems_linear <- emulator_from_data(GillespieSIR, output_names = out_vars,
  ranges = ranges, quadratic = FALSE)
ems_linear # Printout of the key information

ems <- emulator_from_data(GillespieSIR, output_names = out_vars,
  ranges = ranges, quadratic = TRUE)
ems # Now includes quadratic terms (but only where they're warranted)

ems2 <- emulator_from_data(GillespieSIR, output_names = out_vars,
  ranges = ranges, c_lengths = c(0.55, 0.6, 0.59),
  deltas = c(0.1, 0.2, 0.2), quadratic = TRUE)
ems2 # Broadly the same, but with the correlation structure modified.

ems2_beta <- emulator_from_data(GillespieSIR, output_names = out_vars,
  ranges = ranges, c_lengths = c(0.55, 0.6, 0.59),
  deltas = c(0.1, 0.2, 0.2), quadratic = TRUE, beta.var = TRUE)

Tandethsquire/emulatorr documentation built on April 12, 2021, 1:08 a.m.