emulator_from_data | R Documentation |
Given data from simulator runs, generates a set of Emulator objects, one for each output.
emulator_from_data(
input_data,
output_names,
ranges,
input_names = names(ranges),
emulator_type = NULL,
specified_priors = NULL,
order = 2,
beta.var = FALSE,
corr_name = "exp_sq",
adjusted = TRUE,
discrepancies = NULL,
verbose = interactive(),
na.rm = FALSE,
check.ranges = TRUE,
targets = NULL,
has.hierarchy = FALSE,
covariance_opts = NULL,
...
)
input_data |
Required. A data.frame containing parameter and output values |
output_names |
Required. A character vector of output names |
ranges |
Required if input_names is not given. A named list of input parameter ranges |
input_names |
Required if ranges is not given. The names of the parameters |
emulator_type |
Selects between deterministic, variance, covariance, and multistate emulation |
specified_priors |
A collection of user-determined priors (see description) |
order |
To what polynomial order should regression surfaces be fitted? |
beta.var |
Should uncertainty in the regression coefficients be included? |
corr_name |
If not exp_sq, the name of the correlation structures to fit |
adjusted |
Should the returned emulators be Bayes linear adjusted? |
discrepancies |
Any known internal or external discrepancies of the model |
verbose |
Should status updates be provided? |
na.rm |
If TRUE, removes output values that are NA |
check.ranges |
If TRUE, modifies ranges to a conservative minimum enclosing hyperrectangle |
targets |
If provided, outputs are checked for consistent over/underestimation |
has.hierarchy |
Internal - distinguishes deterministic from hierarchical emulators |
covariance_opts |
User-specified options for emulating covariance matrices |
... |
Any additional parameters for custom correlators or additional verbosity options |
Many of the parameters that can be passed to this function are optional: the minimal operating
example requires input_data, output_names, and one of ranges or input_names. If ranges is
supplied, the input names are inferred from that list, data.frame, or data.matrix; if only
input_names is supplied, then ranges are assumed to be [-1, 1] for each input.
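As a rough illustration of the fallback described above, the following base R sketch builds the ranges that would be assumed when only input_names is given. The object name default_ranges is hypothetical and not part of the package; this is not the package's internal code.

```r
# Parameter names only, with no ranges supplied.
input_names <- c("aSI", "aIR", "aSR")

# Hypothetical reconstruction of the assumed ranges: [-1, 1] for every input.
default_ranges <- stats::setNames(
  rep(list(c(-1, 1)), length(input_names)),
  input_names
)
default_ranges
```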
The ranges can be provided in a few different ways: either as a named list of length-2
numeric vectors (corresponding to lower and upper bounds for each parameter); as a
data.frame with 2 columns and each row corresponding to a parameter; or as a data.matrix
defined in the same way as the data.frame. Where the ranges are provided as a data.frame
or data.matrix, the row.names of the data object must be set to the parameter names; a
warning will be given if they are not.
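The three accepted layouts can be sketched in base R as follows, using the SIR parameter ranges from the examples. The helper to_range_list is hypothetical and only shows that the layouts carry the same information; it is not a function from the package.

```r
# 1. Named list of length-2 numeric vectors: c(lower, upper).
ranges_list <- list(aSI = c(0.1, 0.8), aIR = c(0, 0.5), aSR = c(0, 0.05))

# 2. data.frame: one row per parameter, two columns, row names = parameter names.
ranges_df <- data.frame(
  lower = c(0.1, 0, 0),
  upper = c(0.8, 0.5, 0.05),
  row.names = c("aSI", "aIR", "aSR")
)

# 3. Numeric matrix with the same layout and row names.
ranges_mat <- data.matrix(ranges_df)

# Hypothetical helper: convert a data.frame or matrix of ranges to a named list.
to_range_list <- function(r) {
  stats::setNames(
    lapply(seq_len(nrow(r)), function(i) unname(unlist(r[i, ]))),
    rownames(r)
  )
}

identical(to_range_list(ranges_df), ranges_list)
identical(to_range_list(ranges_mat), ranges_list)
```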
If the set (input_data, output_names, ranges) is provided and nothing else, then emulators
are fitted as follows. The basis functions and associated regression coefficients are
generated using linear regression up to quadratic order, allowing for cross-terms. These
regression parameters are assumed 'known'.
The correlation function c(x, x') is assumed to be exp_sq and a corresponding Correlator
object is created. The hyperparameters of the correlation structure are determined using a
constrained maximum likelihood argument. This determines the variance, correlation length,
and nugget term.
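For orientation, a squared-exponential correlation function can be sketched in a few lines of base R. This assumes the common form c(x, x') = exp(-|x - x'|^2 / theta^2), where theta is the correlation length; the package's own exp_sq implementation may scale its arguments differently, and exp_sq_corr is a hypothetical name used only here.

```r
# Squared-exponential correlation between two input points (assumed form).
# theta is the correlation length: larger theta gives slower decay with distance.
exp_sq_corr <- function(x, xp, theta) {
  exp(-sum((x - xp)^2) / theta^2)
}

exp_sq_corr(c(0, 0), c(0, 0), theta = 0.5)    # identical points are fully correlated
exp_sq_corr(c(0, 0), c(0.5, 0), theta = 0.5)  # one correlation length apart
```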
The maximum polynomial order of the regression terms is controlled by order; the
regression coefficients themselves can be deemed uncertain by setting beta.var = TRUE
(in which case their values can change in the hyperparameter estimation); the
hyperparameter search can be overridden by specifying ranges for each hyperparameter
using hp_range.
In the presence of expert beliefs about the structure of the emulators, information
can be supplied directly using the specified_priors argument. This can contain
specific regression coefficient values beta and regression functions func,
correlation structures u, hyperparameter values hyper_p, and nugget term
values delta.
Some rudimentary data handling functionality exists, but is not a substitute for
sense-checking input data directly. If TRUE, the na.rm option removes rows of
training data that contain NA values; if TRUE, the check.ranges option allows
a redefinition of the ranges of input parameters for emulator training. The
latter is a common practice in later waves of emulation in order to maximise the
predictive power of the emulators, but should only be used if it is believed that
the training set provided is truly representative of, and spans, the full space of
interest.
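The effect of these two options can be approximated by hand in base R, which is also a convenient way to sense-check the data before fitting. The data below are made up for illustration, and the 5% padding used to make the enclosing box "conservative" is an assumption, not the package's actual rule.

```r
# Made-up training data with one missing input value.
train <- data.frame(
  aSI = c(0.2, 0.4, 0.6, 0.7),
  aIR = c(0.1, 0.2, NA, 0.4),
  nS  = c(500, 420, 380, 300)
)

# Analogue of na.rm = TRUE: drop rows containing any NA.
clean <- train[stats::complete.cases(train), ]

# Analogue of check.ranges = TRUE: smallest box enclosing the training inputs...
tight <- lapply(clean[c("aSI", "aIR")], range)

# ...widened slightly on each side (5% of the width is an assumed choice).
pad <- function(r, frac = 0.05) r + c(-1, 1) * frac * diff(r)
padded <- lapply(tight, pad)
padded
```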
Various different classes of emulator can be created using this function, depending
on the nature of the model. The emulator_type argument accepts a few different
options:
"variance": Create emulators for the mean and variance surfaces, for each stochastic output
"covariance": Create emulators for the mean surface, and a covariance matrix for the variance surface
"multistate": Create sets of emulators per output for multistate stochastic systems
"default": Deterministic emulators with no covariance structure
The "default" behaviour will apply if the emulator_type argument is not supplied, or
does not match any of the above options. If the data provided appears to be stochastic
but default behaviour is used, a warning will be generated and only the first model result
for each individual parameter set will be used in training.
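The reduction to "the first model result for each individual parameter set" can be pictured with the base R sketch below. The data are made up, and the package may implement this step differently; the point is only to show which rows survive.

```r
# Made-up stochastic runs: each parameter set (lambda, mu) is repeated twice.
runs <- data.frame(
  lambda = c(0.02, 0.02, 0.05, 0.05),
  mu     = c(0.10, 0.10, 0.06, 0.06),
  Y      = c(12, 15, 30, 27)
)

# Keep only the first run for each distinct parameter set.
first_only <- runs[!duplicated(runs[c("lambda", "mu")]), ]
first_only
```

If the repeated runs matter, emulator_type = 'variance' (or another stochastic option) is the more appropriate choice, since the default path discards them.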
For examples of this function's usage (including optional argument behaviour), see the examples.
An appropriately structured list of Emulator objects
# Deterministic: use the SIRSample training dataset as an example.
ranges <- list(aSI = c(0.1, 0.8), aIR = c(0, 0.5), aSR = c(0, 0.05))
out_vars <- c('nS', 'nI', 'nR')
ems_linear <- emulator_from_data(SIRSample$training, out_vars, ranges, order = 1)
ems_linear # Printout of the key information.
# Stochastic: use the BirthDeath training dataset
v_ems <- emulator_from_data(BirthDeath$training, c("Y"),
list(lambda = c(0, 0.08), mu = c(0.04, 0.13)), emulator_type = 'variance')
# If different specifications are wanted for the variance and expectation emulators,
# enter a list with entries 'variance' and 'expectation', e.g. for corr_name
v_ems_corr <- emulator_from_data(BirthDeath$training, c("Y"),
list(lambda = c(0, 0.08), mu = c(0.04, 0.13)), emulator_type = 'variance',
corr_name = list(variance = "matern", expectation = "exp_sq")
)
# Excessive runtime
ems_quad <- emulator_from_data(SIRSample$training, out_vars, ranges)
ems_quad # Now includes quadratic terms
ems_cub <- emulator_from_data(SIRSample$training, out_vars, ranges, order = 3)
ems_cub # Up to cubic order in the parameters
ems_unadjusted <- emulator_from_data(SIRSample$training, out_vars, ranges, adjusted = FALSE)
ems_unadjusted # Looks the same as ems_quad, but the emulators are not Bayes Linear adjusted
# Reproduce the linear case, but with slightly adjusted beta values
basis_f <- list(
c(function(x) 1, function(x) x[[1]], function(x) x[[2]]),
c(function(x) 1, function(x) x[[1]], function(x) x[[2]]),
c(function(x) 1, function(x) x[[1]], function(x) x[[3]])
)
beta_val <- list(
list(mu = c(550, -400, 250)),
list(mu = c(200, 200, -300)),
list(mu = c(200, 200, -50))
)
ems_custom_beta <- emulator_from_data(SIRSample$training, out_vars, ranges,
specified_priors = list(func = basis_f, beta = beta_val)
)
# Custom correlation functions
corr_structs <- list(
list(sigma = 83, corr = Correlator$new('exp_sq', list(theta = 0.5), nug = 0.1)),
list(sigma = 95, corr = Correlator$new('exp_sq', list(theta = 0.4), nug = 0.25)),
list(sigma = 164, corr = Correlator$new('matern', list(theta = 0.2, nu = 1.5), nug = 0.45))
)
ems_custom_u <- emulator_from_data(SIRSample$training, out_vars, ranges,
specified_priors = list(u = corr_structs))
# Allowing the function to choose hyperparameters for 'non-standard' correlation functions
ems_matern <- emulator_from_data(SIRSample$training, out_vars, ranges, corr_name = 'matern')
# Providing hyperparameters directly
matern_hp <- list(
list(theta = 0.8, nu = 1.5),
list(theta = 0.6, nu = 2.5),
list(theta = 1.2, nu = 0.5)
)
ems_matern2 <- emulator_from_data(SIRSample$training, out_vars, ranges, corr_name = 'matern',
specified_priors = list(hyper_p = matern_hp))
# "Custom" correlation function with user-specified ranges: gamma exponential
# Any named, defined, correlation function can be passed. See Correlator documentation
ems_gamma <- emulator_from_data(SIRSample$training, out_vars, ranges, corr_name = 'gamma_exp',
specified_priors = list(hyper_p = list(gamma = c(0.01, 2), theta = c(1/3, 2))))
# Multistate emulation: use the stochastic SIR dataset
SIR_names <- c("I10", "I25", "I50", "R10", "R25", "R50")
b_ems <- emulator_from_data(SIR_stochastic$training, SIR_names,
ranges, emulator_type = 'multistate')
# Covariance emulation, with specified non-zero matrix elements
which_cov <- matrix(rep(TRUE, 16), nrow = 4)
which_cov[2,3] <- which_cov[3,2] <- which_cov[1,4] <- which_cov[4,1] <- FALSE
c_ems <- emulator_from_data(SIR_stochastic$training, SIR_names[-c(3,6)], ranges,
emulator_type = 'covariance', covariance_opts = list(matrix = which_cov))