generate_new_runs: Generate Simulator Runs
In Tandethsquire/emulatorr: Emulation and History Matching Package


A wrapper for a variety of sampling methods. Given a set of trained emulators, finds the next set of points that will be informative for the next wave of emulators.

generate_new_runs(
  emulators,
  ranges,
  n_points = 10 * length(ranges),
  z,
  method = "importance",
  include_line = TRUE,
  cutoff = 3,
  nth = 1,
  plausible_set,
  burn_in = FALSE,
  verbose = TRUE,
  ...
)

`emulators`	A list of `Emulator` objects, trained on the design points
`ranges`	The ranges of the input parameters
`n_points`	Optional. Specifies how many additional points are required. Default: 10 * length(ranges), i.e. ten times the number of input parameters.
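`z`	The target observations (in the examples below, a list with one `val`/`sigma` pair per emulator). Used to check the implausibility of sample points, so that only non-implausible points are kept.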
`method`	Any of 'lhs', 'slice', 'optical', 'importance'. Default: 'importance'.
`include_line`	Should line sampling be applied after point generation? Default: TRUE.
`cutoff`	Optional. The implausibility cutoff used when filtering points against z. Default: 3.
`nth`	Optional. To be passed to the n parameter of nth_implausible. Default: 1.
`plausible_set`	Optional - a set of non-implausible points from which to start.
`burn_in`	If importance sampling, should a burn-in phase be used? Default: FALSE
`verbose`	Should progress statements be made? Default: TRUE
`...`	Any parameters that need to be passed to a particular method (see below)

If the method is 'lhs', this creates a new training set using LHS, and then finds the trace of the variance matrix of the emulators across these points (this is broadly equivalent to using V-optimality). We repeat this for n_runs, and select the configuration that minimises the mean of the variances across the emulators. If observations are given, then these are used to ensure that the new sample points are non-implausible according to the current emulators.
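The selection step described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: simple_lhs and toy_variance are hypothetical helpers, and toy_variance is a made-up stand-in for the emulator variance (it simply grows with distance from the centre of the ranges).

```r
set.seed(1)
ranges <- list(aSI = c(0.1, 0.8), aIR = c(0, 0.5), aSR = c(0, 0.05))

# Hypothetical stand-in for the emulator variance at each row of pts:
# larger the further a point is from the centre of the ranges.
centre <- sapply(ranges, mean)
width <- sapply(ranges, diff)
toy_variance <- function(pts) {
  scaled <- sweep(sweep(as.matrix(pts), 2, centre, "-"), 2, width, "/")
  rowSums(scaled^2)
}

# Hypothetical helper: a basic Latin hypercube with one point per stratum
# in each parameter, rescaled to the parameter ranges.
simple_lhs <- function(n, ranges) {
  cols <- lapply(ranges, function(r) r[1] + (sample(n) - runif(n)) / n * diff(r))
  as.data.frame(cols)
}

# Generate n_runs candidate designs and keep the one whose mean
# (stand-in) variance is smallest.
n_runs <- 20
candidates <- lapply(seq_len(n_runs), function(i) simple_lhs(10, ranges))
scores <- sapply(candidates, function(design) mean(toy_variance(design)))
best_design <- candidates[[which.min(scores)]]
```

With real emulators, the score would be computed from their variances at the candidate points rather than from toy_variance.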

If the method is 'slice', then a known set of non-implausible points plausible_set must be provided. It then applies slice sampling, using implausibility as a measure of success.
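As a rough illustration of slice sampling with implausibility as the acceptance test, the sketch below moves one coordinate at a time and shrinks the proposal interval towards the current point whenever a proposal is rejected. It is a generic shrinkage scheme under stated assumptions, not necessarily what the package does: slice_step, toy_imp and is_ok are hypothetical, and toy_imp is a made-up implausibility that is below the cutoff of 3 inside a unit disc.

```r
set.seed(1)
ranges <- list(a = c(-2, 2), b = c(-2, 2))

# Hypothetical implausibility measure and acceptance test (cutoff 3).
toy_imp <- function(p) 3 * sqrt(sum(p^2))
is_ok <- function(p) toy_imp(p) <= 3

# One slice-style update of coordinate j: propose uniformly in the current
# interval; on rejection, shrink the interval towards the current value.
slice_step <- function(x, j, range, is_ok) {
  lo <- range[1]
  hi <- range[2]
  repeat {
    prop <- x
    prop[j] <- runif(1, lo, hi)
    if (is_ok(prop)) return(prop)
    if (prop[j] < x[j]) lo <- prop[j] else hi <- prop[j]
  }
}

# Start from a known non-implausible point and cycle through the coordinates.
x <- c(a = 0.2, b = -0.1)
chain <- matrix(NA_real_, nrow = 50, ncol = 2,
                dimnames = list(NULL, names(ranges)))
for (i in seq_len(50)) {
  j <- (i - 1) %% 2 + 1
  x <- slice_step(x, j, ranges[[j]], is_ok)
  chain[i, ] <- x
}
```

Every row of chain is non-implausible by construction, which is why a non-implausible starting point is required.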

If the method is 'optical', then the optical depth of the space in each parameter direction is calculated (using a known set of non-implausible points plausible_set), and used as a distribution for that parameter. Points are sampled from the collection of distributions, and any implausible points generated are filtered out. From the remaining non-implausible points, a sample of the required size is selected using a maximin criterion.
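The final maximin selection can be sketched as follows. This uses a simple greedy heuristic as an assumption, and the package may select its subsample differently; greedy_maximin is a hypothetical helper and pool is random data standing in for the filtered candidate points.

```r
set.seed(1)

# Pretend pool of non-implausible candidates (random, purely illustrative).
pool <- data.frame(aSI = runif(200, 0.1, 0.8), aIR = runif(200, 0, 0.5))

# Greedy maximin: repeatedly add the candidate whose distance to its
# nearest already-chosen point is largest. Distances are computed on
# standardised columns so that no parameter dominates.
greedy_maximin <- function(pool, n) {
  d <- as.matrix(dist(scale(pool)))
  chosen <- 1
  while (length(chosen) < n) {
    min_d <- apply(d[, chosen, drop = FALSE], 1, min)
    min_d[chosen] <- -Inf  # never pick the same point twice
    chosen <- c(chosen, unname(which.max(min_d)))
  }
  pool[chosen, ]
}

subsample <- greedy_maximin(pool, 10)
```

The result is a well-spread subset of the pool, which is the purpose of the maximin step.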

If the method is 'importance', importance sampling is used. Starting from a set of non-implausible (preferably space-filling) points, points are sampled from a distribution around the points, and included in the output based on a weighted measure gained from the mixture distribution of the initial points. The set plausible_set must be specified. If burn_in is TRUE, then a burn-in phase is used to determine the optimal parameters for the proposal distribution.

Note that the plausible_set parameter size differs between the methods that use it. The optical set should be as large as possible in order to accurately represent the optical depth in each parameter direction; the set for importance sampling and slice sampling should be smaller (and probably smaller than the desired number of output points) in order to expedite the initial set-up of the sampling strategy.

For any sampling strategy, the parameters emulators, ranges and z must be specified.

If include_line is TRUE, then the boundaries of the space are explored as follows. The plausible set provided (or that generated by LHS with rejection) is used as a base set, and lines are chosen connecting points in the set. A number of points are sampled along these lines (extending beyond the given points) and are tested for non-implausibility. Any that lie on the edge of the non-implausible region are added to the set.
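A minimal sketch of the line-sampling idea, assuming a made-up implausibility (toy_imp, below the cutoff inside a unit disc) and a hypothetical helper line_edge_points. Points are laid along the line through two base points, extended a fixed distance past both, and a point is treated as lying on the edge if it is non-implausible while one of its neighbours on the line is not. The package's own edge test may differ.

```r
set.seed(1)

# Hypothetical implausibility: at most 3 inside the unit disc.
toy_imp <- function(p) 3 * sqrt(sum(p^2))
cutoff <- 3

# Base set of 20 non-implausible points (all lie well inside the disc).
base_set <- matrix(runif(40, -0.5, 0.5), ncol = 2)

line_edge_points <- function(p1, p2, n_steps = 201, extend = 3) {
  len <- sqrt(sum((p2 - p1)^2))
  u <- (p2 - p1) / len                      # unit direction from p1 to p2
  s <- seq(-extend, len + extend, length.out = n_steps)
  pts <- t(sapply(s, function(a) p1 + a * u))
  ok <- apply(pts, 1, toy_imp) <= cutoff
  # Edge points: acceptable, with an unacceptable neighbour along the line.
  neighbour_bad <- c(FALSE, !ok[-n_steps]) | c(!ok[-1], FALSE)
  pts[ok & neighbour_bad, , drop = FALSE]
}

# Draw five random pairs from the base set and collect the edge points.
pairs <- replicate(5, sample(nrow(base_set), 2))
edges <- do.call(rbind, lapply(seq_len(ncol(pairs)), function(k) {
  line_edge_points(base_set[pairs[1, k], ], base_set[pairs[2, k], ])
}))
```

Because the toy region is convex and each line is extended well past it, every line contributes two edge points, one at each crossing of the boundary.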

These methods will not necessarily work if the target space is very small, and they may miss parts of the target space if it is disconnected. For such target spaces, consider using the much more computationally intensive IDEMC.

A data.frame containing the set of new points to simulate at.

IDEMC for point generation in small target regions.

ranges <- list(aSI = c(0.1, 0.8), aIR = c(0, 0.5), aSR = c(0, 0.05))
ems <- emulator_from_data(GillespieSIR, output_names = c('nS', 'nI', 'nR'),
 ranges = ranges, quadratic = TRUE)
trained_ems <- purrr::map(seq_along(ems),
 ~ems[[.x]]$adjust(GillespieSIR, c('nS', 'nI', 'nR')[[.x]]))
targets <- list(
 list(val = 281, sigma = 10.43),
 list(val = 30, sigma = 11.16),
 list(val = 689, sigma = 14.32)
)
non_imp_points <- GillespieImplausibility[GillespieImplausibility$I <= 4, names(ranges)]

pts_default <- generate_new_runs(trained_ems, ranges, 10, targets, cutoff = 3)
pts_lhs <- generate_new_runs(trained_ems, ranges, 10, targets, cutoff = 3, method = 'lhs')
pts_slice <- generate_new_runs(trained_ems, ranges, 10, targets,
 method = 'slice', cutoff = 4, plausible_set = non_imp_points, include_line = FALSE)
pts_optical <- generate_new_runs(trained_ems, ranges, 10, targets,
 method = 'optical', cutoff = 4, plausible_set = non_imp_points, include_line = FALSE)
non_imp_sample <- non_imp_points[sample(seq_along(non_imp_points[,1]), 20),]
pts_importance <- generate_new_runs(trained_ems, ranges, 10, targets,
 method = 'importance', cutoff = 4, plausible_set = non_imp_sample, include_line = FALSE)