In berndbischl/mlrMBO: Bayesian Optimization and Model-Based Optimization of Expensive Black-Box Functions

In Depth Introduction

We will use mlrMBO to minimize the two dimensional branin function with three global minimas.

set.seed(1)
library(mlrMBO)
library(ggplot2)
objfun1 = makeBraninFunction()
autoplot(objfun1, render.levels = TRUE, show.optimum = TRUE)

The following steps are needed to start a surrogate-based optimization with our package. Each step ends with an R object, which is then passed to mbo(), i.e., to the working horse of mlrMBO.

define the objective function and its parameters by using the package smoof
generate an initial design
define a learner, i.e., the surrogate model
set up a MBO control object
finally start the optimization

Step 2 and 3 are optional as mbo() will use default settings if no value is supplied. This tutorial page will provide you with an in-depth introduction on how to set the mbo() parameters for different kind of optimizations.

Objective Function

The first argument of mbo() is the the objective function that has to be minimized. It has to be a smoof function from the package of the same name. The package brings some predefined functions that are quite popular for benchmarking.

objfun1 = makeBraninFunction()

Custom objective function

You can create your own smoof function easily using makeSingleObjectiveFunction (or makeMultiObjectiveFunction for multi-objective optimization) from the package smoof.

# just an example
makeSingleObjectiveFunction(
  name = "Sphere Function",
  fn = function(x) sum(x[1]^2 + sin(x[2])),
  par.set = makeNumericParamSet("x", len = 2L, lower = -3L, upper = 3L)
)

For more details see ?makeSingleObjectiveFunction.

Initial Design

The second argument of the mbo() function - design - is the initial design with default setting NULL.

An easy (and recommended) way to create an initial design is to use the generateDesign function from the ParamHelpers package. If no design is given (i.e. design = NULL) a Maximin Latin Hypercube lhs::maximinLHS design is used with n = 4 * getNumberOfParameters(objfun1) points. Other possibilities to generate designs are for example generateGridDesign and generateRandomDesign.

Note: If special designs are desired (e.g., orthogonal designs), they can be given as a data.frame but you have to be aware that the output resembles the output of generateDesign.

For our objective function objfun1 we create a slightly larger number of initial points than the default suggests.

n = 5 * getNumberOfParameters(objfun1)
design1 = generateDesign(n = n, par.set = getParamSet(objfun1), fun = lhs::randomLHS)

If the design does not include the results of the objective function mbo will automatically calculate them in advance. Sometimes it makes sense to precalculate the results because you might want to reuse the design. In our case it is quite easy:

design1$y = apply(design1, 1, objfun1)

Surrogate Model

In our example we use Kiriging as a surrogate because it is the most common for numerical model-based optimization and has prooven to be quite effective. We use mlr to generate the Kriging regression learner from the package DiceKriging.

surr.km = makeLearner("regr.km", predict.type = "se", covtype = "matern3_2", control = list(trace = FALSE))

In fact you can use any other regression learner from mlr as a surrogate. Depending on the infill criterion we will set later it is important that the learner can predict the uncertainty (defined by predict.type = "se") alongside the mean prediction. Another popular surrogate is the random forest. It's use is explained in the section for mixed space optimization.

MBOControl

The MBOControl object controls the optimization process and is created with makeMBOControl. General control arguments can be set when creating it.

n.objectives: The number of objectives. 1 implies normal single criteria optimization and is covered in this page. For values >1 see multi-objective optimization.
propose.points: The number of evaluated points in each iteration. The default is 1 and refers to the standard SMBO process. Higher values suggest that you want to do parallelization.
final.method: Defines how the final solution is proposed from the finished optimization process.

All parameters are documented in ?makeMBOControl.

MBOControlInfill

With setMBOControlInfill you can change the deault infill criterion settings set in the MBOControl object. It is highly recomended to adjust the settings to suit your optimization problem and the surrogate model.

Argument crit

One of the most important questions is to define how the next design points in the sequential loop are chosen. 5 different infill criteria can be set via the crit argument in setMBOControlInfill:

mean: mean response of the surrogate model
ei: expected improvement of the surrogate model, which is the recomended setting if you use a Kriging surrogate.
aei: augmented expected improvement, which is especially useful for noisy functions
eqi: expected quantile improvement
cb: confidence bound, which is the additive combination of mean response and mean standard error estimation of the surrogate model (response - lambda * standard.error)

Here you can also further configure the infill criterion (e.g. crit.cb.lambds for the lambda parameter if crit = cb).

Note: When using Kriging as a surrogate, numerical problems can occur if training points are too close to each other. To circumvent this problem you can set filter.proposed.points to TRUE. Then points closer then the value of filter.proposed.points.tol to an already evaluated point will be replaced with a random point.

Argument opt

The key idea behind model-based optimization is to substitute the expensive optimization on the black-box with optimization on the surrogate as this is deemed to be cheaper. To optimize the infill criterion on the surrogate we also need an optimizer. The optimum of the infill criterion function gives us the next point that Which one to use can be defined with the opt argument. The possibilities are:

focussearch: A Latin Hypercube design of size opt.focussearch.points (default 10000) is sampled in the parameter space (by randomLHS) and the design point with the best prediction of the infill criterion is determined. Then, the parameter space is shrunk around this point. This step is repeated opt.focussearch.maxit (default 5) times and the best observed point is passed back.
cmaes: The optimal point is found with a covariance matrix adapting evolutionary strategy from the cmaes package. If the strategy fails, a random point is generated and a warning is given. Further control arguments can be provided in opt.cmaes.control as a list.
ea: Use an evolutionary multiobjective optimization algorithm from the package emoa to determine the best point. The population size mu can be set by opt.ea.mu (default value is 10). (mu+1) means that in each population only one child is generated using crossover und mutation operators. The parameters eta and p of the latter two operators can be adjusted via the attributes opt.ea.sbx.eta, opt.ea.sbx.p,opt.ea.pm.eta and opt.ea.pm.p. The default number of EA iterations is 500 and can be changed by opt.ea.maxit attribute.
nsga2: Use the non-dominated sorting genetic algorithm from the package nsga2R to determine the best point. This algorithm should be used for multi objective optimization.

As all four infill optimization strategies do not guarantee to find the global optimum, users can set the number of restarts by the opt.restarts argument (default value is 1). After conducting the desired number of restarts the point with the best infill criterion value is passed back to the MBO loop.

Note: Only the focussearch optimizer is suitable for for categorical parameters in the parameter set.

setMBOControlTermination

With this control function different criteria to stop the fitting process can be specified. You can set multiple different criteria and the first one that is met will terminate the optimization process. You can set:

iters: The maximum number of iterations
time.budget: A maximum running time in seconds
target.fun.value: A treshold for function evaluation (stop if a evaluation is better than a given value)
max.evals: The maximum number of function evaluations

Note: You can also easily create your own stopping condition(s).

setMBOControlMultiPoint

This extends a MBO control object with options for multi-point proposal. Multi-point proposal means, that multiple points are proposed and evaluated, which is especially useful for parallel batch evaluation. For a detailed introduction, check the multi-point tutorial.

Argument: method

Define the method used for multi-point proposals, currently 3 different methods are supported:

cb: Proposes multiple points by optimizing the confidence bound criterion propose.points times with different lambda values. Generally this works the same way as for the single point case, i.e. specify infill.opt. The lambdas are drawn from an exp(1)-distribution.
multicrit: Use a evolutionary multicriteria optimization. This is a (mu+1) type evolutionary algorithm and runs for multicrit.maxit generations. The population size is set to propose.points.
cl: Proposes points by the constant liar strategy, which only makes sense if the confidence bound criterion is used as an infill criterion. In the first step the surrugate model is fitted based on the real data and the best point is calculated accordingly. Then, the function value of the best point is simply guessed by the worst seen function evaluation. This "lie"" is used to update the model in order to propose subsequent point. The procedure is applied until the number of points has reached propose.points.

setMBOControlMultiCrit

This adds multi-criteria optimization specific options to the control object. For details see the tutorial page on multi-criteria optimization.

note: The list of all attributes is provided in the software documentation.

Experiments and Output

Now we will apply the mbo() function to optimize the two objective functions.

control1 = makeMBOControl()
control1 = setMBOControlInfill(
  control = control1,
  crit = "ei"
)
control1 = setMBOControlTermination(
  control = control1,
  iters = 10
)

Optimization of objfun1

mbo(objfun1, design = design1, learner = surr.km, control = control1, show.info = FALSE)

The default output of mbo contains the best found parameter set and the optimzation path. The MBOResult object contains additional information, most importantly:

x: The best point of the parameter space
y: The associated best value of the objective function
opt.path: The optimization path. See ParamHelpers::OptPath for further information.
models: Depending on store.model.at in the MBOControl object, this contains zero, one or multiple surrogate models (default is to save the model generated after the last iteration).
...

We can also change some arguments of the MBOControl object and run mbo() again:

control1 = setMBOControlInfill(control1, crit = "cb")
control1 = setMBOControlTermination(control1, iters = 5L)
mbo(objfun1, design = design1, learner = surr.km, control = control1, show.info = FALSE)

Finally, if a learner, which does not support the se prediction type, should be applied for the optimization with the ei infill criterion, it is possible to create a bagging model. For details on how to do it take a look at the bagging section in the mlr tutorial.

berndbischl/mlrMBO documentation built on Oct. 11, 2022, 1:44 p.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

berndbischl/mlrMBO
Bayesian Optimization and Model-Based Optimization of Expensive Black-Box Functions

In berndbischl/mlrMBO: Bayesian Optimization and Model-Based Optimization of Expensive Black-Box Functions

In Depth Introduction

Objective Function

Custom objective function

Initial Design

Surrogate Model

MBOControl

MBOControlInfill

Argument crit

Argument opt

setMBOControlTermination

setMBOControlMultiPoint

Argument: method

setMBOControlMultiCrit

Experiments and Output

Optimization of objfun1

R Package Documentation

Browse R Packages

We want your feedback!

berndbischl/mlrMBO Bayesian Optimization and Model-Based Optimization of Expensive Black-Box Functions

In berndbischl/mlrMBO: Bayesian Optimization and Model-Based Optimization of Expensive Black-Box Functions

In Depth Introduction

Objective Function

Custom objective function

Initial Design

Surrogate Model

MBOControl

MBOControlInfill

Argument crit

Argument opt

setMBOControlTermination

setMBOControlMultiPoint

Argument: method

setMBOControlMultiCrit

Experiments and Output

Optimization of objfun1

R Package Documentation

Browse R Packages

We want your feedback!

berndbischl/mlrMBO
Bayesian Optimization and Model-Based Optimization of Expensive Black-Box Functions