optimize.bayesian: Bayesian Optimization


Description

optimize.bayesian performs Bayesian optimization of a given loss function over a given parameter set. It is a wrapper around the mlrMBO optimizer for quick optimization.

Usage

optimize.bayesian(loss_func, param_set, seed = 1, maximize = FALSE,
  absurd_value = ifelse(maximize == TRUE, -Inf, Inf), exp_design = NULL,
  initialization = nrow(exp_design), max_evaluations = 50,
  time_budget = NULL, verbose = TRUE)

Arguments

loss_func

Type: function. The loss function to optimize. Takes as input a named list of parameter values.
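For illustration, a compatible loss function could look like the following sketch; the parameter name x1 is hypothetical and must match an id declared in param_set:

# Sketch of a loss function: receives a named list of parameter values
# and returns a single numeric score (here, a quadratic minimized at x1 = 3)
my_loss <- function(x) {
  (x$x1 - 3)^2
}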

param_set

Type: ParamHelpers::makeParamSet. The parameters to optimize. Check out ParamHelpers::makeParamSet for more information.
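As a minimal sketch (the id x1 is illustrative), a parameter set can be built like this:

# One numeric parameter searched over [-10, 10]
my_param_set <- ParamHelpers::makeParamSet(
  ParamHelpers::makeNumericParam(id = "x1", lower = -10, upper = 10)
)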

seed

Type: numeric. Seed for random number generation. Defaults to 1.

maximize

Type: logical. Whether to minimize (FALSE) or maximize (TRUE) the loss function. Defaults to FALSE.

absurd_value

Type: numeric. A fallback value returned when the loss function errors or yields an invalid value; it should be the opposite of the best possible loss value (i.e., the worst possible score). Defaults to ifelse(maximize == TRUE, -Inf, Inf).
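If your loss function can fail, one option is to catch errors yourself and return the same absurd value; a sketch for the minimization case:

# Sketch: guard a loss function so errors and non-finite results
# return Inf, the default absurd_value when minimizing
safe_loss <- function(x) {
  out <- tryCatch(my_loss(x), error = function(e) Inf)
  if (!is.finite(out)) Inf else out
}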

exp_design

Type: matrix. An optional experimental design used to initialize the optimizer. Defaults to NULL.
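A reproducible initial design can be drawn from the parameter set with ParamHelpers; note that generateDesign returns a data.frame, so this sketch assumes exp_design accepts that format (my_param_set is the illustrative set sketched above under param_set):

# Sketch: a 10-point design over the parameter set
# (generateDesign uses a Latin hypercube by default, via the lhs package)
set.seed(1)
my_design <- ParamHelpers::generateDesign(n = 10, par.set = my_param_set)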

initialization

Type: numeric. The number of loss_func evaluations used to provide starting points for the Bayesian optimizer. Defaults to nrow(exp_design).

max_evaluations

Type: numeric. The number of times the loss function can be evaluated, including the initialization part. Defaults to 50.

time_budget

Type: numeric. The maximum total time to spend evaluating the loss function. The optimization returns its results to the user when the time budget is exhausted or the maximum number of evaluations is reached. Defaults to NULL.

verbose

Type: logical. Whether to print debug messages. Defaults to TRUE.

Details

Please see vignette for demos: vignette("optimize.bayesian", package = "Laurae2") or help_me("optimize.bayesian").

Examples

## Not run: 
library(xgboost)
library(mlrMBO)

# Load demo data
data(EuStockMarkets)

# Transform dataset to "quantiles"
for (i in 1:4) {
  EuStockMarkets[, i] <- (ecdf(EuStockMarkets[, i]))(EuStockMarkets[, i])
}

# Create datasets: 1500 observations for training, 360 for testing
# Features are:
# -- Deutscher Aktienindex (DAX),
# -- Swiss Market Index (SMI),
# -- and Cotation Assistee en Continu (CAC)
# Label is Financial Times Stock Exchange 100 Index (FTSE)
dtrain <- xgb.DMatrix(EuStockMarkets[1:1500, 1:3], label = EuStockMarkets[1:1500, 4])
dval <- xgb.DMatrix(EuStockMarkets[1501:1860, 1:3], label = EuStockMarkets[1501:1860, 4])

# Create watchlist for monitoring metric
watchlist <- list(train = dtrain, eval = dval)

# Our loss function to optimize: minimize RMSE
xgboost_optimization <- function(x) {

  # Train the model
  gc(verbose = FALSE)
  set.seed(1)
  model <- xgb.train(params = list(max_depth = x$max_depth,
                                   subsample = x$subsample,
                                   tree_method = x$tree_method,
                                   eta = 0.2,
                                   nthread = 1,
                                   objective = "reg:linear",
                                   eval_metric = "rmse"),
                     data = dtrain, # Warn: Access using parent environment
                     nrounds = 9999999,
                     watchlist = watchlist, # Warn: Access using parent environment
                     early_stopping_rounds = 5,
                     verbose = 0)
  score <- model$best_score
  rm(model)
  return(score)

}

# The parameters: max_depth in [1, 15], subsample in [0.1, 1], and tree_method in {exact, hist}
my_parameters <- makeParamSet(
  makeIntegerParam(id = "max_depth", lower = 1, upper = 15),
  makeNumericParam(id = "subsample", lower = 0.1, upper = 1),
  makeDiscreteParam(id = "tree_method", values = c("exact", "hist"))
)

# Perform optimization
optimization <- optimize.bayesian(loss_func = xgboost_optimization,
                                  param_set = my_parameters,
                                  seed = 1,
                                  maximize = FALSE,
                                  initialization = 10,
                                  max_evaluations = 25,
                                  time_budget = 30,
                                  verbose = TRUE)

## End(Not run)
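
Assuming optimize.bayesian returns the underlying mlrMBO result object (an assumption, not verified against the wrapper's actual return value), the best parameters and score could then be inspected as follows:

# Sketch: inspect the result, assuming an mlrMBO-style result object
print(optimization$x)  # best hyperparameter combination found
print(optimization$y)  # best (minimized) RMSE found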
