optimize.bayesian: Bayesian Optimization


Description

optimize.bayesian performs Bayesian optimization of a given loss function over a given parameter set. It is a wrapper around the mlrMBO optimizer for quick optimization.

Usage

optimize.bayesian(loss_func, param_set, seed = 1, maximize = FALSE,
  absurd_value = ifelse(maximize == TRUE, -Inf, Inf), exp_design = NULL,
  initialization = nrow(exp_design), max_evaluations = 50,
  time_budget = NULL, verbose = TRUE)

Arguments

loss_func

Type: function. The loss function to optimize. Takes as input a named list of parameter values.
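For illustration, a compatible loss function could look like the following sketch; the parameter name x1 is hypothetical and must match an id declared in param_set:

# Sketch of a loss function: receives a named list of parameter values
# and returns a single numeric score (here, a quadratic minimized at x1 = 3)
my_loss <- function(x) {
  (x$x1 - 3)^2
}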

param_set

Type: ParamHelpers::makeParamSet. The parameters to optimize. Check out ParamHelpers::makeParamSet for more information.
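As a minimal sketch (the id x1 is illustrative), a parameter set can be built like this:

# One numeric parameter searched over [-10, 10]
my_param_set <- ParamHelpers::makeParamSet(
  ParamHelpers::makeNumericParam(id = "x1", lower = -10, upper = 10)
)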

seed

Type: numeric. Seed for random number generation. Defaults to 1.

maximize

Type: logical. Whether to minimize (FALSE) or maximize (TRUE) the loss function. Defaults to FALSE.

absurd_value

Type: numeric. A fallback value returned when the loss function errors or yields an invalid value; it should be the opposite of the best possible loss value (i.e., the worst possible score). Defaults to ifelse(maximize == TRUE, -Inf, Inf).
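If your loss function can fail, one option is to catch errors yourself and return the same absurd value; a sketch for the minimization case:

# Sketch: guard a loss function so errors and non-finite results
# return Inf, the default absurd_value when minimizing
safe_loss <- function(x) {
  out <- tryCatch(my_loss(x), error = function(e) Inf)
  if (!is.finite(out)) Inf else out
}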

exp_design

Type: matrix. An optional experimental design used to initialize the optimizer. Defaults to NULL.
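A reproducible initial design can be drawn from the parameter set with ParamHelpers; note that generateDesign returns a data.frame, so this sketch assumes exp_design accepts that format (my_param_set is the illustrative set sketched above under param_set):

# Sketch: a 10-point design over the parameter set
# (generateDesign uses a Latin hypercube by default, via the lhs package)
set.seed(1)
my_design <- ParamHelpers::generateDesign(n = 10, par.set = my_param_set)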

initialization

Type: numeric. The number of loss_func evaluations used to provide starting points for the Bayesian optimizer. Defaults to nrow(exp_design).

max_evaluations

Type: numeric. The number of times the loss function can be evaluated, including the initialization part. Defaults to 50.

time_budget

Type: numeric. The maximum total time to spend evaluating the loss function. The optimization returns its results to the user when the time budget is exhausted or the maximum number of evaluations is reached. Defaults to NULL.

verbose

Type: logical. Whether to print debug messages. Defaults to TRUE.

Details

Please see vignette for demos: vignette("optimize.bayesian", package = "Laurae2") or help_me("optimize.bayesian").

Examples

## Not run: 
library(xgboost)
library(mlrMBO)

# Load demo data
data(EuStockMarkets)

# Transform dataset to "quantiles"
for (i in 1:4) {
  EuStockMarkets[, i] <- (ecdf(EuStockMarkets[, i]))(EuStockMarkets[, i])
}

# Create datasets: 1500 observations for training, 360 for testing
# Features are:
# -- Deutscher Aktienindex (DAX),
# -- Swiss Market Index (SMI),
# -- and Cotation Assistee en Continu (CAC)
# Label is Financial Times Stock Exchange 100 Index (FTSE)
dtrain <- xgb.DMatrix(EuStockMarkets[1:1500, 1:3], label = EuStockMarkets[1:1500, 4])
dval <- xgb.DMatrix(EuStockMarkets[1501:1860, 1:3], label = EuStockMarkets[1501:1860, 4])

# Create watchlist for monitoring metric
watchlist <- list(train = dtrain, eval = dval)

# Our loss function to optimize: minimize RMSE
xgboost_optimization <- function(x) {

  # Train the model
  gc(verbose = FALSE)
  set.seed(1)
  model <- xgb.train(params = list(max_depth = x$max_depth,
                                   subsample = x$subsample,
                                   tree_method = x$tree_method,
                                   eta = 0.2,
                                   nthread = 1,
                                   objective = "reg:linear",
                                   eval_metric = "rmse"),
                     data = dtrain, # Warn: Access using parent environment
                     nrounds = 9999999,
                     watchlist = watchlist, # Warn: Access using parent environment
                     early_stopping_rounds = 5,
                     verbose = 0)
  score <- model$best_score
  rm(model)
  return(score)

}

# The parameters: max_depth in [1, 15], subsample in [0.1, 1], and tree_method in {exact, hist}
my_parameters <- makeParamSet(
  makeIntegerParam(id = "max_depth", lower = 1, upper = 15),
  makeNumericParam(id = "subsample", lower = 0.1, upper = 1),
  makeDiscreteParam(id = "tree_method", values = c("exact", "hist"))
)

# Perform optimization
optimization <- optimize.bayesian(loss_func = xgboost_optimization,
                                  param_set = my_parameters,
                                  seed = 1,
                                  maximize = FALSE,
                                  initialization = 10,
                                  max_evaluations = 25,
                                  time_budget = 30,
                                  verbose = TRUE)

## End(Not run)
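
Assuming optimize.bayesian returns the underlying mlrMBO result object (an assumption, not verified against the wrapper's actual return value), the best parameters and score could then be inspected as follows:

# Sketch: inspect the result, assuming an mlrMBO-style result object
print(optimization$x)  # best hyperparameter combination found
print(optimization$y)  # best (minimized) RMSE found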
