rf_opt: Bayesian Optimization for Random Forest


Description

This function tunes the hyperparameters of a Random Forest using Bayesian optimization.

Usage

rf_opt(train_data, train_label, test_data, test_label, num_tree = 500L,
  mtry_range = c(1L, ncol(train_data) - 1), min_node_size_range = c(1L,
  as.integer(sqrt(nrow(train_data)))), init_points = 4, n_iter = 10,
  acq = "ei", kappa = 2.576, eps = 0, optkernel = list(type =
  "exponential", power = 2))

Arguments

train_data

A data frame used to train the Random Forest.

train_label

The column of the training data that holds the class labels to predict.

test_data

A data frame used to test the Random Forest.

test_label

The column of the test data that holds the class labels to predict.

num_tree

The number of trees in the forest. Defaults to 500. This is a single fixed value and is not optimized.

mtry_range

The range of mtry values to search. Defaults to 1 through the number of features, ncol(train_data) - 1.

min_node_size_range

The range of minimum node sizes to be tested. The default minimum is 1 and the default maximum is as.integer(sqrt(nrow(train_data))).
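
As a rough illustration of what the two default ranges evaluate to, the sketch below applies the expressions from the Usage signature to the built-in iris data frame. Using iris as a stand-in for train_data is an assumption made only for this example; it has four feature columns plus the Species label.

```r
# Default search ranges from the Usage signature, evaluated on the
# built-in iris data (a stand-in for train_data in this sketch).
train_data <- iris

# One column is the class label, so the upper bound is ncol - 1.
mtry_range <- c(1L, ncol(train_data) - 1)

# The upper bound is the integer part of the square root of the row count.
min_node_size_range <- c(1L, as.integer(sqrt(nrow(train_data))))

print(mtry_range)           # 1 4
print(min_node_size_range)  # 1 12
```

With a different training set the bounds change accordingly, so it may be worth printing them before starting a long optimization run.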

init_points

Number of randomly chosen points at which to sample the target function before Bayesian Optimization fits the Gaussian Process.

n_iter

Total number of times the Bayesian Optimization is to be repeated.

acq

Acquisition function type to be used. Can be "ucb", "ei" or "poi".

  • ucb GP Upper Confidence Bound

  • ei Expected Improvement

  • poi Probability of Improvement

kappa

Tunable parameter kappa of the GP Upper Confidence Bound, used to balance exploitation against exploration. Increasing kappa makes the search favor exploration.

eps

Tunable parameter epsilon of Expected Improvement and Probability of Improvement, used to balance exploitation against exploration. Increasing epsilon spreads the sampled hyperparameters more widely across the whole range.
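
To make the roles of kappa and eps concrete, here is a minimal base R sketch of the three acquisition functions in their common textbook forms for a maximization problem. These formulas are an assumption for illustration and are not taken from the MlBayesOpt source; the helper names (acq_ucb, acq_ei, acq_poi) and the candidate values are hypothetical.

```r
# Textbook acquisition functions for maximization (illustrative only).
# mu, sigma: GP posterior mean and standard deviation at a candidate point.
# y_best:    best objective value observed so far.
acq_ucb <- function(mu, sigma, kappa = 2.576) {
  mu + kappa * sigma
}

acq_ei <- function(mu, sigma, y_best, eps = 0) {
  z <- (mu - y_best - eps) / sigma
  (mu - y_best - eps) * pnorm(z) + sigma * dnorm(z)
}

acq_poi <- function(mu, sigma, y_best, eps = 0) {
  pnorm((mu - y_best - eps) / sigma)
}

# Hypothetical candidates: A has the higher mean, B the higher uncertainty.
mu     <- c(A = 0.90, B = 0.85)
sigma  <- c(A = 0.01, B = 0.10)
y_best <- 0.88

acq_ucb(mu, sigma, kappa = 0)         # no exploration bonus: A scores higher
acq_ucb(mu, sigma, kappa = 2.576)     # default kappa: B scores higher
acq_poi(mu, sigma, y_best, eps = 0)   # A scores higher
acq_poi(mu, sigma, y_best, eps = 0.1) # larger eps: B scores higher
acq_ei(mu, sigma, y_best)             # expected improvement for A and B
```

In this toy setting, raising kappa or eps shifts the preference from the candidate with the best predicted mean to the one with the most uncertainty, which is the exploration effect described above.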

optkernel

Kernel (aka correlation function) for the underlying Gaussian Process. This parameter should be a list that specifies the type of correlation function along with the smoothness parameter. Popular choices are squared exponential (default) or Matern 5/2.
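
The sketch below shows what the two kernel choices might look like as lists, along with simple base R versions of the correlation functions they name. The squared exponential list is the default from the Usage signature. The Matern list follows the GPfit package convention (type = "matern", nu = 5/2), which is assumed here rather than confirmed for rf_opt. The helper functions and the scale argument are hypothetical and only illustrate the shape of each kernel.

```r
# Default from the Usage signature: power-exponential kernel with power = 2,
# i.e. the squared exponential.
kernel_sqexp <- list(type = "exponential", power = 2)

# Assumed Matern 5/2 specification, following the GPfit convention.
kernel_matern <- list(type = "matern", nu = 5 / 2)

# Illustrative correlation as a function of the distance d between two
# points. `scale` is a hypothetical length-scale for this sketch; a real
# GP fit estimates its own scale parameters.
corr_sqexp <- function(d, scale = 1) {
  exp(-(d / scale)^2)
}

corr_matern52 <- function(d, scale = 1) {
  r <- sqrt(5) * abs(d) / scale
  (1 + r + r^2 / 3) * exp(-r)
}

# Compare how quickly correlation decays with distance under each kernel.
d <- c(0, 0.5, 1, 2)
round(cbind(d, sqexp = corr_sqexp(d), matern52 = corr_matern52(d)), 3)
```

Both functions equal 1 at distance zero and decay as points move apart. The Matern 5/2 kernel yields a less smooth Gaussian Process than the squared exponential, which can be a better match for rough objective surfaces.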

Value

Returns the test accuracy and a list of Bayesian Optimization results.

Examples

library(MlBayesOpt)

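# Fix the random seed so the randomly sampled initial points are reproducible.
set.seed(71)

# Small settings (10 trees, 10 initial points, 1 optimization iteration)
# keep the example fast.
res0 <- rf_opt(train_data = iris_train,
               train_label = Species,
               test_data = iris_test,
               test_label = Species,
               mtry_range = c(1L, ncol(iris_train) - 1),
               num_tree = 10L,
               init_points = 10,
               n_iter = 1)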

ymattu/MlBayesOpt documentation built on May 4, 2019, 5:31 p.m.