autoRLearn_: Advanced version of autoRLearn.

View source: R/autoRLearn_.R

Description

Tunes the hyperparameters of the desired algorithm/s using either hyperband or BOHB.

Usage

autoRLearn_(
  df_train,
  df_test,
  maxTime = 10,
  models = c("randomForest", "naiveBayes", "boosting", "l2-linear-classifier", "svm"),
  optimizationAlgorithm = "hyperband",
  bw = 3,
  kde_type = "single",
  max_iter = 81,
  metric = "acc"
)
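
A minimal sketch of a call, assuming the SmartML package is installed and exports autoRLearn_. The iris data, the 120/30 split, and the chosen argument values are illustrative only; the response column is renamed to "class" because the function expects that name.

```r
# Stand-in data: iris, with the response renamed to "class".
set.seed(1)
df <- iris
names(df)[names(df) == "Species"] <- "class"

# Simple random split into 120 training rows and 30 test rows.
idx <- sample(nrow(df), size = 120)
df_train <- df[idx, ]
df_test  <- df[-idx, ]

# Run only if SmartML is available; the search may take up to maxTime minutes.
if (requireNamespace("SmartML", quietly = TRUE)) {
  result <- SmartML::autoRLearn_(
    df_train, df_test,
    maxTime = 2,
    models = c("randomForest", "svm"),
    optimizationAlgorithm = "bohb",
    bw = 3, kde_type = "single",
    max_iter = 81, metric = "avg-fscore"
  )
  str(result, max.level = 1)  # inspect the top level of the returned list
}
```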

Arguments

df_train

Data frame containing the training dataset. It must already be preprocessed: all predictors numeric, and a factor response variable named "class".

df_test

Data frame containing the test dataset. It must already be preprocessed: all predictors numeric, and a factor response variable named "class".

maxTime

Numeric value giving the maximum time, in minutes, that the algorithm is allowed to run.

models

Character vector naming the algorithms to include in the search:

  • "randomForest" - Random forests using the randomForest package

  • "ranger" - Random forests using the ranger package (unstable)

  • "naiveBayes" - Naive Bayes using the fastNaiveBayes package

  • "boosting" - Gradient boosting using xgboost

  • "l2-linear-classifier" - Linear primal support vector machine from LibLinear

  • "svm" - RBF-kernel SVM from e1071

optimizationAlgorithm

String naming the hyperparameter tuning algorithm to use:

  • "hyperband" - Hyperband with uniformly initialized parameters

  • "bohb" - Hyperband combined with Bayesian optimization, as described in the 2018 BOHB paper by Falkner, Klein, and Hutter. Uses the extra parameters bw and kde_type

bw

(Only applies to BOHB.) Double giving the factor by which the KDE bandwidth is widened. Higher values let the algorithm explore more hyperparameter combinations.

kde_type

(Only applies to BOHB.) String specifying whether a model's hyperparameters are tuned independently of each other or have their probability densities multiplied:

  • "single" - each hyperparameter has its own expected improvement calculated

  • "mixed" - the probability densities of all hyperparameters are multiplied and a single mixed expected improvement is calculated
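
The difference between the two settings can be sketched in base R. This is an illustration under stated assumptions, not the package's implementation: it assumes expected improvement is ranked by the ratio of a KDE over well-performing configurations to a KDE over the rest (as in TPE-style methods), and that bw multiplies a rule-of-thumb bandwidth. The helper kde, the hyperparameters p1 and p2, and all data are hypothetical.

```r
# Gaussian KDE of `obs`, evaluated at `x`; the rule-of-thumb bandwidth is
# multiplied by `bw` (assumed to mirror the role of the bw argument).
kde <- function(x, obs, bw = 3) {
  h <- stats::bw.nrd0(obs) * bw
  vapply(x, function(xi) mean(stats::dnorm(xi, mean = obs, sd = h)), numeric(1))
}

set.seed(7)
# Two hypothetical hyperparameters; "good" and "bad" are past configurations
# split by their validation score.
good <- data.frame(p1 = rnorm(10, 0.2, 0.05), p2 = rnorm(10, 5, 1))
bad  <- data.frame(p1 = rnorm(30, 0.6, 0.20), p2 = rnorm(30, 9, 2))
cand <- data.frame(p1 = runif(50, 0, 1), p2 = runif(50, 0, 12))

# Density ratio good/bad for each hyperparameter (50 x 2 matrix).
ratio <- sapply(names(cand), function(p) {
  kde(cand[[p]], good[[p]]) / kde(cand[[p]], bad[[p]])
})

# "single": each hyperparameter is picked by its own ratio.
single_pick <- c(p1 = cand$p1[which.max(ratio[, "p1"])],
                 p2 = cand$p2[which.max(ratio[, "p2"])])

# "mixed": densities are multiplied across hyperparameters, giving one
# joint score per candidate configuration.
mixed_score <- ratio[, "p1"] * ratio[, "p2"]
mixed_pick  <- unlist(cand[which.max(mixed_score), ])
```

Under these assumptions, "single" may combine values taken from different candidate configurations, while "mixed" always returns one of the candidates as a whole.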

max_iter

(Affects both Hyperband and BOHB.) Integer giving the maximum number of iterations that a single successive halving run can have.
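
To show how max_iter shapes the search, the sketch below computes the bracket schedule of the standard Hyperband algorithm. The reduction factor eta = 3 and the function name hyperband_schedule are assumptions made for illustration; the package's internal schedule may differ.

```r
# Standard Hyperband schedule: for each bracket s, successive halving starts
# with n configurations and keeps roughly 1/eta of them at each rung while
# multiplying the per-configuration budget by eta.
hyperband_schedule <- function(max_iter = 81, eta = 3) {
  # Small epsilon guards against floating-point error in the logarithm.
  s_max <- floor(log(max_iter) / log(eta) + 1e-9)
  rows <- list()
  for (s in s_max:0) {
    n <- ceiling((s_max + 1) * eta^s / (s + 1))  # initial configurations
    r <- max_iter / eta^s                        # initial iterations each
    for (i in 0:s) {
      rows[[length(rows) + 1]] <- data.frame(
        bracket    = s,
        rung       = i,
        n_configs  = n %/% eta^i,
        iterations = r * eta^i
      )
    }
  }
  do.call(rbind, rows)
}

schedule <- hyperband_schedule(max_iter = 81, eta = 3)
print(schedule)
```

With max_iter = 81 and eta = 3, the most exploratory bracket starts 81 configurations at 1 iteration each, and every bracket ends with its surviving configurations trained for the full 81 iterations.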

metric

String of the evaluation metric to be used in the model performance optimization:

  • "acc" - Accuracy,

  • "avg-fscore" - Average of F-Score of each label,

  • "avg-recall" - Average of Recall of each label,

  • "avg-precision" - Average of Precision of each label,

  • "fscore" - Micro-Average of F-Score of each label,

  • "recall" - Micro-Average of Recall of each label,

  • "precision" - Micro-Average of Precision of each label.
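
The per-label averages and the micro-averages listed above can be sketched from a confusion matrix as follows. This uses the conventional definitions and is not taken from the package source; the function name classification_metrics and the toy labels are hypothetical.

```r
classification_metrics <- function(truth, pred) {
  pred <- factor(pred, levels = levels(truth))
  cm <- table(truth, pred)       # rows: true label, columns: predicted label
  tp <- diag(cm)
  fp <- colSums(cm) - tp
  fn <- rowSums(cm) - tp

  # Per-label scores; labels with no predictions or no instances score 0.
  precision <- ifelse(tp + fp > 0, tp / (tp + fp), 0)
  recall    <- ifelse(tp + fn > 0, tp / (tp + fn), 0)
  fscore    <- ifelse(precision + recall > 0,
                      2 * precision * recall / (precision + recall), 0)

  # Micro-averages pool the counts over all labels before dividing.
  micro_p <- sum(tp) / sum(tp + fp)
  micro_r <- sum(tp) / sum(tp + fn)
  micro_f <- if (micro_p + micro_r > 0) {
    2 * micro_p * micro_r / (micro_p + micro_r)
  } else 0

  c("acc"           = sum(tp) / sum(cm),
    "avg-fscore"    = mean(fscore),
    "avg-recall"    = mean(recall),
    "avg-precision" = mean(precision),
    "fscore"        = micro_f,
    "recall"        = micro_r,
    "precision"     = micro_p)
}

truth <- factor(c("a", "a", "a", "b", "b", "c"))
pred  <- c("a", "a", "b", "b", "c", "c")
m <- classification_metrics(truth, pred)
print(round(m, 3))
```

Under these definitions, for single-label classification the micro-averaged precision, recall, and F-score all coincide with accuracy, so the "avg-" variants are the ones that weight every label equally.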

Value

List of Results


DataSystemsGroupUT/SmartML documentation built on Nov. 24, 2020, 1:31 p.m.