
HpTuning

'HpTuning' is a program that performs hyperparameter tuning of Machine Learning (ML) classification algorithms using different optimization techniques. The project is built on top of the 'mlr' [01] package and works as a wrapper for the tuning techniques available in the literature.

Technical Requirements

The project is written in R; installing it from GitHub requires the 'devtools' package.

Setup

To install the current version of the project, run the following command inside your R session:

devtools::install_github("rgmantovani/HpTuning")

How does it work?

Given a 4-tuple of execution parameters <datafile, algo, tuning, epoch>, the program tunes the <algo> on <datafile> using the <tuning> technique. The <epoch> parameter specifies the seed of the repetition being executed: since most of the tuning techniques covered here are stochastic, comparing them requires running each one several times with different seeds.

Each execution (or single job) is saved in a different folder, organized by its input parameters. The program stores on disk: the final performance values reached by the tuned models; the predictions made by the tuned models; the hyperparameters returned by the optimization process; and the optimization path with all the candidate settings evaluated during the search.
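For intuition, the sketch below shows roughly what a single job does through mlr. It is an illustration only: the learner, the search space, the resampling setup and the budget are hypothetical, not the project's actual configuration.

library(mlr)

## task built from the input dataset (here, R's built-in iris data)
task <- makeClassifTask(data = iris, target = "Species")
lrn  <- makeLearner("classif.rpart")

## hypothetical search space over two rpart hyperparameters
ps <- makeParamSet(
  makeNumericParam("cp", lower = 0.001, upper = 0.1),
  makeIntegerParam("minsplit", lower = 2L, upper = 50L)
)

## random search [02] as the tuning technique; the seed plays the role of <epoch>
set.seed(1)
ctrl <- makeTuneControlRandom(maxit = 20L)
res  <- tuneParams(lrn, task = task, resampling = cv3,
                   par.set = ps, control = ctrl, measures = acc)

res$x         # hyperparameter setting returned by the optimization
res$y         # final performance value reached by the tuned model
res$opt.path  # optimization path with all evaluated candidates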

Available Options

There is no restriction on the datafile option: the code will run with any dataset you provide, as long as it is placed in the data sub-folder. Note: in every datafile, the target attribute must be the last column and must be labeled Class.
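For example, a compliant datafile could be generated from R's built-in iris data as follows (the CSV format and the file name are assumptions; adapt them to how your datasets are stored):

## rename the target attribute to "Class" and move it to the last position
df <- iris
names(df)[names(df) == "Species"] <- "Class"
df <- df[, c(setdiff(names(df), "Class"), "Class")]

## save it into the data sub-folder
dir.create("data", showWarnings = FALSE)
write.csv(df, file = "data/iris.csv", row.names = FALSE)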

The available options (in this current version) for the <tuning> parameter correspond to the optimization techniques listed in the References section: random search [02], sequential model-based (Bayesian) optimization [03], Irace [04], Particle Swarm Optimization [05], Genetic Algorithms [06], and Estimation of Distribution Algorithms [07]. The <algo> parameter accepts the classification learners available through the mlr package.

Running the code

To run the project, call the main script from the command line:

R CMD BATCH --no-save --no-restore '--args' --datafile=<datafile> --algo=<algo> --tuning=<tuning> \
  --epoch=<epoch> mainHP.R out_job.log &  

The script will run in the background and save its status to an output log file (out_job.log above). You can follow the execution and check for errors directly in this file, and you may rename the log file as you wish.
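For illustration, a call with concrete values might look like the following (iris, classif.rpart and random.search are placeholder values; use a datafile from your data sub-folder and options supported by your installation):

R CMD BATCH --no-save --no-restore '--args' --datafile=iris --algo=classif.rpart --tuning=random.search \
  --epoch=1 mainHP.R out_job.log &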

Contact

Rafael Gomes Mantovani (rgmantovani@gmail.com / rafaelmantovani@utfpr.edu.br) Universidade Tecnológica Federal do Paraná (UTFPR) - Apucarana, Brazil.

References

[01] B. Bischl, M. Lang, L. Kotthoff, J. Schiffner, J. Richter, Z. Jones, G. Casalicchio. mlr: Machine Learning in R. R package version 2.10. URL: https://github.com/mlr-org/mlr.

[02] J. Bergstra, Y. Bengio. Random search for hyper-parameter optimization. Journal of Machine Learning Research 13 (2012) 281–305.

[03] J. Snoek, H. Larochelle, R. P. Adams. Practical Bayesian optimization of machine learning algorithms. In: F. Pereira, C. Burges, L. Bottou, K. Weinberger (Eds.), Advances in Neural Information Processing Systems 25, Curran Associates, Inc., 2012, pp. 2951–2959.

[04] M. López-Ibáñez, J. Dubois-Lacoste, L. Pérez Cáceres, T. Stützle, M. Birattari. The irace package: Iterated Racing for Automatic Algorithm Configuration. Operations Research Perspectives, 2016.

[05] J. Kennedy, R. Eberhart. Particle swarm optimization. In: Proceedings of the IEEE International Conference on Neural Networks, Vol. 4, Perth, Australia, 1995, pp. 1942–1948.

[06] D. Goldberg. Genetic Algorithms in Search, Optimization and Machine Learning. Addison-Wesley, 1989.

[07] M. Hauschild, M. Pelikan. An introduction and survey of estimation of distribution algorithms. Swarm and Evolutionary Computation 1 (3) (2011) 111–128.


