
autoxgboost - Automatic tuning and fitting of xgboost.


General overview

autoxgboost aims to find an optimal xgboost model automatically using the machine learning framework mlr and the Bayesian optimization framework mlrMBO.

Work in progress!

Benchmark

|Name             | Factors| Numerics| Classes| Train instances| Test instances
|-----------------|--------|---------|--------|----------------|---------------
|Dexter           |  20 000|        0|       2|             420|            180
|GermanCredit     |      13|        7|       2|             700|            300
|Dorothea         | 100 000|        0|       2|             805|            345
|Yeast            |       0|        8|      10|           1 038|            446
|Amazon           |  10 000|        0|      49|           1 050|            450
|Secom            |       0|      591|       2|           1 096|            471
|Semeion          |     256|        0|      10|           1 115|            478
|Car              |       6|        0|       4|           1 209|            519
|Madelon          |     500|        0|       2|           1 820|            780
|KR-vs-KP         |      37|        0|       2|           2 237|            959
|Abalone          |       1|        7|      28|           2 923|          1 254
|Wine Quality     |       0|       11|      11|           3 425|          1 469
|Waveform         |       0|       40|       3|           3 500|          1 500
|Gisette          |   5 000|        0|       2|           4 900|          2 100
|Convex           |       0|      784|       2|           8 000|         50 000
|Rot. MNIST + BI  |       0|      784|      10|          12 000|         50 000

Datasets used for the comparison benchmark of autoxgboost, Auto-WEKA and auto-sklearn.

|Dataset          | baseline| autoxgboost| Auto-WEKA| auto-sklearn
|-----------------|---------|------------|----------|-------------
|Dexter           |    52.78|       12.22|      7.22|     **5.56**
|GermanCredit     |    32.67|       27.67|     28.33|    **27.00**
|Dorothea         |     6.09|    **5.22**|      6.38|         5.51
|Yeast            |    68.99|   **38.88**|     40.45|        40.67
|Amazon           |    99.33|       26.22|     37.56|    **16.00**
|Secom            | **7.87**|    **7.87**|  **7.87**|     **7.87**
|Semeion          |    92.45|        8.38|  **5.03**|         5.24
|Car              |    29.15|        1.16|      0.58|     **0.39**
|Madelon          |    50.26|       16.54|     21.15|    **12.44**
|KR-vs-KP         |    48.96|        1.67|  **0.31**|         0.42
|Abalone          |    84.04|       73.75| **73.02**|        73.50
|Wine Quality     |    55.68|   **33.70**| **33.70**|        33.76
|Waveform         |    68.80|       15.40| **14.40**|        14.93
|Gisette          |    50.71|        2.48|      2.24|     **1.62**
|Convex           |    50.00|       22.74|     22.05|    **17.53**
|Rot. MNIST + BI  |    88.88|       47.09|     55.84|    **46.92**

Benchmark results are the median percent error across 100 000 bootstrap samples (drawn from 25 runs), simulating 4 parallel runs. Bold numbers indicate the best-performing algorithms.
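
The bootstrap aggregation described above can be sketched roughly as follows. This is an illustrative Python sketch, not the benchmark code: it assumes that each bootstrap sample draws 4 of the 25 runs with replacement and keeps the test error of the run with the lowest validation error, a selection rule this README does not spell out. The function name `bootstrap_median_error` and the synthetic run data are hypothetical.

```python
import random
import statistics


def bootstrap_median_error(runs, n_parallel=4, n_samples=100_000, seed=0):
    """Median test error over bootstrap samples of simulated parallel runs.

    runs: list of (validation_error, test_error) pairs, one per run.
    """
    rng = random.Random(seed)
    selected_test_errors = []
    for _ in range(n_samples):
        # Draw n_parallel runs with replacement, as if they had run in parallel.
        sample = rng.choices(runs, k=n_parallel)
        # Assumption: the run with the lowest validation error is kept.
        best = min(sample, key=lambda run: run[0])
        selected_test_errors.append(best[1])
    return statistics.median(selected_test_errors)


# Synthetic placeholder data for 25 runs (not taken from the benchmark).
data_rng = random.Random(42)
runs = []
for _ in range(25):
    test_error = data_rng.uniform(10.0, 20.0)
    validation_error = test_error + data_rng.gauss(0.0, 1.0)
    runs.append((validation_error, test_error))

# Fewer samples than the 100 000 used in the benchmark, to keep the demo fast.
median_error = bootstrap_median_error(runs, n_samples=10_000)
print(f"median percent error: {median_error:.2f}")
```

Under this assumption, picking the best of 4 runs per sample rewards tools whose validation error tracks test error well, and the median over many samples dampens the effect of a single lucky or unlucky run.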

autoxgboost - How to Cite

The Automatic Gradient Boosting framework was presented at the ICML/IJCAI-ECAI 2018 AutoML Workshop (poster). Please cite our ICML AutoML workshop paper on arXiv. You can get citation info via citation("autoxgboost") or copy the following BibTeX entry:

@inproceedings{autoxgboost,
  title={Automatic Gradient Boosting},
  author={Thomas, Janek and Coors, Stefan and Bischl, Bernd},
  booktitle={International Workshop on Automatic Machine Learning at ICML},
  year={2018}
}

