xgb.grid: Hyper-parameter grid search for xgboost

Description

Performs a simple hyper-parameter grid search for xgboost. Models can be scored either on validation data or with V-fold cross-validation.

Usage

xgb.grid(param_grid, data, nrounds, nfold, label = NULL, missing = NA,
  prediction = FALSE, showsd = TRUE, metrics = list(), obj = NULL,
  feval = NULL, stratified = TRUE, folds = NULL, verbose = TRUE,
  early_stopping_rounds = NULL, maximize = NULL, callbacks = list(),
  search_criteria, seed = NULL, order_metric_name = NULL,
  validation_data = NULL, ...)

Arguments

param_grid

A named list whose names are xgboost parameter names and whose elements are vectors of candidate values for those hyper-parameters. The grid of hyper-parameter combinations used for model training is built internally by calling purrr::cross_d(param_grid).
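As a rough illustration of what such a grid looks like, the sketch below expands a small param_grid into one row per combination. It uses base R's expand.grid as a stand-in for purrr::cross_d, so the row order may differ from what xgb.grid produces internally; the parameter values are arbitrary examples.

```r
# A small hyper-parameter grid: names are xgboost parameters,
# values are the candidate settings to try (example values only).
param_grid <- list(
  eta = c(0.05, 0.1, 0.3),
  max_depth = c(3L, 6L),
  subsample = c(0.7, 1.0)
)

# Stand-in for purrr::cross_d(param_grid): one row per combination.
grid <- expand.grid(param_grid, stringsAsFactors = FALSE)

# Number of candidate models: 3 * 2 * 2 combinations.
n_models <- nrow(grid)
print(head(grid))
```

The number of rows grows multiplicatively with each added parameter, which is what makes the 'max_models' setting of search_criteria useful for larger grids.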

data

Same as in xgboost::xgb.train or xgboost::xgb.cv.

nrounds

Same as in xgboost::xgb.train or xgboost::xgb.cv.

nfold

Same as in xgboost::xgb.train or xgboost::xgb.cv.

label

Same as in xgboost::xgb.train or xgboost::xgb.cv.

missing

Same as in xgboost::xgb.train or xgboost::xgb.cv.

prediction

Same as in xgboost::xgb.train or xgboost::xgb.cv.

showsd

Same as in xgboost::xgb.train or xgboost::xgb.cv.

metrics

Same as in xgboost::xgb.train or xgboost::xgb.cv.

obj

Same as in xgboost::xgb.train or xgboost::xgb.cv.

feval

Same as in xgboost::xgb.train or xgboost::xgb.cv.

stratified

Same as in xgboost::xgb.train or xgboost::xgb.cv.

folds

Same as in xgboost::xgb.train or xgboost::xgb.cv.

verbose

Same as in xgboost::xgb.train or xgboost::xgb.cv.

early_stopping_rounds

Same as in xgboost::xgb.train or xgboost::xgb.cv.

maximize

Same as in xgboost::xgb.train or xgboost::xgb.cv.

callbacks

Same as in xgboost::xgb.train or xgboost::xgb.cv.

search_criteria

Defines how to search over the grid of hyper-parameters. This should be a list of parameters controlling the grid search. The currently supported parameters are 'strategy' and 'max_models'. The supported values for 'strategy' are 'Cartesian' (covers the entire space of hyper-parameter combinations) and 'RandomDiscrete' (searches the hyper-parameter combinations in random order). 'max_models' can be set to an integer > 0 that caps the number of models trained.
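The two strategies can be pictured with a minimal sketch in base R. The helper select_models is hypothetical and is not the package's internal code; it also assumes that 'max_models' truncates the grid under either strategy, which the description above does not confirm.

```r
# Hypothetical helper (not part of the package): pick which rows of a
# hyper-parameter grid would be trained under a given search_criteria.
select_models <- function(grid, search_criteria, seed = NULL) {
  idx <- seq_len(nrow(grid))
  if (identical(search_criteria$strategy, "RandomDiscrete")) {
    # Note: set.seed() changes the global RNG state.
    if (!is.null(seed)) set.seed(seed)
    idx <- sample.int(nrow(grid))  # random model order
  }
  # Assumption: max_models caps the number of models for any strategy.
  if (!is.null(search_criteria$max_models)) {
    idx <- head(idx, search_criteria$max_models)
  }
  grid[idx, , drop = FALSE]
}

# Example grid built with base R (example values only).
grid <- expand.grid(
  list(eta = c(0.05, 0.1, 0.3), max_depth = c(3L, 6L), subsample = c(0.7, 1.0)),
  stringsAsFactors = FALSE
)

full_search <- select_models(grid, list(strategy = "Cartesian"))
random_search <- select_models(
  grid,
  list(strategy = "RandomDiscrete", max_models = 5L),
  seed = 123
)
```

Here full_search keeps every combination in its original order, while random_search keeps five randomly chosen combinations; fixing seed makes that selection repeatable.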

seed

The seed used to determine the random model order in a random grid search.

order_metric_name

The name of the metric used to rank the final grid of model fits.

validation_data

Validation data used to score model performance while training with xgboost::xgb.train. Must be in the same format as data; see ?xgboost::xgb.train for additional information.

...

Other parameters passed on directly to either xgboost::xgb.train or xgboost::xgb.cv.

Value

The grid search results as a data.table, with the xgboost model fit objects stored in a list column named 'xgb_fit'. The output data.table also contains the hyper-parameters used for each fit and the model performance metrics assessed by xgboost. The rows are sorted by order_metric_name.

Author(s)

The code for using tidyverse syntax for model grid search is borrowed and adapted from: https://drsimonj.svbtle.com/grid-search-in-the-tidyverse. The search_criteria idea is borrowed from h2o::h2o.grid.


osofr/longGriDiSL documentation built on May 24, 2019, 4:56 p.m.