batch_random_search: Perform batch random search using a paramset.


Description

If your parameter search is likely to take a long time, this function lets you run it in batches, saving the search results to disk after each batch. This incurs a penalty in running time, because the assessment splits are recomputed (or 'baked', in 'recipes' terminology) at the beginning of each batch. The smaller the batch size, the bigger the penalty.

Usage

batch_random_search(resamples, recipe, param_set, n, scoring_func, ...,
  batch_size, out_folder = ".", file_prefix = "batch_", overwrite = FALSE,
  verbosity = TRUE)

Arguments

resamples

A data.frame with columns 'splits' and 'id', created using the 'rsample' package.

recipe

The recipe to use. See package 'recipes'.

param_set

Parameter set created by calling ParamHelpers::makeParamSet.

n

Number of parameter combinations to generate.

scoring_func

Your custom train/predict/score function. Must take as parameters:

  • a training dataframe

  • the name of the target variable in the training dataframe

  • a list of parameters (these are the hyperparameters we are tuning)

  • an evaluation dataframe

  • dots. These are additional non-tunable parameters that could be passed to the function.
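A scoring function with the five parameters listed above might look like the following minimal sketch. The function name, the argument names, the predictor column `x`, and the `degree` hyperparameter are all hypothetical and chosen only for illustration; the package may expect different names, so treat this as a shape to adapt rather than a reference implementation.

```r
# Hypothetical scoring function: all names here are illustrative assumptions.
# Arguments follow the order given in the list above:
# training data, target name, hyperparameters, evaluation data, dots.
rmse_scoring_func <- function(train_df, target_var, params, eval_df, ...) {
  # Train: polynomial regression on a single predictor "x";
  # the polynomial degree is the hyperparameter being tuned.
  form <- as.formula(sprintf("%s ~ poly(x, %d)", target_var, params$degree))
  fit <- lm(form, data = train_df)

  # Predict on the evaluation data.
  preds <- predict(fit, newdata = eval_df)

  # Score: root mean squared error, returned as a single numeric value.
  sqrt(mean((eval_df[[target_var]] - preds)^2))
}

# Stand-alone check on simulated data (no resampling involved).
set.seed(1)
df <- data.frame(x = runif(100))
df$y <- 2 * df$x + rnorm(100, sd = 0.1)
score <- rmse_scoring_func(df[1:80, ], "y", list(degree = 2L), df[81:100, ])
```

To report several metrics at once, the last expression could instead build a one-row data.frame with one column per metric.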

...

Optional parameters passed on to scoring_func.

batch_size

Size of the batches.

out_folder

Where to save the intermediate batch results. Folder will be created if not found.

file_prefix

Used to name the results files.

overwrite

If TRUE, overwrite existing results files; if FALSE, create new ones.

verbosity

Integer: level of verbosity, or TRUE/FALSE for max/min verbosity.
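To make the 'param_set' and 'n' arguments more concrete, the sketch below builds a small parameter set with 'ParamHelpers' and draws random combinations from it. The parameter names and ranges are hypothetical, and whether batch_random_search draws its candidates with generateRandomDesign internally is an assumption; the snippet only shows what a set of n random combinations can look like.

```r
library(ParamHelpers)

# Hypothetical search space: names and ranges are illustrative only.
param_set <- makeParamSet(
  makeIntegerParam("degree", lower = 1, upper = 4),
  makeNumericParam("penalty", lower = 0, upper = 1)
)

# Draw 10 random combinations, one row per candidate.
set.seed(42)
candidates <- generateRandomDesign(n = 10, par.set = param_set)
```

Each row of 'candidates' corresponds to one list of hyperparameters of the kind a scoring function would receive.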

Details

'scoring_func' can return a single score as a numeric vector, or multiple scores in a data.frame.

The output folder is scanned for files matching the pattern <file_prefix>_n.RDS. If overwrite is FALSE, the outputs of the current run are written to files starting at n + 1. Otherwise numbering starts at 1 (i.e. <file_prefix>_1.RDS).

When verbosity is enabled, the batch number is printed at the beginning of each batch.
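The file numbering described above can be pictured with the following base R sketch. The helper next_batch_index is hypothetical and not part of the package, and it assumes that n refers to the highest index already present in the folder; the actual implementation may differ.

```r
# Hypothetical helper, not part of the package: finds the index at which
# the next results file would be written.
next_batch_index <- function(out_folder, file_prefix, overwrite = FALSE) {
  # When overwriting, numbering always restarts at 1.
  if (overwrite) return(1L)

  # Keep only files named <file_prefix>_<n>.RDS.
  files <- list.files(out_folder, pattern = "\\.RDS$")
  stem <- paste0(file_prefix, "_")
  matches <- files[startsWith(files, stem)]

  # Extract <n>; names whose remainder is not a number become NA and are dropped.
  n <- suppressWarnings(
    as.integer(sub("\\.RDS$", "", substring(matches, nchar(stem) + 1)))
  )
  n <- n[!is.na(n)]

  # Assumption: continue after the highest existing index.
  if (length(n) == 0) 1L else max(n) + 1L
}

# Demonstration in a temporary folder holding two earlier results files.
tmp <- tempfile("batches_")
dir.create(tmp)
saveRDS(1, file.path(tmp, "run_1.RDS"))
saveRDS(2, file.path(tmp, "run_2.RDS"))
next_idx <- next_batch_index(tmp, "run")
next_idx_overwrite <- next_batch_index(tmp, "run", overwrite = TRUE)
```

With two existing files the helper continues at index 3, while overwrite = TRUE restarts at index 1.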

Value

A tidy data.frame containing the aggregate result across all batches. This is the same result as a search run without batches.


artichaud1/cook documentation built on May 21, 2019, 9:23 a.m.