find_model: Find fitting models and test them using given metrics on the...

View source: R/model_finder.R

Description

Find fitting models and test them using given metrics on the test dataset

Usage

find_model(train, test, response, models, metrics,
  parameter_sample_rate = 1, seed = 1, prepend_data_checker = T,
  on_missing_column = c("error", "add")[1],
  on_extra_column = c("remove", "error")[1],
  on_type_error = c("ignore", "error")[1], verbose = T,
  save_model = F, preprocess_pipes = list(function(train, test)
  return(list(train = train, test = train, .predict = function(data)
  return(data)))))

Arguments

train

The training dataset

test

The testing dataset

response

The response column as a string

models

A list of models. Each model should be a list, containing at least a training function .train and a .predict function, plus named vectors of parameters to explore.

The .train function has to take a data argument that stores the training data and a ... argument for the parameters. The .predict function needs to take two arguments, where the first is the model and the second the new dataset.

If each option for a parameter is a single value, the options can be stored in a vector. Otherwise, store the options in a list.

You can use model_trainer as a wrapper for this list. It will also test your inputs.
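
As a minimal sketch of the structure described above, the following builds one model entry by hand as a plain list and exercises it without datapiper. The dataset (mtcars), the hard-coded response in the formula, and the parameter name degree are illustrative assumptions; how find_model supplies the response to .train is not stated here, so the sketch fixes it inside the formula.

```r
# Hypothetical model entry: polynomial regression of mpg on wt.
poly_model <- list(
  # .train takes the training data plus one combination of parameters.
  .train = function(data, degree = 1, ...) {
    # Assumption: the response column is "mpg" and is hard-coded here.
    model_formula <- stats::as.formula(sprintf("mpg ~ poly(wt, %d)", degree))
    stats::lm(model_formula, data = data)
  },
  # .predict takes the fitted model first and the new dataset second.
  .predict = function(model, data) {
    unname(stats::predict(model, newdata = data))
  },
  # Each option is a single value, so a plain vector is enough.
  degree = 1:3
)

# Exercise the entry by hand on a simple train/test split.
train <- mtcars[1:24, ]
test <- mtcars[25:32, ]
fit <- poly_model$.train(data = train, degree = 2)
predictions <- poly_model$.predict(fit, test)
```

Wrapping such a list with model_trainer, as suggested above, would additionally validate the inputs.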

metrics

A list of metric functions to calculate from the response and the predictions, on both the training and the test dataset.
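
A metrics list could look like the sketch below. The argument order (labels first, predictions second) and the list names are assumptions made for illustration; this page does not state the signature find_model expects.

```r
# Hypothetical metric functions: each compares the true response with predictions.
rmse <- function(labels, predicted) {
  sqrt(mean((labels - predicted)^2))
}

mae <- function(labels, predicted) {
  mean(abs(labels - predicted))
}

# A named list, so each metric can be identified in the results.
metrics <- list(rmse = rmse, mae = mae)

# Evaluate every metric on a small example.
scores <- sapply(metrics, function(metric) metric(c(1, 2, 3), c(1, 2, 5)))
```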

parameter_sample_rate

Optional parameter. If set to a value in the range (0, 1], it will be used to sample the possible combinations of parameters.
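
To illustrate what sampling parameter combinations can mean, the sketch below expands two parameter vectors into a grid and keeps a random fraction of its rows. This is an assumed approach written in base R, not the package's implementation; the parameter names and the rounding rule are hypothetical.

```r
# Hypothetical parameter options for a single model.
parameters <- list(degree = 1:3, intercept = c(TRUE, FALSE))

# All combinations: 3 * 2 = 6 rows.
grid <- expand.grid(parameters, stringsAsFactors = FALSE)

# Keep a fraction of the combinations (assumption: round up, keep at least one).
parameter_sample_rate <- 0.5
set.seed(1)
n_keep <- max(1, ceiling(parameter_sample_rate * nrow(grid)))
sampled_grid <- grid[sample(nrow(grid), n_keep), , drop = FALSE]
```

With the default rate of 1, every combination in the grid would be kept.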

seed

Random seed to set each time before a model is trained

prepend_data_checker

Flag indicating if pipe_check should be prepended before all pipelines.

on_missing_column

See pipe_check for details.

on_extra_column

See pipe_check for details.

on_type_error

See pipe_check for details.

verbose

Flag indicating whether intermediate updates should be printed.

save_model

Flag indicating if the generated models should be saved. Defaults to FALSE.

preprocess_pipes

List of preprocessing pipelines generated using pipeline.
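
The default value in Usage suggests the shape of a pipe: a function of train and test that returns a list with train, test, and a .predict function for new data. The sketch below follows that shape with a hypothetical centering step on mtcars; it is inferred from the default only, and pipelines built with pipeline may differ.

```r
# Hypothetical pipe: center the wt column using the training mean.
center_wt <- function(train, test) {
  wt_mean <- mean(train$wt)

  # The same transformation is reused for train, test, and new data.
  transform_data <- function(data) {
    data$wt <- data$wt - wt_mean
    data
  }

  list(
    train = transform_data(train),
    test = transform_data(test),
    .predict = transform_data
  )
}

preprocess_pipes <- list(center_wt)

# Apply the pipe by hand to a simple train/test split.
result <- center_wt(mtcars[1:24, ], mtcars[25:32, ])
```

Computing the mean on the training data only keeps information from the test set out of the preprocessing step.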

Value

A data frame containing the training function, a list of the parameters used to train the model, and one column for each metric / dataset combination.


jeroenvdhoven/datapiper documentation built on July 14, 2019, 9:34 p.m.