| workflow_map | R Documentation |
workflow_map() will execute the same function across the workflows in the
set. The various tune_*() functions can be used as well as
tune::fit_resamples().
workflow_map(
object,
fn = "tune_grid",
verbose = FALSE,
seed = sample.int(10^4, 1),
...
)
object |
A workflow set. |
fn |
The name of the function to run, as a character. Acceptable values are:
"tune_grid",
"tune_bayes",
"fit_resamples",
"tune_race_anova",
"tune_race_win_loss", or
"tune_sim_anneal". Note that users need not
provide the namespace or parentheses in this argument,
e.g. provide |
verbose |
A logical for logging progress. |
seed |
A single integer that is set prior to each function execution. |
... |
Options to pass to the modeling function. See details below. |
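The effect of the seed argument can be illustrated with base R alone. This is a minimal sketch, not the package's internal code: run_once() is a hypothetical stand-in for a tuning function that consumes random numbers, and the loop mimics the seed being set before each execution.

```r
seed <- 1

# Hypothetical stand-in for a function that uses random numbers
run_once <- function() runif(3)

draws <- lapply(1:2, function(i) {
  set.seed(seed)  # reset the RNG state before every execution
  run_once()
})

# Both executions start from the same RNG state, so the draws match
identical(draws[[1]], draws[[2]])
```

Because each execution starts from the same state, reordering or dropping workflows should not change the random numbers seen by the remaining ones.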
When passing options, anything passed in the ... will be combined with any
values in the option column. The values in ... will override that
column's values and the new options are added to the options column.
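The override rule can be pictured with plain named lists. This is a sketch under the assumption that options behave like a named list; stored and passed are hypothetical names, and modifyList() only stands in for whatever workflow_map() does internally.

```r
# Hypothetical option values already stored for one workflow
stored <- list(grid = 10, metrics = "rmse")

# Hypothetical values supplied through ... in a call to workflow_map()
passed <- list(grid = 21, resamples = "time_val_split")

# Same-named entries are replaced by `passed`; new names are appended
combined <- modifyList(stored, passed)
str(combined)
```

Here grid takes the value given in passed, metrics is kept from stored, and resamples is added as a new entry.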
Any failure in execution causes the corresponding row of results to
contain a try-error object.
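Failed entries can be located by testing for the "try-error" class. The sketch below uses base R only; results is a hypothetical list standing in for the per-workflow results, not an object produced by workflow_map().

```r
# Hypothetical stand-in for per-workflow results: one success, one failure
results <- list(
  ok = try(log(10), silent = TRUE),
  failed = try(stop("model could not be fit"), silent = TRUE)
)

# TRUE for entries that hold a try-error object
is_failure <- vapply(results, inherits, logical(1), what = "try-error")
is_failure
```

The same inherits() check could be applied to each element of a workflow set's result column before collecting metrics.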
When a model that has no tuning parameters is mapped to one of the
tuning functions, tune::fit_resamples() is used instead, and a
warning is issued if verbose = TRUE.
If a workflow requires packages that are not installed, a message is printed
and workflow_map() continues with the next workflow (if any).
An updated workflow set. The option column will be updated with
any options for the tune package functions given to workflow_map(). Also,
the results will be added to the result column. If the computations for a
workflow fail, a try-error object will be saved in place of the results
(without stopping execution).
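The returned object can be inspected like any workflow set. A short sketch, assuming the workflowsets package is installed and using its pre-generated chi_features_res object; the id "plus_pca_lm" is the one that appears in the example code on this page.

```r
library(workflowsets)

# Workflow ids stored in the set
chi_features_res$wflow_id

# Pull the entry of the result column for a single workflow
pca_res <- extract_workflow_set_result(chi_features_res, id = "plus_pca_lm")
class(pca_res)
```

If a workflow had failed, the extracted object would be expected to carry the "try-error" class instead of tuning results.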
The package supplies two pre-generated workflow sets, two_class_set
and chi_features_set, and associated sets of model fits
two_class_res and chi_features_res.
The two_class_* objects are based on a binary classification problem
using the two_class_dat data from the modeldata package. The six
models utilize either a bare formula or a basic recipe utilizing
recipes::step_YeoJohnson() as a preprocessor, and a decision tree,
logistic regression, or MARS model specification. See ?two_class_set
for source code.
The chi_features_* objects are based on a regression problem using the
Chicago data from the modeldata package. Each of the three models
utilizes a linear regression model specification, paired with one of three
recipes of varying complexity. The objects are meant to approximate the
sequence of models built in Section 1.3 of Kuhn and Johnson (2019). See
?chi_features_set for source code.
workflow_set(), as_workflow_set(), extract_workflow_set_result()
library(workflowsets)
library(workflows)
library(modeldata)
library(recipes)
library(parsnip)
library(dplyr)
library(rsample)
library(tune)
library(yardstick)
library(dials)
# An example of processed results
chi_features_res
# Recreating them:
# ---------------------------------------------------------------------------
data(Chicago)
Chicago <- Chicago[1:1195, ]
time_val_split <-
  sliding_period(
    Chicago,
    date,
    "month",
    lookback = 38,
    assess_stop = 1
  )
# ---------------------------------------------------------------------------
base_recipe <-
  recipe(ridership ~ ., data = Chicago) |>
  # create date features
  step_date(date) |>
  step_holiday(date) |>
  # remove date from the list of predictors
  update_role(date, new_role = "id") |>
  # create dummy variables from factor columns
  step_dummy(all_nominal()) |>
  # remove any columns with a single unique value
  step_zv(all_predictors()) |>
  step_normalize(all_predictors())
date_only <-
  recipe(ridership ~ ., data = Chicago) |>
  # create date features
  step_date(date) |>
  update_role(date, new_role = "id") |>
  # create dummy variables from factor columns
  step_dummy(all_nominal()) |>
  # remove any columns with a single unique value
  step_zv(all_predictors())
date_and_holidays <-
  recipe(ridership ~ ., data = Chicago) |>
  # create date features
  step_date(date) |>
  step_holiday(date) |>
  # remove date from the list of predictors
  update_role(date, new_role = "id") |>
  # create dummy variables from factor columns
  step_dummy(all_nominal()) |>
  # remove any columns with a single unique value
  step_zv(all_predictors())
date_and_holidays_and_pca <-
  recipe(ridership ~ ., data = Chicago) |>
  # create date features
  step_date(date) |>
  step_holiday(date) |>
  # remove date from the list of predictors
  update_role(date, new_role = "id") |>
  # create dummy variables from factor columns
  step_dummy(all_nominal()) |>
  # remove any columns with a single unique value
  step_zv(all_predictors()) |>
  step_pca(!!stations, num_comp = tune())
# ---------------------------------------------------------------------------
lm_spec <- linear_reg() |> set_engine("lm")
# ---------------------------------------------------------------------------
pca_param <-
  parameters(num_comp()) |>
  update(num_comp = num_comp(c(0, 20)))
# ---------------------------------------------------------------------------
chi_features_set <-
  workflow_set(
    preproc = list(
      date = date_only,
      plus_holidays = date_and_holidays,
      plus_pca = date_and_holidays_and_pca
    ),
    models = list(lm = lm_spec),
    cross = TRUE
  )
# ---------------------------------------------------------------------------
chi_features_res_new <-
  chi_features_set |>
  option_add(param_info = pca_param, id = "plus_pca_lm") |>
  workflow_map(resamples = time_val_split, grid = 21, seed = 1, verbose = TRUE)
chi_features_res_new