workflow_map: Process a series of workflows
In workflowsets: Create a Collection of 'tidymodels' Workflows

Description

workflow_map() executes the same function on every workflow in the set. Any of the tune_*() functions can be used, as can tune::fit_resamples().

Usage

workflow_map(
  object,
  fn = "tune_grid",
  verbose = FALSE,
  seed = sample.int(10^4, 1),
  ...
)

Arguments

`object`	A workflow set.
`fn`	The name of the function to run, as a character. Acceptable values are: "tune_grid", "tune_bayes", "fit_resamples", "tune_race_anova", "tune_race_win_loss", or "tune_sim_anneal". Note that users need not provide the namespace or parentheses in this argument, e.g. provide `"tune_grid"` rather than `"tune::tune_grid"` or `"tune_grid()"`.
`verbose`	A logical for logging progress.
`seed`	A single integer that is set prior to each function execution.
`...`	Options to pass to the modeling function. See details below.
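
The effect of the `seed` argument can be pictured with a small base R sketch. It assumes only what the description above states, namely that the same integer is set before each function execution; `run_with_seed()` is a hypothetical helper written for this illustration and is not part of workflowsets.

```r
# Hypothetical helper: set the same seed before every function call,
# mirroring the documented behavior of the `seed` argument.
run_with_seed <- function(fns, seed) {
  lapply(fns, function(f) {
    set.seed(seed)  # reset the random number stream before each execution
    f()
  })
}

# Two functions that both draw random numbers
draws <- run_with_seed(
  list(function() runif(3), function() runif(3)),
  seed = 1
)

# Each function started from the same random number state
identical(draws[[1]], draws[[2]])
```

Because the default value of `seed` is itself a random draw, supplying an explicit integer (as the Examples section does with `seed = 1`) is the way to make repeated calls reproducible.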

Details

When passing options, anything passed in ... is combined with any values already in the option column. Values in ... override that column's values, and the combined options are stored back in the option column.
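
The override rule can be sketched in base R with `modifyList()`. This is only an illustration of the semantics described above, not the code workflowsets uses internally, and the option values shown are made up for the example.

```r
# Options already stored for a workflow (hypothetical values)
stored <- list(grid = 5, metrics = "rmse")

# Options supplied through `...` in a later call (hypothetical values)
dots <- list(grid = 21, resamples = "time_val_split")

# Same-named values in `dots` replace the stored ones;
# names that are new are appended.
combined <- modifyList(stored, dots)
str(combined)
```

Here `grid` ends up as 21, `metrics` is kept, and `resamples` is added.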

If execution fails for a workflow, the corresponding row of the result column contains a try-error object.
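
For readers unfamiliar with try-error objects, the base R sketch below shows how they arise and how to detect them with `inherits()`. The function `fit_one()` is a hypothetical stand-in for a model fit; the same `inherits(x, "try-error")` check could plausibly be applied to the elements of a workflow set's result column.

```r
# Hypothetical stand-in for a computation that fails on some inputs
fit_one <- function(x) {
  if (x < 0) stop("negative input")
  sqrt(x)
}

# try() captures the error instead of stopping the loop
results <- lapply(c(4, -1, 9), function(x) try(fit_one(x), silent = TRUE))

# Flag the elements that hold a captured error
failed <- vapply(results, inherits, logical(1), what = "try-error")
failed
```

Only the second element is flagged; the other two hold ordinary results.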

When a model with no tuning parameters is mapped to one of the tuning functions, tune::fit_resamples() is used instead, and a warning is issued if verbose = TRUE.
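
A minimal sketch of this fallback, assuming the workflowsets, workflows, parsnip, rsample, and tune packages are installed. The data set (`mtcars`), the three-fold resampling, and the object names are choices made for this illustration only.

```r
library(workflowsets)
library(parsnip)
library(rsample)
library(tune)

set.seed(1)
folds <- vfold_cv(mtcars, v = 3)

# A single workflow whose model has nothing to tune
wf_set <-
  workflow_set(
    preproc = list(basic = mpg ~ .),
    models = list(lm = linear_reg()),
    cross = TRUE
  )

# "tune_grid" is requested, but there are no tuning parameters,
# so the workflow is expected to be resampled instead
res <- workflow_map(wf_set, "tune_grid", resamples = folds, seed = 1, verbose = TRUE)

class(res$result[[1]])
```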

If a workflow requires packages that are not installed, a message is printed and workflow_map() continues with the next workflow (if any).

Value

An updated workflow set. The option column is updated with any options for the tune package functions given to workflow_map(), and the results are added to the result column. If the computations for a workflow fail, a try-error object is saved in place of the results (without stopping execution).

Note

The package supplies two pre-generated workflow sets, two_class_set and chi_features_set, and the associated sets of model fits, two_class_res and chi_features_res.

The two_class_* objects are based on a binary classification problem using the two_class_dat data from the modeldata package. The six models utilize either a bare formula or a basic recipe utilizing recipes::step_YeoJohnson() as a preprocessor, and a decision tree, logistic regression, or MARS model specification. See ?two_class_set for source code.

The chi_features_* objects are based on a regression problem using the Chicago data from the modeldata package. Each of the three models utilizes a linear regression model specification, with three different recipes of varying complexity. The objects are meant to approximate the sequence of models built in Section 1.3 of Kuhn and Johnson (2019). See ?chi_features_set for source code.
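
Because these results ship with the package, they can be inspected without re-running any resampling. The sketch below assumes workflowsets and tune are installed and uses the workflow ID "plus_pca_lm" that appears in the Examples section; the metric name "rmse" is assumed to be among those computed for this regression problem.

```r
library(workflowsets)
library(tune)

# Rank the three workflows by a performance metric
rank_results(chi_features_res, rank_metric = "rmse")

# Collect the resampled performance estimates for every workflow
collect_metrics(chi_features_res)

# Pull out the tuning results for a single workflow by its ID
pca_res <- extract_workflow_set_result(chi_features_res, id = "plus_pca_lm")
pca_res
```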

Examples


library(workflowsets)
library(workflows)
library(modeldata)
library(recipes)
library(parsnip)
library(dplyr)
library(rsample)
library(tune)
library(yardstick)
library(dials)

# An example of processed results
chi_features_res

# Recreating them:

# ---------------------------------------------------------------------------
data(Chicago)
Chicago <- Chicago[1:1195, ]

time_val_split <-
  sliding_period(
    Chicago,
    date,
    "month",
    lookback = 38,
    assess_stop = 1
  )

# ---------------------------------------------------------------------------

base_recipe <-
  recipe(ridership ~ ., data = Chicago) |>
  # create date features
  step_date(date) |>
  step_holiday(date) |>
  # remove date from the list of predictors
  update_role(date, new_role = "id") |>
  # create dummy variables from factor columns
  step_dummy(all_nominal()) |>
  # remove any columns with a single unique value
  step_zv(all_predictors()) |>
  step_normalize(all_predictors())

date_only <-
  recipe(ridership ~ ., data = Chicago) |>
  # create date features
  step_date(date) |>
  update_role(date, new_role = "id") |>
  # create dummy variables from factor columns
  step_dummy(all_nominal()) |>
  # remove any columns with a single unique value
  step_zv(all_predictors())

date_and_holidays <-
  recipe(ridership ~ ., data = Chicago) |>
  # create date features
  step_date(date) |>
  step_holiday(date) |>
  # remove date from the list of predictors
  update_role(date, new_role = "id") |>
  # create dummy variables from factor columns
  step_dummy(all_nominal()) |>
  # remove any columns with a single unique value
  step_zv(all_predictors())

date_and_holidays_and_pca <-
  recipe(ridership ~ ., data = Chicago) |>
  # create date features
  step_date(date) |>
  step_holiday(date) |>
  # remove date from the list of predictors
  update_role(date, new_role = "id") |>
  # create dummy variables from factor columns
  step_dummy(all_nominal()) |>
  # remove any columns with a single unique value
  step_zv(all_predictors()) |>
  # `stations` is a vector of station column names loaded by data(Chicago)
  step_pca(!!stations, num_comp = tune())

# ---------------------------------------------------------------------------

lm_spec <- linear_reg() |> set_engine("lm")

# ---------------------------------------------------------------------------

pca_param <-
  parameters(num_comp()) |>
  update(num_comp = num_comp(c(0, 20)))

# ---------------------------------------------------------------------------

chi_features_set <-
  workflow_set(
    preproc = list(
      date = date_only,
      plus_holidays = date_and_holidays,
      plus_pca = date_and_holidays_and_pca
    ),
    models = list(lm = lm_spec),
    cross = TRUE
  )

# ---------------------------------------------------------------------------

chi_features_res_new <-
  chi_features_set |>
  option_add(param_info = pca_param, id = "plus_pca_lm") |>
  workflow_map(resamples = time_val_split, grid = 21, seed = 1, verbose = TRUE)

chi_features_res_new

workflowsets documentation built on June 8, 2025, 10:12 a.m.