recipes_ptype: Prototype of recipe object

View source: R/ptype.R

recipes_ptype    R Documentation

Prototype of recipe object

Description

This helper function returns the prototype of the input data set expected by the recipe object.

Usage

recipes_ptype(x, ..., stage = "prep")

Arguments

x

A recipe object.

...

Currently not used.

stage

A single character string. Must be one of "prep" or "bake". See Details for more information. Defaults to "prep".

Details

The returned ptype is a zero-row tibble describing the data set that the recipe object expects. Which columns it contains depends on the stage.

At prep() time, when stage = "prep", the ptype is the data passed to recipe(). The following code chunk represents a possible recipe scenario. recipes_ptype(rec_spec, stage = "prep") and recipes_ptype(rec_prep, stage = "prep") both return a ptype tibble corresponding to data_ptype. This information is used internally in prep() to verify that data_training has the right columns with the right types.

rec_spec <- recipe(outcome ~ ., data = data_ptype) %>%
  step_normalize(all_numeric_predictors()) %>%
  step_dummy(all_nominal_predictors()) 

rec_prep <- prep(rec_spec, training = data_training)
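As a rough illustration of the kind of column check described above (this is not the actual implementation inside prep()), the prep-stage ptype can be compared by hand against a candidate training set. The data frames data_ptype and data_training and the helper missing_from() are hypothetical and exist only for this sketch.

```r
library(recipes)

# Hypothetical data sets, for illustration only
data_ptype <- data.frame(
  outcome = c(1.5, 2.5, 3.5),
  num = c(10, 20, 30),
  cat = factor(c("a", "b", "a"))
)
# "cat" is dropped on purpose to simulate a mismatching training set
data_training <- data_ptype[, c("outcome", "num")]

rec_spec <- recipe(outcome ~ ., data = data_ptype) %>%
  step_normalize(all_numeric_predictors()) %>%
  step_dummy(all_nominal_predictors())

ptype <- recipes_ptype(rec_spec, stage = "prep")

# Hypothetical helper (not part of recipes):
# names of ptype columns that are absent from `data`
missing_from <- function(ptype, data) {
  setdiff(names(ptype), names(data))
}

missing_from(ptype, data_training)
missing_from(ptype, data_ptype)
```

The first call should report the dropped column, while the second should report nothing, since data_ptype is the data the recipe was created from. Column types are not compared in this sketch.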

At bake() time, when stage = "bake", the ptype represents the data that are required for bake() to run.

data_bake <- bake(rec_prep, new_data = data_testing)

In practice, this means that unless otherwise specified, every column except outcomes and case weights is required. These requirements can be changed with update_role_requirements(), and recipes_ptype() respects those changes.
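One way to see which columns are needed only at prep() time is to compare the column names of the two ptypes. The following is a minimal sketch that assumes the built-in mtcars data set with mpg as the outcome; the variable names are illustrative.

```r
library(recipes)

rec <- recipe(mpg ~ ., data = mtcars)

ptype_prep <- recipes_ptype(rec, stage = "prep")
ptype_bake <- recipes_ptype(rec, stage = "bake")

# Columns expected by prep() that bake() does not require
prep_only <- setdiff(names(ptype_prep), names(ptype_bake))
prep_only
```

With default role requirements, prep_only should contain only the outcome column, in line with the rule described above.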

recipes_ptype() returns NULL on recipes created prior to version 1.1.0.
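Code that may receive recipes saved with an older version of the package can guard against the NULL return value. The wrapper below, ptype_or_stop(), is a hypothetical helper written for this sketch and is not part of recipes.

```r
library(recipes)

# Hypothetical helper (not part of recipes): return the ptype,
# or fail with an informative error when none is stored
ptype_or_stop <- function(rec, stage = "prep") {
  ptype <- recipes_ptype(rec, stage = stage)
  if (is.null(ptype)) {
    stop(
      "No ptype available; the recipe may have been created ",
      "with recipes < 1.1.0.",
      call. = FALSE
    )
  }
  ptype
}

# A recipe created with a current version of recipes has a ptype
rec <- recipe(mpg ~ ., data = mtcars)
ptype <- ptype_or_stop(rec, stage = "bake")
```

Whether to raise an error or fall back to another check is a design choice; an error is used here only to keep the sketch short.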

Note that the order of the columns isn't guaranteed to match that of data_ptype, because the data are ordered internally according to roles.

Value

A zero-row tibble, or NULL for recipes created prior to version 1.1.0.

See Also

developer_functions, recipes_ptype_validate

Examples

library(recipes)
library(tibble)

training <- tibble(
  y = 1:10,
  id = 1:10,
  x1 = letters[1:10],
  x2 = factor(letters[1:10]),
  cw = hardhat::importance_weights(1:10)
)
training

rec_spec <- recipe(y ~ ., data = training)

# outcomes and case_weights are not required at bake time
recipes_ptype(rec_spec, stage = "prep")
recipes_ptype(rec_spec, stage = "bake")

rec_spec <- recipe(y ~ ., data = training) %>%
  update_role(x1, new_role = "id")

# outcomes and case_weights are still not required at bake time
# the "id" column is assumed to be needed, since its role is non-standard
recipes_ptype(rec_spec, stage = "prep")
recipes_ptype(rec_spec, stage = "bake")

rec_spec <- recipe(y ~ ., data = training) %>%
  update_role(x1, new_role = "id") %>%
  update_role_requirements("id", bake = FALSE)

# update_role_requirements() is used to specify that "id" isn't needed
recipes_ptype(rec_spec, stage = "prep")
recipes_ptype(rec_spec, stage = "bake")


tidymodels/recipes documentation built on Nov. 29, 2024, 3:05 p.m.