bake | R Documentation |
For a recipe with at least one preprocessing operation that has been trained by prep(), apply the computations to new data.
bake(object, ...)
## S3 method for class 'recipe'
bake(object, new_data, ..., composition = "tibble")
object |
A trained object such as a |
... |
One or more selector functions to choose which variables will be
returned by the function. See |
new_data |
A data frame or tibble to which the preprocessing will be
applied. If |
composition |
Either "tibble", "matrix", "data.frame", or "dgCMatrix" for the format of the processed data set. Note that all computations during the baking process are done in a non-sparse format. Also, note that this argument should be supplied by name after any selectors, and the selectors should only resolve to numeric columns (otherwise an error is thrown). |
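As an illustration of the composition argument, here is a minimal sketch. It assumes the built-in mtcars data set and a single normalization step, neither of which comes from this page. The selector is given first and composition is passed by name after it.

```r
library(recipes)

# Hypothetical recipe for illustration: every column of mtcars is numeric,
# so all predictors can be returned as a matrix.
rec <- recipe(mpg ~ ., data = mtcars) %>%
  step_normalize(all_numeric_predictors()) %>%
  prep()

# Selectors come first; `composition` is passed by name after them.
x_mat <- bake(
  rec,
  new_data = head(mtcars),
  all_predictors(),
  composition = "matrix"
)

is.matrix(x_mat)
dim(x_mat)
```

Selecting only numeric columns matters here: per the argument description, a selector that resolves to a non-numeric column triggers an error.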
bake() takes a trained recipe and applies its operations to a data set to create a design matrix. If you are using a recipe as a preprocessor for modeling, we highly recommend that you use a workflow() instead of manually applying a recipe (see the example in recipe()).
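The workflow recommendation can be sketched as follows. This is only an illustration, assuming the parsnip and workflows packages are installed and using mtcars with a linear model, none of which is taken from this page. The recipe is left untrained, and the workflow takes care of both prep() and bake() during fitting and prediction.

```r
library(recipes)
library(parsnip)
library(workflows)

# An untrained recipe; the workflow trains it as part of fit().
rec <- recipe(mpg ~ ., data = mtcars) %>%
  step_normalize(all_numeric_predictors())

# Illustrative model choice: ordinary linear regression via lm().
lm_spec <- linear_reg() %>%
  set_engine("lm")

wf <- workflow() %>%
  add_recipe(rec) %>%
  add_model(lm_spec)

wf_fit <- fit(wf, data = mtcars)

# New data is preprocessed by the trained recipe before prediction.
preds <- predict(wf_fit, new_data = head(mtcars))
preds
```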
If the data set is not too large, time can be saved by using the retain = TRUE option of prep(). This stores the processed version of the training set. With this option set, bake(object, new_data = NULL) will return it without recomputing any steps.
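A small sketch of this behaviour, assuming the built-in mtcars data set and a single centering step (both chosen for illustration only): the retained training set and a fresh bake() of the same rows should agree on the processed values.

```r
library(recipes)

# Hypothetical recipe for illustration; retain = TRUE keeps the processed
# training set inside the trained recipe.
rec <- recipe(mpg ~ ., data = mtcars) %>%
  step_center(all_numeric_predictors()) %>%
  prep(retain = TRUE)

# Returns the stored, already processed training set.
train_processed <- bake(rec, new_data = NULL)

# Re-applies the trained steps to the same rows.
recomputed <- bake(rec, new_data = mtcars)

# Compare one processed column from each result.
all.equal(train_processed$disp, recomputed$disp)
```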
Also, any steps with skip = TRUE will not be applied to the data when bake() is invoked with a data set in new_data. bake(object, new_data = NULL) will always have all of the steps applied.
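The effect of skip = TRUE can be seen in a minimal sketch like the one below. It assumes the built-in mtcars data set and a log transform of the outcome, both chosen only for illustration.

```r
library(recipes)

# Hypothetical recipe: log the outcome during training only.
rec <- recipe(mpg ~ ., data = mtcars) %>%
  step_log(mpg, skip = TRUE) %>%
  prep(retain = TRUE)

# All steps are applied to the retained training set.
train_out <- bake(rec, new_data = NULL)

# The skipped step is not applied when a data set is supplied.
new_out <- bake(rec, new_data = mtcars)

# Compare each result against the logged and the raw outcome.
all.equal(train_out$mpg, log(mtcars$mpg))
all.equal(new_out$mpg, mtcars$mpg)
```

This is why the same rows can look different depending on whether they come from new_data = NULL or are passed in explicitly.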
A tibble, matrix, or sparse matrix that may have different columns than the original columns in new_data.
recipe(), prep()
library(recipes)
data(ames, package = "modeldata")
ames <- mutate(ames, Sale_Price = log10(Sale_Price))
# train the recipe on all but the first six rows
ames_rec <-
  recipe(Sale_Price ~ ., data = ames[-(1:6), ]) %>%
  step_other(Neighborhood, threshold = 0.05) %>%
  step_dummy(all_nominal()) %>%
  step_interact(~ starts_with("Central_Air"):Year_Built) %>%
  step_ns(Longitude, Latitude, deg_free = 2) %>%
  step_zv(all_predictors()) %>%
  prep()
# return the training set (already embedded in ames_rec)
bake(ames_rec, new_data = NULL)
# apply processing to other data:
bake(ames_rec, new_data = head(ames))
# only return selected variables:
bake(ames_rec, new_data = head(ames), all_numeric_predictors())
bake(ames_rec, new_data = head(ames), starts_with(c("Longitude", "Latitude")))