View source: R/impute_median.R
step_impute_median (R Documentation)
step_impute_median() creates a specification of a recipe step that will replace missing values of numeric variables with the training set median of those variables.
step_impute_median(
  recipe,
  ...,
  role = NA,
  trained = FALSE,
  medians = NULL,
  skip = FALSE,
  id = rand_id("impute_median")
)
recipe
A recipe object. The step will be added to the sequence of operations for this recipe.
...
One or more selector functions to choose variables for this step. See selections() for more details.
role
Not used by this step since no new variables are created.
trained
A logical to indicate if the quantities for preprocessing have been estimated.
medians
A named numeric vector of medians. This is NULL until computed by prep().
skip
A logical. Should the step be skipped when the recipe is baked by bake()?
id
A character string that is unique to this step to identify it.
step_impute_median estimates the variable medians from the data used in the training argument of prep.recipe. bake.recipe then applies the new values to new data sets using these medians.
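The two stages described above (estimate medians on the training data, then reuse them on new data) can be illustrated without recipes at all. The following is a minimal base R sketch under stated assumptions, not the package's implementation: the data frames `train` and `new_data` and the helpers `estimate_medians()` and `apply_medians()` are hypothetical names introduced only for illustration.

```r
# Hypothetical toy data; not taken from the recipes package.
train <- data.frame(income = c(100, NA, 300, 500), debt = c(0, 10, NA, 30))
new_data <- data.frame(income = c(NA, 250), debt = c(5, NA))

# "prep" stage: estimate one median per column from the training data only.
estimate_medians <- function(data, cols) {
  vapply(data[cols], median, numeric(1), na.rm = TRUE)
}

# "bake" stage: replace missing values with the stored training medians.
apply_medians <- function(data, medians) {
  for (col in names(medians)) {
    missing <- is.na(data[[col]])
    data[[col]][missing] <- medians[[col]]
  }
  data
}

medians <- estimate_medians(train, c("income", "debt"))
imputed <- apply_medians(new_data, medians)
```

The point of the split is that `new_data` never influences the medians: its missing values are filled with quantities learned from `train` alone.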
As of recipes 0.1.16, this function name changed from step_medianimpute() to step_impute_median().
An updated version of recipe with the new step added to the sequence of any existing operations.
When you tidy() this step, a tibble is returned with columns terms, value, and id:
terms: character, the selectors or variables selected
value: numeric, the median value
id: character, id of this step
This step performs an unsupervised operation that can utilize case weights. As a result, only frequency weights are used by this step. For more information, see the documentation in case_weights and the examples on tidymodels.org.
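To make the role of frequency weights concrete, here is a minimal base R sketch. It assumes that a frequency weight simply counts how many times an observation occurs; the helper `freq_weighted_median()` is hypothetical and is not necessarily how recipes computes weighted medians internally.

```r
# Hypothetical helper: median of x where each value is repeated freq times.
freq_weighted_median <- function(x, freq) {
  stopifnot(length(x) == length(freq), all(freq >= 0), all(freq == round(freq)))
  keep <- !is.na(x)                       # missing values do not contribute
  median(rep(x[keep], times = freq[keep]))
}

x <- c(1, 2, 10, NA)
freq <- c(1, 1, 3, 2)

unweighted <- median(x, na.rm = TRUE)     # every observed value counts once
weighted <- freq_weighted_median(x, freq) # the value 10 counts three times
```

Here the heavily weighted observation pulls the median toward it, so the value used for imputation can differ noticeably from the unweighted one.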
Other imputation steps: step_impute_bag(), step_impute_knn(), step_impute_linear(), step_impute_lower(), step_impute_mean(), step_impute_mode(), step_impute_roll()
library(recipes)
data("credit_data", package = "modeldata")

## proportion of missing data per column
vapply(credit_data, function(x) mean(is.na(x)), c(num = 0))

set.seed(342)
in_training <- sample(seq_len(nrow(credit_data)), 2000)
credit_tr <- credit_data[in_training, ]
credit_te <- credit_data[-in_training, ]
## rows of the test set inspected below
missing_examples <- c(14, 394, 565)

rec <- recipe(Price ~ ., data = credit_tr)
impute_rec <- rec %>%
  step_impute_median(Income, Assets, Debt)

## estimate the medians from the training set
imp_models <- prep(impute_rec, training = credit_tr)
## apply the training medians to the test set
imputed_te <- bake(imp_models, new_data = credit_te)

## compare the original and imputed rows
credit_te[missing_examples, ]
imputed_te[missing_examples, names(credit_te)]

## tidy the step before and after the medians are estimated
tidy(impute_rec, number = 1)
tidy(imp_models, number = 1)