impute: Impute data and return a reusable recipe

impute R Documentation

Impute data and return a reusable recipe

Description

impute fills in missing values in both nominal and numeric variables. Numeric variables can be imputed with the mean, bagged trees, or k-nearest neighbors (knn). Nominal variables can be imputed with a new category, bagged trees, or knn.
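To illustrate what the two default methods mean, here is a minimal base-R sketch on a toy data frame. It is not the package's implementation; the column names and the "missing" label are assumptions made for this illustration only.

```r
# Toy data with one missing value in a numeric and in a nominal column.
d <- data.frame(
  age = c(30, NA, 50),
  color = c("red", NA, "blue"),
  stringsAsFactors = FALSE
)

# "mean"-style imputation: replace numeric NAs with the column mean.
age_mean <- mean(d$age, na.rm = TRUE)
d$age[is.na(d$age)] <- age_mean

# "new_category"-style imputation: replace nominal NAs with a new level.
# The label "missing" is a hypothetical choice for this sketch.
d$color[is.na(d$color)] <- "missing"

print(d)
```

Mean imputation only makes sense for numeric variables and a new category only for nominal ones, which is why each is restricted to one variable type.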

Usage

impute(
  d = NULL,
  ...,
  recipe = NULL,
  numeric_method = "mean",
  nominal_method = "new_category",
  numeric_params = NULL,
  nominal_params = NULL,
  verbose = FALSE
)

Arguments

d

A dataframe or tibble containing data to impute.

...

Optional. Unquoted names of variables that should not be imputed. These variables are returned unaltered.

recipe

Optional. A recipe object or an imputed data frame (which carries a recipe object as an attribute). If provided, the recipe is applied to d, so the new data is imputed with the values saved in the recipe. Use this parameter to apply the imputation values learned from a training dataset to data in production.

numeric_method

Method used to impute numeric variables. Defaults to "mean". Other choices are "bagimpute" or "knnimpute".

nominal_method

Method used to impute nominal variables. Defaults to "new_category". Other choices are "bagimpute" or "knnimpute".

numeric_params

A named list of parameters to use with the chosen imputation method on numeric data. Options are bag_model, bag_trees, and bag_options (bagimpute only); knn_K and impute_with (knnimpute only); and seed_val (bagimpute or knnimpute). See step_bagimpute or step_knnimpute for details.

nominal_params

A named list of parameters to use with the chosen imputation method on nominal data. Options are bag_model, bag_trees, and bag_options (bagimpute only); knn_K and impute_with (knnimpute only); and seed_val (bagimpute or knnimpute). See step_bagimpute or step_knnimpute for details.

verbose

If TRUE, prints which variables will be imputed and which method will be used for each. Defaults to FALSE.

Value

The imputed data frame. The recipe object is stored in its "recipe" attribute so it can be reused to impute future data.
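The idea of returning data that carries its own reusable recipe can be sketched in base R. The helper below, impute_mean_sketch, is hypothetical and not part of healthcareai. It handles only mean imputation and stores plain column means where the package stores a recipe object, but it shows the pattern of learning values on training data and replaying them on new data.

```r
# Hypothetical helper, not part of healthcareai: mean-impute numeric columns
# and carry the learned means along in a "recipe" attribute.
impute_mean_sketch <- function(d, recipe = NULL) {
  num_cols <- names(d)[vapply(d, is.numeric, logical(1))]
  if (is.null(recipe)) {
    # Training: learn the column means from this data.
    recipe <- vapply(d[num_cols], mean, numeric(1), na.rm = TRUE)
  }
  # Fill NAs using the means held in the recipe (learned or supplied).
  for (col in names(recipe)) {
    d[[col]][is.na(d[[col]])] <- recipe[[col]]
  }
  attr(d, "recipe") <- recipe
  d
}

train <- data.frame(x = c(1, NA, 3))
test <- data.frame(x = c(NA, 10))

# Learn on training data, then reuse the stored values on new data.
train_imputed <- impute_mean_sketch(train)
test_imputed <- impute_mean_sketch(test, recipe = attr(train_imputed, "recipe"))
print(test_imputed$x)
```

The missing value in test is filled with the training mean rather than the mean of test, which is the behavior to aim for when imputing production data.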

Examples

library(healthcareai)

# Split the example data into training and test sets
d <- pima_diabetes
d_train <- d[1:700, ]
d_test <- d[701:768, ]
# Train imputer, leaving patient_id and diabetes unaltered
train_imputed <- impute(d = d_train, patient_id, diabetes)
# Apply to new data, reusing the values learned on the training set
impute(d = d_test, patient_id, diabetes, recipe = train_imputed)
# Specify methods:
impute(d = d_train, patient_id, diabetes,
       numeric_method = "bagimpute",
       nominal_method = "new_category")
# Specify method and param:
impute(d = d_train, patient_id, diabetes,
       nominal_method = "knnimpute",
       nominal_params = list(knn_K = 4))

healthcareai documentation built on Sept. 5, 2022, 5:12 p.m.