pipe_impute: Impute multiple missing columns using lm, mean, or xgboost,...


Description

Imputes missing values in multiple columns using lm, mean, or xgboost, and applies the imputation to the train dataset.

Usage

pipe_impute(train, columns, na_function = is.na, exclude_columns,
  type = "lm", controls = NA, verbose = F)

Arguments

train

The train dataset, as a data.frame or data.table. Data.tables may be changed by reference.

columns

The columns to impute, as strings.

na_function

A function that returns TRUE when a value is missing and FALSE otherwise. It is applied to each column separately, so it must take a single column vector as input. Defaults to is.na.
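As an illustration of the kind of function na_function expects, the sketch below treats both NA and a sentinel value as missing. The name is_missing and the sentinel -999 are hypothetical, chosen only for this example; they are not part of the package.

```r
# Hypothetical custom missing-value detector (not part of datapiper).
# Takes one column vector and returns a logical vector of the same length.
is_missing <- function(x) {
  # Treat regular NA values and, for numeric columns, the sentinel -999 as missing.
  is.na(x) | (is.numeric(x) & x %in% -999)
}

ages <- c(23, NA, -999, 41)
is_missing(ages)
```

Assuming the signature shown under Usage, such a function would be supplied as na_function = is_missing.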

exclude_columns

Columns that should not be used as predictors in imputation, as strings. If type is lm, the columns listed in the columns argument are always excluded as well.

type

The imputation method: "lm", "mean", or "xgboost". Defaults to "lm".
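To make the difference between the mean and lm options concrete, here is a minimal base-R sketch of both ideas on toy data. It is not the package's implementation; the data frame df and all variable names are assumptions made for this example.

```r
# Toy data: y has missing values, x is a complete predictor.
df <- data.frame(x = c(1, 2, 3, 4, 5), y = c(2, 4, NA, 8, NA))
missing <- is.na(df$y)

# Mean imputation: replace missing entries with the mean of the observed ones.
y_mean <- df$y
y_mean[missing] <- mean(df$y[!missing])

# lm imputation: fit a linear model on the complete rows,
# then predict the missing entries from the other column.
fit <- lm(y ~ x, data = df[!missing, ])
y_lm <- df$y
y_lm[missing] <- predict(fit, newdata = df[missing, ])
```

Mean imputation fills every gap with the same value, while the model-based variant uses the remaining columns to produce a different estimate per row.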

controls

Controls for xgboost, if needed. Defaults to NA.

verbose

Whether xgboost should print anything. Defaults to FALSE.

Value

A list containing the transformed train dataset and a trained pipe.
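The pattern of returning both the transformed data and a trained pipe can be sketched in base R as follows. This is only an illustration of the idea: the helper fit_mean_imputer, the list element names train and pipe, and the representation of the pipe as a plain function are assumptions of this sketch, not taken from the package.

```r
# Hypothetical helper illustrating the "transformed data + trained pipe" pattern.
fit_mean_imputer <- function(train, columns) {
  # Learn the fill values from the train data only.
  fills <- lapply(train[columns], function(col) mean(col, na.rm = TRUE))

  # The "pipe" reapplies the learned fill values to any dataset.
  pipe <- function(data) {
    for (col in columns) {
      data[[col]][is.na(data[[col]])] <- fills[[col]]
    }
    data
  }

  list(train = pipe(train), pipe = pipe)
}

train <- data.frame(a = c(1, NA, 3))
test <- data.frame(a = c(NA, 10))

result <- fit_mean_imputer(train, "a")
result$train       # train data with the NA replaced by the train mean
result$pipe(test)  # new data imputed with the value learned from train
```

Keeping the learned values inside the pipe means new data is imputed with statistics from the train set rather than from itself.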


jeroenvdhoven/datapiper documentation built on July 14, 2019, 9:34 p.m.