pipe_feature_transformer: Applies different transformations to each numeric feature and...


View source: R/transform_features.R

Description

Applies several candidate transformations to each numeric feature and, for each feature, keeps the one whose result has the highest correlation with the response.
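
The selection rule described above can be sketched in base R. This is only an illustration, not datapiper's implementation: the helper pick_best_transform is hypothetical, comparing by absolute Pearson correlation is an assumption, and so is including identity among the candidates.

```r
# Hypothetical helper, not part of datapiper: for one numeric feature, pick the
# candidate transformation whose output correlates most strongly with the response.
pick_best_transform <- function(x, y, candidates) {
  scores <- vapply(candidates, function(f) {
    transformed <- suppressWarnings(f(x))
    # Assumption: a candidate producing non-finite values (e.g. log of 0) is skipped.
    if (any(!is.finite(transformed))) return(NA_real_)
    # Assumption: strength of correlation is measured by its absolute value.
    abs(cor(transformed, y))
  }, numeric(1))
  best <- which.max(scores)  # which.max ignores NA scores
  list(name = names(candidates)[best], score = scores[[best]])
}

# Simulated data in which the response depends on the square of the feature.
set.seed(1)
x <- runif(200, min = 1, max = 10)
y <- x^2 + rnorm(200, sd = 0.5)

candidates <- list(identity = identity, sqrt = sqrt, log = log,
                   square = function(x) x^2)
best <- pick_best_transform(x, y, candidates)
print(best$name)
```

In this simulated data the squared feature tracks the response almost exactly, so the square candidate is the one selected.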

Usage

pipe_feature_transformer(train, response, transform_columns,
  missing_func = is.na, transform_functions = list(sqrt, log,
  function(x) x^2), retransform_columns)

Arguments

train

Data frame containing the training data.

response

String giving the name of the column to use as the response variable. Mandatory.

transform_columns

Columns to consider for transformation.

missing_func

A function that determines if an observation is missing, e.g. is.na, is.nan, or function(x) x == 0. Defaults to is.na. Set to NA to skip NA feature generation and imputation.
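
As a rough illustration of how such a predicate might be used, the sketch below flags missing observations and fills them in. The helper flag_and_impute is hypothetical, and both the mean imputation and the handling of NA predicate results are assumptions, not a description of what datapiper does internally.

```r
# Hypothetical helper, not datapiper's implementation.
flag_and_impute <- function(x, missing_func = is.na) {
  missing <- missing_func(x)
  # A predicate such as function(x) x == 0 returns NA for NA input.
  # Assumption: those observations are treated as missing as well.
  missing[is.na(missing)] <- TRUE
  # Assumption: missing values are replaced by the mean of the observed ones.
  x[missing] <- mean(x[!missing])
  list(values = x, was_missing = missing)
}

res <- flag_and_impute(c(3, 0, NA, 7), missing_func = function(x) x == 0)
print(res$values)
print(res$was_missing)
```

The point of the sketch is the caveat in the comment: a custom predicate like function(x) x == 0 does not itself return TRUE for NA values, so it is worth checking how NA entries should be treated when not using is.na.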

transform_functions

A list of functions used to transform features. These should be one-to-one transformations such as sqrt or log.

retransform_columns

Columns that should be retransformed later on. If this is set to one or more column names, this function will generate a numerical approximation of the inverse of the optimal transformation function for each of them. The resulting reverting pipe will be returned as a separate list entry.
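
One way to approximate an inverse numerically is to evaluate the transformation on a grid and interpolate with the roles of input and output swapped. The sketch below shows that idea in base R; numeric_inverse is a hypothetical helper, and the grid-plus-interpolation approach is an assumption, since the documentation does not say which method datapiper uses.

```r
# Hypothetical helper: numerically invert a monotone function f on [lower, upper].
numeric_inverse <- function(f, lower, upper, n = 1000) {
  grid <- seq(lower, upper, length.out = n)
  # Swap input and output, then interpolate linearly between grid points.
  # Values outside the range of f(grid) give NA (approxfun's default rule = 1).
  approxfun(x = f(grid), y = grid)
}

inv_log <- numeric_inverse(log, lower = 1, upper = 100)
recovered <- inv_log(log(42))
print(recovered)
```

The approximation is only valid inside the range covered by the grid, which is also why the transformations should be one-to-one: a function that maps two inputs to the same output has no well-defined inverse to interpolate.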

Value

A list containing the transformed train dataset and a trained pipe. If retransform_columns was set, the reverting pipe will also be provided.


jeroenvdhoven/datapiper documentation built on July 14, 2019, 9:34 p.m.