pipe_one_hot_encode: Train one-hot encoding

Description

Train one-hot encoding

Usage

pipe_one_hot_encode(train,
  columns = colnames(train)[purrr::map_lgl(train, function(x)
    return(!(is.numeric(x) || is.logical(x))))],
  stat_functions, response,
  quantile_trim_threshold = 0, use_pca = FALSE, pca_tol = 0.1)
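
The default for columns in the usage above selects every column that is neither numeric nor logical. The following is a minimal base R sketch of that selection, using vapply in place of purrr::map_lgl. The data frame and its column names are made up for illustration and are not part of datapiper.

```r
# Hypothetical example data; none of these column names come from datapiper.
train <- data.frame(
  price = c(1.5, 2.0, 3.25),
  in_stock = c(TRUE, FALSE, TRUE),
  colour = c("red", "blue", "red"),
  size = factor(c("S", "M", "S")),
  stringsAsFactors = FALSE
)

# Same predicate as the default: keep columns that are neither numeric nor logical.
is_categorical <- vapply(
  train,
  function(x) !(is.numeric(x) || is.logical(x)),
  logical(1)
)

# Character and factor columns are selected; numeric and logical ones are not.
default_columns <- colnames(train)[is_categorical]
print(default_columns)
```

Passing columns explicitly overrides this selection.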

Arguments

train

The train dataset, as a data.frame or data.table. Data.tables may be changed by reference.

columns

Columns from train to one-hot encode. The function checks that these are column names in train. Defaults to all columns that are neither numeric nor logical.

stat_functions

A (named) list of functions to use for mean-encoding. Leave it unset for regular one-hot encoding. Any function that reduces a vector of values to a single value will do (e.g. quantile, sd).

response

String giving the name of the column to use as the response variable. Mandatory.

quantile_trim_threshold

Passed as quantile_trim_threshold to pipe_create_stats. Only used if stat_functions is provided.

use_pca

Logical; whether to apply a PCA transformation. Defaults to FALSE.

pca_tol

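The tol argument passed to prcomp: components whose standard deviation is at most tol times that of the first component are omitted. Defaults to 0.1.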
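
Two of the arguments above can be illustrated in plain base R. The sketch below is not datapiper's implementation. The first part shows the general idea behind mean-encoding with a named list of functions, assuming each function is applied to the response values within each level of a categorical column. The second part shows how the tol argument of stats::prcomp drops weak components. All data and variable names are hypothetical.

```r
# Hypothetical example data; column names are illustrative only.
train <- data.frame(
  colour = c("red", "blue", "red", "blue"),
  y = c(1, 2, 3, 6),
  stringsAsFactors = FALSE
)

# Part 1: per-level statistics of the response, one per named function.
stat_functions <- list(mean = mean, sd = sd)
level_stats <- lapply(
  stat_functions,
  function(f) tapply(train$y, train$colour, f)
)

# Look up each row's level to get a numeric column for the "mean" statistic.
colour_mean <- as.numeric(level_stats$mean[train$colour])
print(colour_mean)

# Part 2: prcomp's tol omits components with a small standard deviation
# relative to the first component. Column c is almost a copy of column a,
# so one direction carries almost no variance.
set.seed(1)
a <- rnorm(50)
x <- cbind(a = a, b = rnorm(50), c = a + 1e-3 * rnorm(50))

full <- prcomp(x)
trimmed <- prcomp(x, tol = 0.1)

n_full <- ncol(full$x)
n_trimmed <- ncol(trimmed$x)
print(c(full = n_full, trimmed = n_trimmed))
```

A larger pca_tol would therefore be expected to keep fewer components, at the cost of discarding more variance.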

Value

A list containing the transformed train dataset and a trained pipe.
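
The shape of this return value, transformed data plus something reusable on new data, can be sketched in base R. The helper train_one_hot and the list element names train and pipe below are hypothetical and chosen for illustration; this page does not show how datapiper names the list elements or how a trained pipe is applied.

```r
# Hypothetical helper, not datapiper's implementation: learn the levels on the
# training data and return both the encoded data and a reusable function.
train_one_hot <- function(train, column) {
  levels_seen <- sort(unique(as.character(train[[column]])))

  pipe <- function(data) {
    # One 0/1 indicator column per level seen during training.
    indicators <- outer(as.character(data[[column]]), levels_seen, "==") + 0L
    colnames(indicators) <- paste(column, levels_seen, sep = "_")
    cbind(data[setdiff(colnames(data), column)], indicators)
  }

  list(train = pipe(train), pipe = pipe)
}

train <- data.frame(
  id = 1:3,
  colour = c("red", "blue", "red"),
  stringsAsFactors = FALSE
)
result <- train_one_hot(train, "colour")
print(result$train)

# The stored function can be reused on new data. In this sketch, a level that
# was not seen during training ("green") gets zeros in every indicator column.
new_data <- data.frame(
  id = 4:5,
  colour = c("blue", "green"),
  stringsAsFactors = FALSE
)
encoded_new <- result$pipe(new_data)
print(encoded_new)
```

Keeping the learned levels inside the returned function means new data is encoded with exactly the columns produced for the training set.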


jeroenvdhoven/datapiper documentation built on July 14, 2019, 9:34 p.m.