map.categorical.encoding: Categorical mapping tables

View source: R/map_categorical_encoding.R

Description

Creates a list of mapping tables, one for each categorical feature in the dataset. Each table contains engineered features that can be joined back to the original dataset. The feature engineering techniques are one-hot encoding, ordinal proportional encoding and, when the parameter y is provided, weighted noise target mean encoding.
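
To make the idea of a mapping table concrete, the following base R sketch builds one by hand for a single categorical feature. It is an illustration only and not the package's internal code: the toy data, the column names and the reading of "ordinal proportional encoding" as a rank of level proportions are all assumptions, and the noise weighting of the target mean is left out.

```r
# Toy data (hypothetical): one categorical feature and a numeric target
df <- data.frame(
  colour = c("red", "red", "blue", "green", "blue", "red"),
  target = c(1, 0, 1, 1, 0, 1),
  stringsAsFactors = FALSE
)

# Share of rows belonging to each level (levels are sorted alphabetically)
prop <- prop.table(table(df$colour))

# Plain mean of the target per level (no noise or weighting applied here)
target_mean <- tapply(df$target, df$colour, mean)

# One row per level: this is the mapping table
mapping <- data.frame(
  colour = names(prop),
  colour_proportion = as.numeric(prop),
  # Assumed interpretation: ordinal code = rank of the level's proportion
  colour_ordinal = rank(as.numeric(prop), ties.method = "first"),
  colour_target_mean = as.numeric(target_mean[names(prop)]),
  stringsAsFactors = FALSE
)

# One-hot columns: one 0/1 indicator per level
for (lvl in mapping$colour) {
  mapping[[paste0("colour_onehot_", lvl)]] <- as.integer(mapping$colour == lvl)
}

print(mapping)
```

Because the table has exactly one row per level, it can be joined to any dataset that contains the same feature, including data that was not used to compute it.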

Usage

map.categorical.encoding(data, x, y = NULL, max.levels = 10,
  min.percent = 0.025, track.features = TRUE, seed = 1,
  progress = TRUE)

Arguments

data

[required | data.frame] Dataset containing categorical features

x

[required | character] A vector of categorical feature names present in the dataset

y

[optional | character | default=NULL] The name of the target feature contained in the dataset. If no target is provided, target mean encoding is not calculated.

max.levels

[optional | integer | default=10] The maximum number of levels a categorical feature may have for one-hot encoded features to be created

min.percent

[optional | numeric | default=0.025] The minimum proportion of observations a categorical level must have; levels below this proportion are flagged as low proportional levels

track.features

[optional | logical | default=TRUE] Creates tracking features that record which categories had low proportional values present

seed

[optional | integer | default=1] The random number seed for reproducible results

progress

[optional | logical | default=TRUE] Display a progress bar
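
The thresholds max.levels and min.percent can be pictured with a short base R sketch. This is a hedged illustration under stated assumptions, not the function's actual logic: the toy vector and the variable names are hypothetical, and how the function treats flagged levels internally is not documented here.

```r
# Toy categorical feature (hypothetical): "c" covers only 2% of the rows
x <- c(rep("a", 60), rep("b", 38), rep("c", 2))

max_levels  <- 10     # mirrors the default of max.levels
min_percent <- 0.025  # mirrors the default of min.percent

# Proportion of observations per level
prop <- prop.table(table(x))

# Assumed rule: one-hot encode only when the level count is within max.levels
do_one_hot <- length(prop) <= max_levels

# Levels whose proportion falls below min.percent
low_levels <- names(prop)[prop < min_percent]

# A tracking feature in the spirit of track.features:
# 1 where the row belongs to a low proportional level, 0 otherwise
is_low <- as.integer(x %in% low_levels)

print(low_levels)
print(table(is_low))
```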

Value

List of data frames containing engineered mapping features

Author(s)

Xander Horn

Examples

res <- map.categorical.encoding(data = iris, x = "Species", y = "Sepal.Length")
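
The returned tables are meant to be joined back to the original dataset. Since the exact column layout of the returned tables is not shown on this page, the sketch below builds a stand-in mapping table by hand and joins it with base R's merge(). The object species_map and its column name are hypothetical.

```r
# Stand-in mapping table for Species (hypothetical layout): one row per level
species_map <- data.frame(
  Species = levels(iris$Species),
  Species_target_mean = as.numeric(
    tapply(iris$Sepal.Length, iris$Species, mean)
  )
)

# Left join on the categorical feature so every original row is kept
iris_fe <- merge(iris, species_map, by = "Species", all.x = TRUE)

head(iris_fe)
```

Note that merge() sorts the result by the key column, so row order may differ from the original data frame.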

XanderHorn/lazy documentation built on Jan. 16, 2021, 6:15 p.m.