frequencyEncode: frequencyEncode

View source: R/encodings.R

Description

Re-encodes categorical variables as their frequency in the dataset, which is useful for gradient boosting. This function does NOT return a dataset; it returns an encoding object that can be applied to a dataset with the applyEncoding function. If your data contains missing values, be very careful with the encodeNA and allowNewLevels parameters.
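The two-step idea (learn an encoding from one dataset, then apply it to a dataset) can be sketched in base R. This is only an illustration, not the package's implementation: `fit_frequency` and `apply_frequency` are hypothetical helpers, and the sketch assumes that "frequency" means the raw count of each level, which the package may define differently.

```r
# Hypothetical helpers for illustration; not part of helperFuncs.

# "Fit" step: learn one lookup vector of level counts per variable.
fit_frequency <- function(df, vars) {
  lapply(setNames(vars, vars), function(v) {
    tab <- table(df[[v]])
    setNames(as.integer(tab), names(tab))  # named integer vector: level -> count
  })
}

# "Apply" step: replace each level with the count learned at fit time.
apply_frequency <- function(df, enc) {
  for (v in names(enc)) {
    df[[v]] <- unname(enc[[v]][as.character(df[[v]])])
  }
  df
}

train <- data.frame(color = c("red", "red", "blue", "red", "green"),
                    stringsAsFactors = FALSE)

enc     <- fit_frequency(train, "color")  # an object, not a dataset
encoded <- apply_frequency(train, enc)    # the dataset with counts in place of levels
```

Keeping the fit and apply steps separate means the counts learned on training data can be reused unchanged on validation or scoring data.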

Usage

frequencyEncode(dt, vars, encodeNA = FALSE, allowNewLevels = FALSE)

Arguments

dt

data.frame or data.table used to build the encoding object

vars

vector of variables you want to frequency-encode

encodeNA

Boolean. Should NAs be encoded as a frequency, or kept as NA when the transformation is applied? If there are no NAs in your original data, NAs that appear in new data will still be encoded as 1. Risky, but easy.
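The difference between the two settings can be sketched in base R. The helpers `fit_counts` and `apply_counts` are hypothetical, and the sketch assumes count-based encoding; it only mirrors the behaviour described above, including the fallback value of 1 when the original data has no NAs.

```r
# Hypothetical helpers for illustration; not part of helperFuncs.

# Learn level counts for one vector, plus the value to use for NA.
fit_counts <- function(x, encode_na = FALSE) {
  tab <- table(x)                       # table() ignores NA by default
  counts <- setNames(as.integer(tab), names(tab))
  na_value <- NA_integer_               # default: NA stays NA
  if (encode_na) {
    # Use the NA count from the original data, or 1 if there were none.
    na_value <- max(sum(is.na(x)), 1L)
  }
  list(counts = counts, na_value = na_value)
}

# Apply the learned counts to a (possibly new) vector.
apply_counts <- function(x, enc) {
  out <- unname(enc$counts[as.character(x)])
  out[is.na(x)] <- enc$na_value
  out
}

train_x <- c("a", "a", "b")             # no NAs at fit time
new_x   <- c("a", NA, "b")              # an NA shows up later

kept_na    <- apply_counts(new_x, fit_counts(train_x, encode_na = FALSE))
encoded_na <- apply_counts(new_x, fit_counts(train_x, encode_na = TRUE))
```

In this sketch `kept_na` leaves the missing value as NA, while `encoded_na` gives it the same value as the rare level "b", which is why encoding unseen NAs is convenient but risky.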

allowNewLevels

Should any new levels be encoded as -1? Details:

  • TRUE Encodes new levels as -1. This is dangerous if your levels can change in the future, because you won't notice, and the model may not be tuned correctly.

  • FALSE Throws an error. You'll need to figure out how you want to proceed if allowNewLevels = TRUE is not good enough.
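The two behaviours can be sketched in base R as follows. `apply_with_new_levels` is a hypothetical helper, and the lookup vector `counts` stands in for whatever the encoding object stores internally, which is an assumption.

```r
# Hypothetical helper for illustration; not part of helperFuncs.
apply_with_new_levels <- function(x, counts, allow_new_levels = FALSE) {
  # A level is "new" if it is not NA and was never seen at fit time.
  is_new <- !is.na(x) & !(x %in% names(counts))
  if (any(is_new) && !allow_new_levels) {
    stop("New levels found: ", paste(unique(x[is_new]), collapse = ", "))
  }
  out <- unname(counts[as.character(x)])
  out[is_new] <- -1L                    # sentinel value for unseen levels
  out
}

counts <- c(blue = 1L, red = 3L)        # lookup learned from the original data
new_x  <- c("red", "purple")            # "purple" was never seen

lenient <- apply_with_new_levels(new_x, counts, allow_new_levels = TRUE)
strict  <- tryCatch(apply_with_new_levels(new_x, counts),
                    error = function(e) conditionMessage(e))
```

If the lenient setting is used, counting how many encoded values equal -1 after each application is one simple way to notice when levels have drifted.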

Value

Frequency-encoded object. It must be applied to a dataset with applyEncoding; the function does not itself return a dataset.


AnotherSamWilson/helperFuncs documentation built on Oct. 1, 2019, 8:51 p.m.