h2o.target_encode_create: Create Target Encoding Map

View source: R/frame.R

h2o.target_encode_create    R Documentation

Create Target Encoding Map

Description

Creates a target encoding map from one or more sets of group-by columns ('x') and a numeric or binary target column ('y'). Target encoding high-cardinality categorical columns can improve the performance of supervised learning models. A target encoding tutorial is available at https://github.com/h2oai/h2o-tutorials/blob/master/best-practices/categorical-predictors/target_encoding.md.
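To make the idea of a target encoding map concrete, the following base R sketch aggregates a target by the levels of one categorical column. It does not call h2o. The toy data and the column names 'numerator', 'denominator', and 'mean_target' are assumptions chosen for illustration, and the layout of the frames that h2o.target_encode_create actually returns may differ.

```r
# Toy data (hypothetical values): one categorical column and a numeric target.
df <- data.frame(
  job = c("admin", "admin", "tech", "tech", "tech"),
  age = c(30, 40, 25, 35, 48)
)

# Per-level sum and count of the target.
sums   <- tapply(df$age, df$job, sum)
counts <- tapply(df$age, df$job, length)

# Assemble the map: one row per level of the categorical column.
te_map <- data.frame(
  job         = names(sums),
  numerator   = as.numeric(sums),    # sum of the target within the level
  denominator = as.numeric(counts)   # number of rows in the level
)

# The encoded value for a level is its mean target.
te_map$mean_target <- te_map$numerator / te_map$denominator
print(te_map)
```

Keeping the sum and the count separately, rather than only the mean, lets a later step adjust the encoding (for example by leaving out the current row or fold) without re-scanning the data.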

Usage

h2o.target_encode_create(data, x, y, fold_column = NULL)

Arguments

data

An H2OFrame object with which to create the target encoding map.

x

A list containing the names or indices of the variables to encode. One target encoding map is created for each element of the list, and an element may name more than one column. For example, if 'x = list(c("A"), c("B", "C"))', one mapping frame is created for A and one for the combination of B and C (in this case, grouping is by two columns).
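The relationship between the elements of 'x' and the resulting maps can be sketched in base R. The helper make_map, the toy data, and the ":" naming of the results are hypothetical and only mirror the 'x = list(c("A"), c("B", "C"))' example; this is not h2o's implementation.

```r
# Toy data (hypothetical values) with three categorical columns and a binary target.
df <- data.frame(
  A = c("a1", "a1", "a2", "a2"),
  B = c("b1", "b1", "b2", "b2"),
  C = c("c1", "c2", "c1", "c1"),
  y = c(1, 0, 1, 1)
)

# Hypothetical helper: sum and count of the target for one set of group-by columns.
make_map <- function(df, cols, y) {
  sums   <- aggregate(df[[y]], by = df[cols], FUN = sum)
  counts <- aggregate(df[[y]], by = df[cols], FUN = length)
  out <- sums[cols]                 # the group-by columns
  out$numerator   <- sums$x         # sum of the target per group
  out$denominator <- counts$x       # number of rows per group
  out
}

# One map per element of x; the second element groups by two columns at once.
x <- list(c("A"), c("B", "C"))
maps <- lapply(x, function(cols) make_map(df, cols, "y"))
names(maps) <- sapply(x, paste, collapse = ":")

print(maps[["A"]])
print(maps[["B:C"]])
```

Here maps[["A"]] has one row per level of A, while maps[["B:C"]] has one row per observed combination of B and C.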

y

The name or column index of the response variable in the data. The response variable can be either numeric or binary.

fold_column

(Optional) The name or column index of the fold column in the data. Defaults to NULL (no 'fold_column').
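One common reason to supply a fold column is to compute out-of-fold encodings, so that a row's encoded value never uses that row's own target. The base R sketch below illustrates the general technique on toy data. Whether h2o stores per-fold sums and counts in this exact form is an assumption, and the column names 'fold', 'numerator', 'denominator', and 'oof_mean' are hypothetical.

```r
# Toy data (hypothetical values): a categorical column, a binary target, a fold id.
df <- data.frame(
  job  = c("admin", "admin", "admin", "tech", "tech", "tech"),
  y    = c(1, 0, 1, 0, 0, 1),
  fold = c(1, 2, 3, 1, 2, 3)
)

# Sum and count of the target per (level, fold) pair.
per_fold <- aggregate(df$y, by = df[c("job", "fold")], FUN = sum)
names(per_fold)[3] <- "numerator"
per_fold$denominator <- aggregate(df$y, by = df[c("job", "fold")], FUN = length)$x

# Totals per level across all folds.
tot_num <- tapply(per_fold$numerator, per_fold$job, sum)
tot_den <- tapply(per_fold$denominator, per_fold$job, sum)

# Out-of-fold mean: subtract the current fold's contribution from the totals.
lvl <- as.character(per_fold$job)
per_fold$oof_mean <- (as.numeric(tot_num[lvl]) - per_fold$numerator) /
                     (as.numeric(tot_den[lvl]) - per_fold$denominator)
print(per_fold)
```

A level that appears in only one fold would have a zero denominator here; a real implementation needs a fallback for that case, such as the overall target mean.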

Value

Returns a list of H2OFrame objects, one per element of 'x', each containing the target encoding mapping for that element's column or columns.

See Also

h2o.target_encode_apply for applying the target encoding mapping to a frame.

Examples

## Not run: 
library(h2o)
h2o.init()

# Get Target Encoding Map on bank-additional-full data with numeric response
data <- h2o.importFile(
  path = "https://s3.amazonaws.com/h2o-public-test-data/smalldata/demos/bank-additional-full.csv"
)
mapping_age <- h2o.target_encode_create(
  data = data,
  x = list(c("job"), c("job", "marital")),
  y = "age"
)
head(mapping_age)

# Get Target Encoding Map on bank-additional-full data with binary response
mapping_y <- h2o.target_encode_create(data = data, x = list(c("job"), c("job", "marital")), 
                                      y = "y")
head(mapping_y)


## End(Not run)

h2o documentation built on Aug. 9, 2023, 9:06 a.m.