nn_tokenizer_categ: Categorical Tokenizer


Description

Tokenizes categorical features into a dense embedding. For an input of shape (batch, n_features), the output shape is (batch, n_features, d_token).

Usage

nn_tokenizer_categ(cardinalities, d_token, bias, initialization)

Arguments

cardinalities

(integer())
The number of categories for each feature.

d_token

(integer(1))
The dimension of the embedding.

bias

(logical(1))
Whether to use a bias.

initialization

(character(1))
The initialization method for the embedding weights. Possible values are "uniform" and "normal".

References

Gorishniy Y, Rubachev I, Khrulkov V, Babenko A (2021). “Revisiting Deep Learning Models for Tabular Data.” arXiv:2106.11959.
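
Examples

The following is a minimal sketch of how the tokenizer might be constructed and applied, assuming the torch package is available. The cardinalities, embedding dimension, and input values are illustrative assumptions, not values taken from the package.

library(mlr3torch)
library(torch)

# Two categorical features with 3 and 5 levels, embedded into 4 dimensions.
tokenizer <- nn_tokenizer_categ(
  cardinalities = c(3, 5),
  d_token = 4,
  bias = TRUE,
  initialization = "uniform"
)

# Integer-encoded input of shape (batch = 2, n_features = 2);
# the codes are assumed here to be 1-based category indices.
x <- torch_tensor(matrix(c(1L, 2L, 3L, 1L), nrow = 2), dtype = torch_long())

out <- tokenizer(x)
dim(out)  # (batch, n_features, d_token) = 2 2 4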

