fit_text_tokenizer: Update tokenizer internal vocabulary based on a list of texts...

View source: R/preprocessing.R

fit_text_tokenizerR Documentation

Update tokenizer internal vocabulary based on a list of texts or list of sequences.

Description

Update tokenizer internal vocabulary based on a list of texts or list of sequences.

Usage

fit_text_tokenizer(object, x)

Arguments

object

Tokenizer returned by text_tokenizer()

x

Vector/list of strings, or a generator of strings (for memory-efficiency); Alternatively a list of "sequence" (a sequence is a list of integer word indices).

Note

Required before using texts_to_sequences(), texts_to_matrix(), or sequences_to_matrix().

See Also

Other text tokenization: save_text_tokenizer(), sequences_to_matrix(), text_tokenizer(), texts_to_matrix(), texts_to_sequences(), texts_to_sequences_generator()


keras documentation built on May 29, 2024, 3:20 a.m.