trainer_bpe: BPE trainer
In tok: Fast Text Tokenization

trainer_bpe

R Documentation

BPE trainer

Description

BPE trainer

Super class

tok::tok_trainer -> tok_trainer_bpe

Methods

Public methods

trainer_bpe$new()
trainer_bpe$clone()

Method `new()`

Constructor for the BPE trainer.

Usage

trainer_bpe$new(
  vocab_size = NULL,
  min_frequency = NULL,
  show_progress = NULL,
  special_tokens = NULL,
  limit_alphabet = NULL,
  initial_alphabet = NULL,
  continuing_subword_prefix = NULL,
  end_of_word_suffix = NULL,
  max_token_length = NULL
)
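
As an illustration, a trainer that overrides only a few of these arguments might be created as follows. This is a minimal sketch: the values are arbitrary, and passing `special_tokens` as a character vector is an assumption, since the argument description only calls it a list.

```r
library(tok)

# Values below are illustrative, not recommended settings.
trainer <- trainer_bpe$new(
  vocab_size = 5000,
  min_frequency = 2,
  show_progress = FALSE,
  # Assumption: a character vector is accepted for special tokens.
  special_tokens = c("[UNK]", "[PAD]"),
  max_token_length = 16
)

# Check the class hierarchy listed under "Super class".
inherits(trainer, "tok_trainer")
```

Arguments that are left out keep the `NULL` value shown in the signature.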

Arguments

vocab_size: The size of the final vocabulary, including all tokens and alphabet. Default: NULL.
min_frequency: The minimum frequency a pair should have in order to be merged. Default: NULL.
show_progress: Whether to show progress bars while training. Default: TRUE.
special_tokens: A list of special tokens the model should be aware of. Default: NULL.
limit_alphabet: The maximum number of different characters to keep in the alphabet. Default: NULL.
initial_alphabet: A list of characters to include in the initial alphabet, even if not seen in the training dataset. Default: NULL.
continuing_subword_prefix: A prefix to be used for every subword that is not a beginning-of-word. Default: NULL.
end_of_word_suffix: A suffix to be used for every subword that is an end-of-word. Default: NULL.
max_token_length: Prevents creating tokens longer than the specified size. Default: NULL.
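
To make the role of `min_frequency` more concrete, the following base R sketch counts adjacent symbol pairs in a toy corpus and keeps only the pairs that reach a threshold. It is an illustration of the general BPE idea, not the implementation used by tok; the corpus, the helper `count_pairs()`, and the threshold value are all hypothetical.

```r
# Toy corpus: each element is one word occurrence.
words <- c("low", "low", "lower", "newest", "newest", "newest")

# Represent each word as a vector of single-character symbols.
symbols <- strsplit(words, "")

# Count how often each adjacent symbol pair occurs across the corpus.
# (Hypothetical helper, written only for this illustration.)
count_pairs <- function(symbols) {
  pairs <- unlist(lapply(symbols, function(s) {
    if (length(s) < 2) return(character(0))
    paste(s[-length(s)], s[-1])
  }))
  sort(table(pairs), decreasing = TRUE)
}

pair_counts <- count_pairs(symbols)

# Hypothetical threshold, mirroring the min_frequency argument.
min_frequency <- 2

# Only pairs that occur at least min_frequency times are merge candidates.
candidates <- pair_counts[pair_counts >= min_frequency]

# The most frequent pair would be the first one merged.
names(pair_counts)[1]
```

In this corpus the pair `w e` occurs four times and would be merged first, while `e r` occurs once and falls below the threshold. A real trainer repeats this step until `vocab_size` is reached or no pair qualifies.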

Method `clone()`

The objects of this class are cloneable with this method.

Usage

trainer_bpe$clone(deep = FALSE)

Arguments

deep: Whether to make a deep clone.
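
A short sketch of cloning a trainer is shown below. The `vocab_size` value is arbitrary. Note that in R6 a deep clone only recurses into fields that are themselves R6 objects by default; whether the underlying native trainer is duplicated is not stated on this page, so treat that as unknown.

```r
library(tok)

trainer <- trainer_bpe$new(vocab_size = 1000)

# Shallow clone (the default, deep = FALSE).
copy <- trainer$clone()

# The clone is a separate R6 object with the same class.
identical(trainer, copy)
class(copy)
```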
