| trainer_bpe | R Documentation |
BPE trainer
tok::tok_trainer -> tok_trainer_bpe
new(): Constructor for the BPE trainer.
trainer_bpe$new( vocab_size = NULL, min_frequency = NULL, show_progress = NULL, special_tokens = NULL, limit_alphabet = NULL, initial_alphabet = NULL, continuing_subword_prefix = NULL, end_of_word_suffix = NULL, max_token_length = NULL )
vocab_size: The size of the final vocabulary, including all tokens and alphabet. Default: NULL.
min_frequency: The minimum frequency a pair should have in order to be merged. Default: NULL.
show_progress: Whether to show progress bars while training. Default: TRUE.
special_tokens: A list of special tokens the model should be aware of. Default: NULL.
limit_alphabet: The maximum number of different characters to keep in the alphabet. Default: NULL.
initial_alphabet: A list of characters to include in the initial alphabet, even if not seen in the training dataset. Default: NULL.
continuing_subword_prefix: A prefix to be used for every subword that is not a beginning-of-word. Default: NULL.
end_of_word_suffix: A suffix to be used for every subword that is an end-of-word. Default: NULL.
max_token_length: Prevents creating tokens longer than the specified size. Default: NULL.
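The arguments above can be combined as in the following minimal sketch. It is an illustration, not an official example: the toy corpus and the parameter values are made up, passing special_tokens as a character vector is an assumption, and the training step assumes that tok::tokenizer, tok::model_bpe and the train_from_memory() method behave as described in the package's other help pages.

```r
# Sketch only: runs when the tok package is installed, otherwise does nothing.
if (requireNamespace("tok", quietly = TRUE)) {
  # Build a BPE trainer; every value here is illustrative.
  trainer <- tok::trainer_bpe$new(
    vocab_size = 200,                     # upper bound on the final vocabulary
    min_frequency = 2,                    # merge only pairs seen at least twice
    show_progress = FALSE,                # keep output quiet in scripts
    special_tokens = c("[UNK]", "[PAD]")  # assumed: character vector accepted
  )

  # Hypothetical toy corpus, repeated so that pairs reach min_frequency.
  texts <- rep(c("low lower lowest", "new newer newest"), times = 10)

  # Assumed usage: an empty BPE model wrapped in a tokenizer, trained in memory.
  tokenizer <- tok::tokenizer$new(tok::model_bpe$new())
  tokenizer$train_from_memory(texts, trainer)
}
```

In practice a pre-tokenizer would usually be set on the tokenizer before training so that merges do not cross word boundaries; that step is left out here to keep the sketch focused on the trainer.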
clone(): The objects of this class are cloneable with this method.
trainer_bpe$clone(deep = FALSE)
deep: Whether to make a deep clone.
Other trainer:
tok_trainer,
trainer_unigram,
trainer_wordpiece