| model_bpe | R Documentation |
BPE model
Super classes: tok::tok_model -> tok_model_bpe
Method new(): Initializes a BPE model, an implementation of the BPE (Byte-Pair Encoding) algorithm.
Usage:

model_bpe$new(
  vocab = NULL,
  merges = NULL,
  cache_capacity = NULL,
  dropout = NULL,
  unk_token = NULL,
  continuing_subword_prefix = NULL,
  end_of_word_suffix = NULL,
  fuse_unk = NULL,
  byte_fallback = FALSE
)
Arguments:

vocab: A named integer vector mapping token strings to their corresponding ids. Default: NULL.

merges: A list of token pairs (character, character). Default: NULL.

cache_capacity: The number of words the BPE cache can contain. The cache speeds up the process by storing merge operation results. Default: NULL.

dropout: A float between 0 and 1 giving the BPE dropout to use. Default: NULL.

unk_token: The unknown token to be used by the model. Default: NULL.

continuing_subword_prefix: The prefix to attach to subword units that do not begin a word. Default: NULL.

end_of_word_suffix: The suffix to attach to subword units that end a word. Default: NULL.

fuse_unk: Whether to fuse consecutive unknown tokens into a single one. Default: NULL.

byte_fallback: Whether to use the SentencePiece byte-fallback trick. Default: FALSE.
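As a sketch of how the constructor above might be called: the toy vocabulary and merge list below are purely illustrative assumptions, not data shipped with the package, and they follow the documented parameter shapes (a named integer vector for vocab, a list of character pairs for merges).

```r
library(tok)

# Illustrative toy vocabulary: ids are integers named by their token strings
# (assumed data for demonstration only).
vocab <- c("h" = 0L, "e" = 1L, "l" = 2L, "o" = 3L, "he" = 4L, "ll" = 5L)

# Each merge is a pair of tokens that BPE may combine, in priority order.
merges <- list(c("h", "e"), c("l", "l"))

# Build the BPE model; options not supplied keep their NULL defaults.
bpe <- model_bpe$new(
  vocab = vocab,
  merges = merges,
  unk_token = "[UNK]",
  fuse_unk = TRUE
)
```

In typical use the resulting model object is passed to a tokenizer rather than used directly, and vocab/merges usually come from training or from pretrained tokenizer files rather than being written by hand.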
Method clone(): Objects of this class are cloneable with this method.

Usage:

model_bpe$clone(deep = FALSE)

Arguments:

deep: Whether to make a deep clone.
See Also:

Other model: model_unigram, model_wordpiece, tok_model