model_bpe: BPE model
Super class: tok::tok_model -> tok_model_bpe
new()
Initializes a BPE model, an implementation of the BPE (Byte-Pair Encoding) algorithm. See the example sketch after the argument list below.
model_bpe$new(
  vocab = NULL,
  merges = NULL,
  cache_capacity = NULL,
  dropout = NULL,
  unk_token = NULL,
  continuing_subword_prefix = NULL,
  end_of_word_suffix = NULL,
  fuse_unk = NULL,
  byte_fallback = FALSE
)
vocab
A named integer vector whose names are token strings and whose values are the corresponding ids. Default: NULL.
merges
A list of pairs of tokens (character, character). Default: NULL.
cache_capacity
The number of words that the BPE cache can contain.
The cache speeds up the process by storing merge operation results. Default: NULL.
dropout
A float between 0 and 1 representing the BPE dropout to use. Default: NULL.
unk_token
The unknown token to be used by the model. Default: NULL.
continuing_subword_prefix
The prefix to attach to subword units that don't represent the beginning of a word. Default: NULL.
end_of_word_suffix
The suffix to attach to subword units that represent the end of a word. Default: NULL.
fuse_unk
Whether to fuse any subsequent unknown tokens into a single one. Default: NULL.
byte_fallback
Whether to use the SentencePiece (spm) byte-fallback trick. Default: FALSE.
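A minimal sketch of constructing a BPE model with the arguments documented above. The toy vocabulary and merges are hypothetical values chosen for illustration; the vocabulary contains each pair element and its merged result, and the unknown token is included as well.

library(tok)

# Hypothetical toy vocabulary: token strings mapped to integer ids.
vocab <- c(
  "[UNK]" = 0L, "h" = 1L, "e" = 2L, "l" = 3L, "o" = 4L,
  "he" = 5L, "ll" = 6L, "hell" = 7L, "hello" = 8L
)

# Merges as a list of token pairs; each merged result also appears in the vocab.
merges <- list(c("h", "e"), c("l", "l"), c("he", "ll"), c("hell", "o"))

model <- model_bpe$new(
  vocab = vocab,
  merges = merges,
  unk_token = "[UNK]",
  fuse_unk = TRUE
)

In typical use the resulting model is passed to a tokenizer object rather than used on its own.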
clone()
The objects of this class are cloneable with this method.
model_bpe$clone(deep = FALSE)
deep
Whether to make a deep clone.
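A one-line sketch continuing the example above; with deep = TRUE, fields that are themselves R6 objects are cloned as well.

# Make an independent copy of the model object from the example above.
model_copy <- model$clone(deep = TRUE)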
Other model: model_unigram, model_wordpiece, tok_model