model_bpe: BPE model
Super class: tok::tok_model -> tok_model_bpe
new()
Initializes a BPE model, an implementation of the BPE (Byte-Pair Encoding) algorithm. See the example sketch after the argument list below.
model_bpe$new(
  vocab = NULL,
  merges = NULL,
  cache_capacity = NULL,
  dropout = NULL,
  unk_token = NULL,
  continuing_subword_prefix = NULL,
  end_of_word_suffix = NULL,
  fuse_unk = NULL,
  byte_fallback = FALSE
)
vocab
A named integer vector whose names are token strings and whose values are the corresponding ids. Default: NULL.
merges
A list of pairs of tokens (character, character). Default: NULL.
cache_capacity
The number of words that the BPE cache can contain.
The cache speeds up the process by storing merge operation results. Default: NULL.
dropout
A float between 0 and 1 representing the BPE dropout to use. Default: NULL.
unk_token
The unknown token to be used by the model. Default: NULL.
continuing_subword_prefix
The prefix to attach to subword units that don't represent the beginning of a word. Default: NULL.
end_of_word_suffix
The suffix to attach to subword units that represent the end of a word. Default: NULL.
fuse_unk
Whether to fuse any subsequent unknown tokens into a single one. Default: NULL.
byte_fallback
Whether to use the SentencePiece (spm) byte-fallback trick. Default: FALSE.
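A minimal sketch of constructing a BPE model with the arguments documented above. The toy vocabulary and merges are hypothetical values chosen for illustration; the vocabulary contains each pair element and its merged result, and the unknown token is included as well.

library(tok)

# Hypothetical toy vocabulary: token strings mapped to integer ids.
vocab <- c(
  "[UNK]" = 0L, "h" = 1L, "e" = 2L, "l" = 3L, "o" = 4L,
  "he" = 5L, "ll" = 6L, "hell" = 7L, "hello" = 8L
)

# Merges as a list of token pairs; each merged result also appears in the vocab.
merges <- list(c("h", "e"), c("l", "l"), c("he", "ll"), c("hell", "o"))

model <- model_bpe$new(
  vocab = vocab,
  merges = merges,
  unk_token = "[UNK]",
  fuse_unk = TRUE
)

In typical use the resulting model is passed to a tokenizer object rather than used on its own.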
clone()
The objects of this class are cloneable with this method.
model_bpe$clone(deep = FALSE)
deep
Whether to make a deep clone.
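A one-line sketch continuing the example above; with deep = TRUE, fields that are themselves R6 objects are cloned as well.

# Make an independent copy of the model object from the example above.
model_copy <- model$clone(deep = TRUE)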
Other model: model_unigram, model_wordpiece, tok_model