Man pages for tokenizers.bpe
Byte Pair Encoding Text Tokenization

belgium_parliament  Dataset from 2017 with Questions asked in the Belgium Federal...
bpe                 Construct a Byte Pair Encoding model
bpe_decode          Decode Byte Pair Encoding sequences to text
bpe_encode          Tokenise text alongside a Byte Pair Encoding model
bpe_load_model      Load a Byte Pair Encoding model
tokenizers.bpe documentation built on Sept. 16, 2023, 1:06 a.m.