Man pages for tokenizers.bpe
Byte Pair Encoding Text Tokenization

belgium_parliament  Dataset from 2017 with Questions asked in the Belgium Federal...
bpe                 Construct a Byte Pair Encoding model
bpe_decode          Decode Byte Pair Encoding sequences to text
bpe_encode          Tokenise text alongside a Byte Pair Encoding model
bpe_load_model      Load a Byte Pair Encoding model
tokenizers.bpe documentation built on May 21, 2026, 1:06 a.m.