tokenize_lst (R Documentation)

Description

Tokenize a string or token ids.
Usage

tokenize_lst(
  x,
  decode = FALSE,
  model = getOption("pangoling.causal.default"),
  add_special_tokens = NULL,
  config_tokenizer = NULL
)
Arguments

x: Strings or token ids.

decode: Logical. If TRUE, token ids are decoded back into tokens; defaults to FALSE.

model: Name of a pre-trained model or a folder containing one. Models based on "gpt2" should work; see the Hugging Face website.

add_special_tokens: Whether to include special tokens. It has the same default as the AutoTokenizer method in Python.

config_tokenizer: List of other arguments that control how the tokenizer from Hugging Face is accessed.
Value

A list with tokens.
See Also

Other token-related functions: ntokens(), transformer_vocab()
Examples

tokenize_lst(x = c("The apple doesn't fall far from the tree."),
             model = "gpt2")
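A round trip through the tokenizer can be sketched as below. This is a hedged illustration, not output from the package: it assumes `decode = TRUE` performs the inverse operation (token ids back to tokens), which follows from the argument descriptions above, and it requires the pangoling package with a working Hugging Face backend.

```r
# Sketch, assuming pangoling is installed with its Python backend.
library(pangoling)

# Tokenize a string into GPT-2 sub-word tokens (returns a list of tokens):
tokens <- tokenize_lst(x = "The apple doesn't fall far from the tree.",
                       model = "gpt2")

# Assumed usage: with decode = TRUE, token ids supplied in `x` are
# converted back into tokens rather than the other way around.
ids <- c(464L, 17180L)  # hypothetical GPT-2 token ids for illustration
tokenize_lst(x = ids, decode = TRUE, model = "gpt2")
```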