tokenize_lst (R Documentation)

Description

Tokenize a string or token ids.
Usage

tokenize_lst(
  x,
  decode = FALSE,
  model = getOption("pangoling.causal.default"),
  add_special_tokens = NULL,
  config_tokenizer = NULL
)
Arguments

x: Strings or token ids.

decode: Logical. If TRUE, token ids are decoded back into tokens; defaults to FALSE.

model: Name of a pre-trained model or a folder containing one. Models based on "gpt2" should work; see the Hugging Face website.

add_special_tokens: Whether to include special tokens. It has the same default as the AutoTokenizer method in Python.

config_tokenizer: List of other arguments that control how the tokenizer from Hugging Face is accessed.
Value

A list with tokens.
See Also

Other token-related functions: ntokens(), transformer_vocab()
Examples

tokenize_lst(x = c("The apple doesn't fall far from the tree."),
             model = "gpt2")
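A round trip through the tokenizer can be sketched as below. This is a hedged illustration, not output from the package: it assumes `decode = TRUE` performs the inverse operation (token ids back to tokens), which follows from the argument descriptions above, and it requires the pangoling package with a working Hugging Face backend.

```r
# Sketch, assuming pangoling is installed with its Python backend.
library(pangoling)

# Tokenize a string into GPT-2 sub-word tokens (returns a list of tokens):
tokens <- tokenize_lst(x = "The apple doesn't fall far from the tree.",
                       model = "gpt2")

# Assumed usage: with decode = TRUE, token ids supplied in `x` are
# converted back into tokens rather than the other way around.
ids <- c(464L, 17180L)  # hypothetical GPT-2 token ids for illustration
tokenize_lst(x = ids, decode = TRUE, model = "gpt2")
```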