Provides access to word predictability estimates from large language models (LLMs) based on 'transformer' architectures via integration with the 'Hugging Face' ecosystem <https://huggingface.co/>. The package interfaces with pre-trained neural networks and supports both causal/auto-regressive LLMs (e.g., 'GPT-2') and masked/bidirectional LLMs (e.g., 'BERT') to compute the probability of words, phrases, or tokens given their linguistic context. For details on GPT-2 and causal models, see Radford et al. (2019) <https://storage.prod.researchhub.com/uploads/papers/2020/06/01/language-models.pdf>; for details on BERT and masked models, see Devlin et al. (2019) <doi:10.48550/arXiv.1810.04805>. By enabling straightforward estimation of word predictability, the package facilitates research in psycholinguistics, computational linguistics, and natural language processing (NLP).
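As a quick illustration of the kind of estimates the package provides, the sketch below computes by-word predictability with a causal (GPT-2 style) model. It is a minimal sketch, assuming the `causal_words_pred()` interface documented on the package site, the argument names `x` and `model`, and network access to download the 'gpt2' checkpoint from Hugging Face.

```r
# A minimal sketch, assuming the causal_words_pred() interface from the
# package documentation and that the 'gpt2' checkpoint can be downloaded.
library(pangoling)

sentence <- c("The", "apple", "doesn't", "fall", "far", "from", "the", "tree.")

# Log-probability of each word given the preceding words (causal model);
# the first word has no preceding context and is typically returned as NA.
lp <- causal_words_pred(x = sentence, model = "gpt2")

data.frame(word = sentence, log_prob = lp)
```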
| Package details | |
|---|---|
| Author | Bruno Nicenboim [aut, cre] (<https://orcid.org/0000-0002-5176-3943>), Chris Emmerly [ctb], Giovanni Cassani [ctb], Lisa Levinson [rev], Utku Turk [rev] |
| Maintainer | Bruno Nicenboim <b.nicenboim@tilburguniversity.edu> |
| License | MIT + file LICENSE |
| Version | 1.0.3 |
| URL | https://docs.ropensci.org/pangoling/ https://github.com/ropensci/pangoling |
| Package repository | CRAN |
Installation
Install the latest version of this package by entering the following in R:
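A minimal install command, assuming the released version is available on CRAN as the repository entry above indicates:

```r
# Install the released version from CRAN (assumes CRAN availability as listed above).
install.packages("pangoling")
```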