arete: Automated REtrieval from TExt

A Python-based pipeline for extracting species occurrence data with large language models (LLMs). It includes validation tools designed to handle model hallucinations, supporting scientifically rigorous use of LLMs. It currently supports GPT models, with more planned, including local and non-proprietary models. For details on the methodology, consult the references listed under each function, such as Kent, A. et al. (1955) <doi:10.1002/asi.5090060209>, van Rijsbergen, C.J. (1979, ISBN:978-0408709293), Levenshtein, V.I. (1966) <https://nymity.ch/sybilhunting/pdf/Levenshtein1966a.pdf> and Krippendorff, K. (2011) <https://repository.upenn.edu/handle/20.500.14332/2089>.
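
The cited references cover standard measures for comparing extracted records against a human-annotated ground truth: precision and recall, the F-measure, and edit distance between strings. The following is a minimal sketch of those measures in plain Python, not arete's own API. The function names `levenshtein` and `precision_recall_f1` and all records are hypothetical and made up for illustration.

```python
def levenshtein(a, b):
    """Edit distance: minimum number of single-character insertions,
    deletions and substitutions needed to turn string a into string b."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def precision_recall_f1(extracted, truth):
    """Score a set of extracted records against a ground-truth set.
    Records count as correct only on exact match (an assumption of this sketch)."""
    true_positives = len(extracted & truth)
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Made-up (species, locality) records, for illustration only.
truth = {
    ("Genus alpha", "Site 1"),
    ("Genus beta", "Site 2"),
    ("Genus gamma", "Site 3"),
    ("Genus delta", "Site 4"),
}
extracted = {
    ("Genus alpha", "Site 1"),
    ("Genus beta", "Site 2"),
    ("Genus omega", "Site 9"),  # stands in for a hallucinated record
}

precision, recall, f1 = precision_recall_f1(extracted, truth)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
print(levenshtein("kitten", "sitting"))
```

Exact matching penalizes small spelling differences in species or locality names, which is one reason an edit distance is useful alongside precision and recall.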

Package workflow

Package details
Author	Vasco V. Branco [cre, aut] (ORCID: <https://orcid.org/0000-0001-7797-3183>), Vaughn Shirey [ctb] (ORCID: <https://orcid.org/0000-0002-3589-9699>), Thomas Merrien [ctb] (ORCID: <https://orcid.org/0000-0002-0339-5656>), Pedro Cardoso [aut] (ORCID: <https://orcid.org/0000-0001-8119-9960>)
Maintainer	Vasco V. Branco <vasco.branco@helsinki.fi>
License	GPL-3
Version	0.1
Package repository	View on CRAN
Installation	Install the latest version of this package by entering the following in R: `install.packages("arete")`