llamaR: Interface for Large Language Models via 'llama.cpp'

Provides 'R' bindings to 'llama.cpp' for running Large Language Models ('LLMs') locally, with optional 'Vulkan' GPU acceleration via 'ggmlR'. Supports model loading, text generation, tokenization, token-to-piece conversion, embeddings (single and batch), encoder-decoder inference, low-level batch management, chat templates, 'LoRA' adapters, explicit backend/device selection, multi-GPU split, and 'NUMA' optimization. Includes a high-level 'ragnar'-compatible embedding provider ('embed_llamar'). Built on top of 'ggmlR' for efficient tensor operations.

Package details
Author	Yuri Baramykov [aut, cre] (ORCID: <https://orcid.org/0009-0000-7627-4217>), Georgi Gerganov [cph] (Author of the 'llama.cpp' library included in src/)
Maintainer	Yuri Baramykov <lbsbmsu@mail.ru>
License	MIT + file LICENSE
Version	0.2.4
URL	https://github.com/Zabis13/llamaR
Package repository	CRAN
Installation	Install the latest version of this package by entering the following in R: `install.packages("llamaR")`