llamaR-package: llamaR: Interface for Large Language Models via 'llama.cpp'

llamaR-package    R Documentation

llamaR: Interface for Large Language Models via 'llama.cpp'

Description

Provides 'R' bindings to 'llama.cpp' for running Large Language Models ('LLMs') locally with optional 'Vulkan' GPU acceleration via 'ggmlR'. Supports model loading, text generation, 'tokenization', token-to-piece conversion, 'embeddings' (single and batch), encoder-decoder inference, low-level batch management, chat templates, 'LoRA' adapters, explicit backend/device selection, multi-GPU split, and 'NUMA' optimization. Includes a high-level 'ragnar'-compatible embedding provider ('embed_llamar'). Built on top of 'ggmlR' for efficient tensor operations.

Author(s)

Maintainer: Yuri Baramykov <lbsbmsu@mail.ru> (ORCID)

Other contributors:

  • Georgi Gerganov (Author of the 'llama.cpp' library included in src/) [copyright holder]

llamaR documentation built on May 28, 2026, 1:06 a.m.