NEWS.md

llamaR 0.2.4

Streaming generation

OpenAI-compatible server

ellmer integration

Bug fixes

llamaR 0.2.3

Context getters

Bug fixes

Logits

llamaR 0.2.2

ragnar integration

Batch embeddings

Context embedding mode

Backend & device selection

Hardware & system

Tests

llamaR 0.2.1

New functions

Bug fixes

Tests

llamaR 0.2.0

Hugging Face integration

New functions

Dependencies

llamaR 0.1.3

GPU and build system improvements

Vulkan GPU support on Windows

CRAN compliance

Dependencies

DESCRIPTION

llamaR 0.1.2

CRAN compliance fixes

Documentation

DESCRIPTION

Packaging

llamaR 0.1.1

R interface — first working release

The full LLM inference cycle is now available from R.

Memory management

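The model and the context are each wrapped in an R external pointer (ExternalPtr) with a GC finalizer, so their native memory is released automatically. The context holds a reference to the model's ExternalPtr, which prevents the model from being collected while a context still uses it.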
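The ownership rule above can be illustrated with a minimal C++ sketch. It is an analogy only: `std::shared_ptr` stands in for the R external pointer reference, and `ToyModel`, `ToyContext`, and `release_model_handle_first()` are hypothetical names that do not appear in llamaR. The point it shows is that dropping the user's model handle does not free the model while a context still refers to it.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the native model and context objects.
// Each destructor records when the "finalizer" runs.
struct ToyModel {
    std::vector<std::string>* log;
    explicit ToyModel(std::vector<std::string>* l) : log(l) {}
    ~ToyModel() { log->push_back("model freed"); }
};

struct ToyContext {
    std::shared_ptr<ToyModel> model;  // reference that keeps the model alive
    std::vector<std::string>* log;
    ToyContext(std::shared_ptr<ToyModel> m, std::vector<std::string>* l)
        : model(std::move(m)), log(l) {}
    ~ToyContext() { log->push_back("context freed"); }
};

// Drops the model handle before the context and returns the order of events.
std::vector<std::string> release_model_handle_first() {
    std::vector<std::string> log;
    auto model = std::make_shared<ToyModel>(&log);
    auto ctx = std::make_unique<ToyContext>(model, &log);

    model.reset();                   // the caller's handle goes away ...
    assert(ctx->model != nullptr);   // ... but the context still owns a reference
    log.push_back("model handle dropped");

    ctx.reset();  // context destructor runs first, then its model reference is released
    return log;
}
```

In this sketch the model is freed only after the context, regardless of which handle the caller releases first, which is the guarantee the paragraph above describes.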

Generation internals

llama_generate() runs the full pipeline in a single C++ call: prompt tokenization → encode → autoregressive decode loop with a sampler chain → detokenization of generated tokens.
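The shape of that pipeline can be sketched with a self-contained toy in C++. This is not the llamaR or llama.cpp implementation: the vocabulary, the `toy_*` functions, the "model" that always prefers the next word in the vocabulary, and the two sampler stages are all assumptions made for illustration. The sketch only mirrors the order of the stages: tokenize the prompt, consume it, loop over decode steps with a sampler chain, then detokenize the generated tokens only.

```cpp
#include <cstddef>
#include <functional>
#include <limits>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical toy vocabulary; index 0 doubles as the end-of-sequence token.
static const std::vector<std::string> kVocab = {"<eos>", "one", "two", "three", "four"};

// Prompt tokenization: whitespace split, unknown words are skipped.
std::vector<int> toy_tokenize(const std::string& text) {
    std::vector<int> ids;
    std::istringstream in(text);
    std::string word;
    while (in >> word)
        for (std::size_t i = 0; i < kVocab.size(); ++i)
            if (kVocab[i] == word) ids.push_back(static_cast<int>(i));
    return ids;
}

// Toy "model": the successor of the last token gets the highest logit,
// wrapping around to <eos> after the final vocabulary entry.
std::vector<float> toy_logits(int last_token) {
    std::vector<float> logits(kVocab.size(), 0.0f);
    logits[(static_cast<std::size_t>(last_token) + 1) % kVocab.size()] = 5.0f;
    return logits;
}

// A sampler chain is a list of stages that each rewrite the logits in place.
using SamplerStage = std::function<void(std::vector<float>&)>;

SamplerStage scale_by_temperature(float t) {
    return [t](std::vector<float>& l) { for (float& x : l) x /= t; };
}

SamplerStage ban_token(int id) {
    return [id](std::vector<float>& l) {
        l[static_cast<std::size_t>(id)] = -std::numeric_limits<float>::infinity();
    };
}

// Final stage of the chain: pick the first index with the largest logit.
int greedy_pick(const std::vector<float>& logits) {
    std::size_t best = 0;
    for (std::size_t i = 1; i < logits.size(); ++i)
        if (logits[i] > logits[best]) best = i;
    return static_cast<int>(best);
}

std::string toy_generate(const std::string& prompt, int max_new_tokens,
                         const std::vector<SamplerStage>& chain) {
    std::vector<int> tokens = toy_tokenize(prompt);
    if (tokens.empty()) return "";
    int last = tokens.back();  // "encoding" the prompt: the toy keeps only the last token

    std::vector<int> generated;
    for (int step = 0; step < max_new_tokens; ++step) {  // autoregressive decode loop
        std::vector<float> logits = toy_logits(last);
        for (const auto& stage : chain) stage(logits);   // run the sampler chain
        int next = greedy_pick(logits);
        if (next == 0) break;                            // stop at <eos>
        generated.push_back(next);
        last = next;
    }

    std::string out;  // detokenize the generated tokens only, not the prompt
    for (int id : generated) {
        if (!out.empty()) out += ' ';
        out += kVocab[static_cast<std::size_t>(id)];
    }
    return out;
}
```

With this toy, `toy_generate("one two", 10, {scale_by_temperature(0.5f)})` returns `"three four"`: the loop stops when the toy model selects `<eos>`, and the prompt words are not echoed back.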

Tests

19 assertions across 7 test blocks, all passing.

llamaR 0.1.0

Initial Release

Dependencies

Known Limitations




llamaR documentation built on May 28, 2026, 1:06 a.m.