embed_llamar: Embedding provider for ragnar / standalone use
In llamaR: Interface for Large Language Models via 'llama.cpp'

embed_llamar

R Documentation

Embedding provider for ragnar / standalone use

Description

Computes embeddings using a local GGUF model. When called without x (or with x = NULL), returns a function suitable for passing to ragnar_store_create(embed = ...).

Usage

embed_llamar(
  x,
  model,
  n_gpu_layers = 0L,
  n_ctx = 512L,
  n_threads = parallel::detectCores(),
  embedding = FALSE,
  normalize = TRUE
)

Arguments

`x`	Character vector of texts to embed, a data.frame with a `text` column, or missing/`NULL` for partial application.
`model`	Either a path to a `.gguf` file (character) or a model handle already loaded via `llama_load_model`.
`n_gpu_layers`	Number of layers to offload to GPU (0 = CPU only, -1 = all). Ignored when `model` is an already-loaded handle.
`n_ctx`	Context window size for the embedding context. Defaults to 512, typical for embedding models. Ignored when `model` is an already-loaded handle.
`n_threads`	Number of CPU threads. Ignored when `model` is an already-loaded handle.
`embedding`	Logical; if `TRUE`, use pooled batch decode (efficient for dedicated embedding models such as nomic-embed or bge). If `FALSE` (default), use sequential last-token decode (works with any model).
`normalize`	Logical; if `TRUE` (default), L2-normalize each embedding vector.
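
To make the effect of `normalize = TRUE` concrete, here is a minimal base-R sketch that L2-normalizes the rows of a small made-up matrix. The helper name `l2_normalize_rows` and the numbers are illustrative assumptions, not part of llamaR, and the package's internal implementation may differ.

```r
# Hypothetical helper (not part of llamaR): scale each row to unit Euclidean length.
l2_normalize_rows <- function(m) {
  norms <- sqrt(rowSums(m^2))
  norms[norms == 0] <- 1  # leave all-zero rows unchanged instead of dividing by zero
  m / norms               # the vector recycles down columns, so row i is divided by norms[i]
}

# Made-up 2 x 3 "embeddings", one row per input text.
emb <- rbind(c(3, 4, 0), c(1, 2, 2))
unit <- l2_normalize_rows(emb)

# For unit-length rows, the dot product equals the cosine similarity.
cos_sim <- tcrossprod(unit)
```

With normalized vectors, similarity search reduces to a matrix product, which is one reason normalization is a common default for embeddings.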

Value

If x is missing or NULL: a function of the form function(x) that returns a list of numeric vectors (one per input string), suitable for ragnar.
If x is a character vector: a numeric matrix with nrow = length(x) and ncol = n_embd.
If x is a data.frame: the same data.frame with an added embedding column (list of numeric vectors).
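
The three return shapes described above hold the same numbers in different containers. The base-R sketch below converts between them using made-up two-dimensional vectors. It does not call llamaR, and all values and variable names are illustrative assumptions.

```r
# Made-up list of embedding vectors, shaped like the output of the
# partially applied function (one numeric vector per input string).
emb_list <- list(c(0.6, 0.8), c(1, 0), c(0, 1))

# List of vectors -> matrix with one row per input (the direct-call shape).
emb_mat <- do.call(rbind, emb_list)

# Matrix -> list of row vectors (the shape of the data.frame `embedding` column).
emb_back <- lapply(seq_len(nrow(emb_mat)), function(i) emb_mat[i, ])

# Attach the list as a list-column next to the source text.
docs <- data.frame(text = c("a", "b", "c"))
docs$embedding <- emb_back
```

Here `nrow(emb_mat)` equals the number of inputs and `ncol(emb_mat)` plays the role of n_embd.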

Examples

## Not run: 
# --- Partial application for ragnar ---
store <- ragnar_store_create(
  "my_store",
  embed = embed_llamar(model = "embedding-model.gguf", n_gpu_layers = -1)
)

# --- Direct use with path ---
mat <- embed_llamar(c("hello", "world"), model = "embedding-model.gguf")

# --- Direct use with pre-loaded model ---
mdl <- llama_load_model("embedding-model.gguf", n_gpu_layers = -1)
mat <- embed_llamar(c("hello", "world"), model = mdl)

## End(Not run)

llamaR documentation built on May 28, 2026, 1:06 a.m.