embed_llamar: Embedding provider for ragnar / standalone use

View source: R/embed.R

embed_llamar    R Documentation

Description

Computes embeddings using a local GGUF model. When called without x, returns a function suitable for passing to ragnar_store_create(embed = ...).

Usage

embed_llamar(
  x,
  model,
  n_gpu_layers = 0L,
  n_ctx = 512L,
  n_threads = parallel::detectCores(),
  embedding = FALSE,
  normalize = TRUE
)

Arguments

x

Character vector of texts to embed, a data.frame with a text column, or missing/NULL for partial application.

model

Either a path to a .gguf file (character) or a model handle already loaded via llama_load_model.

n_gpu_layers

Number of layers to offload to GPU (0 = CPU only, -1 = all). Ignored when model is an already-loaded handle.

n_ctx

Context window size for the embedding context. Defaults to 512, typical for embedding models. Ignored when model is an already-loaded handle.

n_threads

Number of CPU threads. Ignored when model is an already-loaded handle.

embedding

Logical; if TRUE, use pooled batch decode (efficient for true embedding models like nomic-embed, bge). If FALSE (default), use sequential last-token decode (works with any model).
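To make the difference between the two modes concrete, here is a small base-R sketch on made-up per-token vectors. It is not llamaR's internal code, and mean pooling is only an assumption: the pooling type actually used depends on the model.

```r
# Made-up per-token hidden states: 3 tokens x 2 dimensions.
tok <- rbind(c(1, 0), c(3, 2), c(5, 4))

# Pooled embedding: combine all token vectors (mean pooling assumed here).
pooled <- colMeans(tok)

# Last-token embedding: keep only the final token's vector.
last_token <- tok[nrow(tok), ]
```

Both results have length n_embd (2 in this toy case); they differ only in which token vectors contribute.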

normalize

Logical; if TRUE (default), L2-normalize each embedding vector.
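As a rough illustration of what L2 normalization does, the base-R sketch below scales each row of a matrix to unit Euclidean length. The helper l2_normalize_rows is hypothetical and not part of llamaR, and the numbers are made up.

```r
# Hypothetical helper for illustration; not part of llamaR.
l2_normalize_rows <- function(m) {
  norms <- sqrt(rowSums(m^2))
  norms[norms == 0] <- 1  # leave all-zero rows unchanged instead of dividing by 0
  m / norms               # recycling divides row i by norms[i]
}

# Stand-in 2 x 3 "embedding" matrix (made-up numbers).
emb  <- rbind(c(3, 4, 0), c(1, 2, 2))
unit <- l2_normalize_rows(emb)

row_lengths <- rowSums(unit^2)              # squared length of each row
cos_sim     <- sum(unit[1, ] * unit[2, ])   # dot product of the two unit rows
```

Once rows have unit length, the dot product of two rows equals their cosine similarity, which is why normalized embeddings are convenient for similarity search.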

Value

  • If x is missing or NULL: a function of the form function(x) that returns a list of numeric vectors (one per input string), suitable for ragnar.

  • If x is a character vector: a numeric matrix with nrow = length(x) and ncol = n_embd.

  • If x is a data.frame: the same data.frame with an added embedding column (list of numeric vectors).
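The three return shapes listed above can be sketched with a toy stand-in that runs without a model. toy_embed and toy_embed_one are hypothetical names used only for illustration; they mimic the dispatch on x described here and are not llamaR's implementation.

```r
# Hypothetical stand-ins for illustration; not llamaR's implementation.
# toy_embed_one() fakes a 2-dimensional "embedding" so no model is needed.
toy_embed_one <- function(s) c(nchar(s), 1)

toy_embed <- function(x = NULL) {
  if (is.null(x)) {
    # Partial application: return a function that yields a list of vectors.
    return(function(x) lapply(x, toy_embed_one))
  }
  if (is.data.frame(x)) {
    # Data frame input: add a list column named `embedding`.
    x$embedding <- lapply(x$text, toy_embed_one)
    return(x)
  }
  # Character input: one matrix row per string.
  do.call(rbind, lapply(x, toy_embed_one))
}

f   <- toy_embed()                    # a function of x
lst <- f(c("hello", "hi"))            # list of 2 numeric vectors
mat <- toy_embed(c("hello", "hi"))    # 2 x 2 numeric matrix
df  <- toy_embed(data.frame(text = c("hello", "hi"), stringsAsFactors = FALSE))
```

The closure form is what lets the result be handed to an argument such as embed = ..., while the matrix and data.frame forms are more convenient for direct, standalone use.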

Examples

## Not run: 
library(llamaR)
library(ragnar)

# --- Partial application for ragnar ---
store <- ragnar_store_create(
  "my_store",
  embed = embed_llamar(model = "embedding-model.gguf", n_gpu_layers = -1)
)

# --- Direct use with path ---
mat <- embed_llamar(c("hello", "world"), model = "embedding-model.gguf")

# --- Direct use with pre-loaded model ---
mdl <- llama_load_model("embedding-model.gguf", n_gpu_layers = -1)
mat <- embed_llamar(c("hello", "world"), model = mdl)

## End(Not run)

llamaR documentation built on May 28, 2026, 1:06 a.m.