llama_get_embeddings: Get all output token embeddings as a matrix


llama_get_embeddings    R Documentation

Get all output token embeddings as a matrix

Description

Returns a matrix of shape n_outputs × n_embd containing the raw embedding vectors for all tokens whose logits flag was set in the batch. This only works when pooling_type == "none" (generative models, or embedding contexts without pooling). For pooled embeddings, use [llama_get_embeddings_seq].

Usage

llama_get_embeddings(ctx, n_outputs)

Arguments

ctx

Context handle returned by [llama_new_context].

n_outputs

Number of outputs requested in the last decode call (i.e. how many tokens had logits = TRUE in the batch).
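
As a rough illustration of how n_outputs relates to the batch, the count is the number of TRUE entries among the per-token logits flags. The vector logits_flags below is hypothetical and not a llamaR object; it stands in for whatever flags were passed to the decode call. The mapping of result rows to batch positions assumes rows follow batch order, which this page does not state.

```r
# Hypothetical per-token flags for a 5-token batch (not a llamaR object):
# TRUE marks a token whose output was requested in the decode call.
logits_flags <- c(FALSE, FALSE, TRUE, FALSE, TRUE)

# n_outputs is the number of flagged tokens.
n_outputs <- sum(logits_flags)

# Batch positions the result rows would correspond to,
# ASSUMING rows follow batch order (not confirmed by this page).
output_positions <- which(logits_flags)
```

Deriving n_outputs from the same flags used for decoding, rather than hard-coding it, keeps the requested matrix shape in step with the batch.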

Value

A numeric matrix with n_outputs rows and n_embd columns.
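
A minimal sketch of working with a matrix of this shape, using base R only. The matrix emb is simulated with rnorm() as a stand-in for the return value, so nothing here calls llamaR; the sizes 3 and 8 are arbitrary assumptions chosen for illustration.

```r
# Stand-in for the return value: a numeric matrix with
# n_outputs rows and n_embd columns (simulated, not produced by llamaR).
set.seed(1)
n_outputs <- 3L
n_embd <- 8L
emb <- matrix(rnorm(n_outputs * n_embd), nrow = n_outputs, ncol = n_embd)

# Confirm the shape matches what was requested.
stopifnot(is.matrix(emb), nrow(emb) == n_outputs, ncol(emb) == n_embd)

# L2-normalise each row so every token vector has unit length.
# The norm vector recycles down the rows because R stores matrices column-major.
row_norms <- sqrt(rowSums(emb^2))
emb_unit <- emb / row_norms

# Cosine similarity between all pairs of output tokens
# (an n_outputs x n_outputs matrix with ones on the diagonal).
cos_sim <- tcrossprod(emb_unit)
```

Because each row is one token's vector, row-wise operations such as rowSums() and tcrossprod() apply directly without transposing.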


llamaR documentation built on May 28, 2026, 1:06 a.m.