llama_get_embeddings: Get all output token embeddings as a matrix
In llamaR: Interface for Large Language Models via 'llama.cpp'

llama_get_embeddings

R Documentation

Get all output token embeddings as a matrix

Description

Returns a matrix of shape n_outputs × n_embd holding the raw embedding vector of every token whose logits flag was set in the batch, one row per token. This only works when pooling_type == "none", i.e. for generative models or for embedding contexts without pooling. For pooled embeddings, use [llama_get_embeddings_seq].
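
As a rough illustration of the returned shape, the sketch below uses a randomly filled stand-in matrix instead of a real call, since running the function needs a loaded model and context. The dimensions and the names `emb`, `n_embd`, `cosine`, and `pooled` are assumptions made for illustration; the row average shown at the end is a plain mean over rows and is not claimed to match what [llama_get_embeddings_seq] computes.

```r
# Stand-in for the result of llama_get_embeddings(ctx, n_outputs):
# one row per output token, one column per embedding dimension.
set.seed(1)
n_outputs <- 3L   # hypothetical number of tokens with logits = TRUE
n_embd    <- 8L   # hypothetical embedding width of the model
emb <- matrix(rnorm(n_outputs * n_embd), nrow = n_outputs, ncol = n_embd)

dim(emb)                  # n_outputs x n_embd
first_token <- emb[1, ]   # embedding vector of the first output token

# Cosine similarity between two token embeddings.
cosine <- function(a, b) sum(a * b) / sqrt(sum(a^2) * sum(b^2))
sim_12 <- cosine(emb[1, ], emb[2, ])

# Plain average over rows, giving a single vector of length n_embd.
pooled <- colMeans(emb)
```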

Usage

llama_get_embeddings(ctx, n_outputs)

Arguments

`ctx`	Context handle returned by [llama_new_context].
`n_outputs`	Number of outputs requested in the last decode call (i.e. how many tokens had `logits = TRUE` in the batch).
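
The relationship between the `logits` flags and `n_outputs` can be sketched in base R. The `logits` vector here is a hypothetical stand-in for however the batch was actually built, and the commented-out call assumes a context `ctx` that has just been decoded. The row-to-position mapping assumes that rows follow batch order, as in the underlying llama.cpp API; treat that as an assumption for this package.

```r
# Hypothetical per-token flags for a 5-token batch: only the last two
# tokens request output.
logits <- c(FALSE, FALSE, FALSE, TRUE, TRUE)

# n_outputs is the number of flagged tokens.
n_outputs <- sum(logits)

# Batch positions of the flagged tokens. Assuming rows follow batch
# order, row i of the result corresponds to position out_pos[i].
out_pos <- which(logits)

# With a decoded context `ctx` (not created in this sketch):
# emb <- llama_get_embeddings(ctx, n_outputs)
# stopifnot(nrow(emb) == n_outputs)
```

Deriving `n_outputs` from the same flags used to build the batch avoids passing a count that disagrees with the last decode call.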