llama_gen_next: Pull the next chunk of a streaming generation


llama_gen_next    R Documentation

Pull the next chunk of a streaming generation

Description

Advances a generation started with [llama_gen_begin] by one token and returns the newly decoded text. If the output ends in an incomplete UTF-8 character, its bytes are held back until the rest arrive, so every returned chunk is valid UTF-8. A chunk may therefore be "" when all of the new bytes belong to an unfinished character.
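The hold-back behaviour can be illustrated in plain R. The sketch below is not llamaR's implementation; `split_complete_utf8` is a hypothetical helper that splits a raw byte buffer into its longest complete UTF-8 prefix and the trailing bytes that would have to wait for the next token. It assumes the input is otherwise well-formed UTF-8.

```r
# Illustration only: split a raw buffer into complete UTF-8 text and a held-back tail.
# `split_complete_utf8` is a hypothetical name, not part of llamaR.
split_complete_utf8 <- function(buf) {
  n <- length(buf)
  if (n == 0L) return(list(text = "", pending = raw()))
  b <- as.integer(buf)

  # Step back over up to 3 trailing continuation bytes (10xxxxxx)
  # to reach the lead byte of the last character.
  i <- n
  while (i > 1L && (n - i) < 3L && bitwAnd(b[i], 0xC0L) == 0x80L) i <- i - 1L

  # Number of bytes the last character needs, judged from its lead byte.
  lead <- b[i]
  need <- if (lead >= 0xF0L) 4L else if (lead >= 0xE0L) 3L else if (lead >= 0xC0L) 2L else 1L
  have <- n - i + 1L

  # If the last character is incomplete, cut just before its lead byte.
  cut <- if (have < need) i - 1L else n
  text <- rawToChar(buf[seq_len(cut)])
  Encoding(text) <- "UTF-8"
  list(text = text, pending = buf[seq_len(n - cut) + cut])
}

# "caf" followed by the two bytes of U+00E9, arriving one byte late.
bytes  <- as.raw(c(0x63, 0x61, 0x66, 0xc3, 0xa9))
first  <- split_complete_utf8(bytes[1:4])                  # 0xc3 is held back
second <- split_complete_utf8(c(first$pending, bytes[5]))  # character completes
```

In the first call the lone lead byte stays in `pending`; once the continuation byte arrives, the second call returns the whole character and leaves nothing pending.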

Usage

llama_gen_next(state)

Arguments

state

Generation state handle from [llama_gen_begin].

Value

A length-1 UTF-8 character vector with the next chunk, or NULL when generation has finished (end-of-generation token reached or max_new_tokens exhausted). After NULL, call [llama_gen_end] to flush any remaining bytes.
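A typical consumer calls the function in a loop until NULL comes back. The following is a minimal sketch of that pattern; `drain_generation` is a hypothetical helper, not part of llamaR. The stepping and finishing functions are passed as arguments so the loop can be tried without a model; with llamaR they would be `llama_gen_next` and `llama_gen_end`, applied to a state from `llama_gen_begin` (whose arguments are not covered on this page). Whether `llama_gen_end` returns the flushed text is an assumption, so the helper appends it only when a non-empty string comes back.

```r
# Sketch: pull chunks until NULL, then finish. `drain_generation` is hypothetical.
drain_generation <- function(state, next_chunk, finish,
                             on_chunk = function(x) cat(x)) {
  pieces <- character()
  repeat {
    chunk <- next_chunk(state)
    if (is.null(chunk)) break        # NULL marks the end of generation
    if (nzchar(chunk)) {             # "" means a character is still incomplete
      on_chunk(chunk)
      pieces <- c(pieces, chunk)
    }
  }
  # Assumption: finish() may return leftover text; ignore anything else.
  rest <- finish(state)
  if (is.character(rest) && length(rest) == 1L && nzchar(rest)) {
    on_chunk(rest)
    pieces <- c(pieces, rest)
  }
  paste(pieces, collapse = "")
}

# Stand-ins that mimic the documented return values, for trying the loop offline.
mock_state <- new.env()
mock_state$chunks <- c("Hel", "", "lo")
mock_state$i <- 0L
mock_next <- function(s) {
  s$i <- s$i + 1L
  if (s$i > length(s$chunks)) NULL else s$chunks[[s$i]]
}
mock_end <- function(s) invisible(NULL)

text <- drain_generation(mock_state, mock_next, mock_end,
                         on_chunk = function(x) NULL)
```

Skipping empty chunks keeps the callback from firing on tokens that only contributed part of a multi-byte character.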

See Also

[llama_gen_begin], [llama_gen_end]


llamaR documentation built on May 28, 2026, 1:06 a.m.