llama_gen_next: Pull the next chunk of a streaming generation
In llamaR: Interface for Large Language Models via 'llama.cpp'

llama_gen_next

R Documentation

Pull the next chunk of a streaming generation

Description

Advances a generation started with llama_gen_begin() by one token and returns the next chunk of decoded text. A trailing UTF-8 character that may still be incomplete is held back until its remaining bytes arrive, so every returned chunk is valid UTF-8. The chunk may be "" when the newly generated bytes all belong to a character that is not yet complete.
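
The hold-back rule described above can be illustrated in plain R. This is a minimal sketch, not llamaR's actual implementation: the helper name split_complete_utf8 is hypothetical, and it assumes the incoming bytes are well-formed UTF-8 apart from a possibly truncated final character.

```r
# Hypothetical helper (not part of llamaR): split a raw byte buffer into the
# longest prefix that ends on a complete UTF-8 character and the bytes that
# must be held back until the rest of the character arrives.
split_complete_utf8 <- function(buf) {
  n <- length(buf)
  keep <- n
  # A UTF-8 character is at most 4 bytes, so an unfinished one can only
  # start within the last 3 bytes of the buffer.
  for (back in seq_len(min(3L, n))) {
    b <- as.integer(buf[n - back + 1L])
    if (b < 0x80) break                     # ASCII byte: nothing is pending
    if (b >= 0xC0) {                        # lead byte of a multi-byte character
      need <- if (b >= 0xF0) 4L else if (b >= 0xE0) 3L else 2L
      if (back < need) keep <- n - back     # character incomplete: hold it back
      break
    }
    # Otherwise this is a continuation byte (0x80..0xBF): keep scanning back.
  }
  ready <- rawToChar(buf[seq_len(keep)])
  Encoding(ready) <- "UTF-8"
  list(ready = ready, held = buf[seq_len(n - keep) + keep])
}

# "a" followed by the first two of the three bytes of the euro sign (E2 82 AC).
part <- split_complete_utf8(as.raw(c(0x61, 0xE2, 0x82)))
# Once the final byte arrives, the held bytes complete the character.
full <- split_complete_utf8(c(part$held, as.raw(0xAC)))
```

In this sketch, part$ready contains only "a" and two bytes are held back; a buffer that holds nothing but the start of a character yields "", which mirrors the empty chunk mentioned in the description.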

Usage

llama_gen_next(state)

Arguments

state

Generation state handle returned by llama_gen_begin().

Value

A length-1 UTF-8 character vector containing the next chunk, or NULL once generation has finished (an end-of-generation token was reached or max_new_tokens was exhausted). After receiving NULL, call llama_gen_end() to flush any remaining bytes.
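
A typical consumer polls until NULL and then finishes the generation. The sketch below shows that loop under stated assumptions: collect_stream is a hypothetical helper, and it assumes llama_gen_end() returns the flushed tail as a character string (or NULL), which this page does not confirm. The stepping functions are passed as arguments so the loop can be exercised with stand-ins, without a model or the arguments of llama_gen_begin(), which are not documented here.

```r
# Hypothetical helper: drain a streaming generation into a single string.
# The defaults are only evaluated if used, so llamaR is needed only when
# they are not overridden.
collect_stream <- function(state,
                           next_fn  = llamaR::llama_gen_next,
                           end_fn   = llamaR::llama_gen_end,
                           on_chunk = function(x) cat(x)) {
  pieces <- character()
  repeat {
    chunk <- next_fn(state)
    if (is.null(chunk)) break              # NULL signals the end of generation
    if (nzchar(chunk)) on_chunk(chunk)     # "" means a character is still pending
    pieces <- c(pieces, chunk)
  }
  # ASSUMPTION: the end function returns any flushed text as a string or NULL.
  tail <- end_fn(state)
  if (is.character(tail)) pieces <- c(pieces, tail)
  paste(pieces, collapse = "")
}

# Stand-in state and functions that imitate the documented return values.
mock <- new.env()
mock$i <- 0L
mock$queue <- list("Hel", "", "lo", NULL)
mock_next <- function(s) { s$i <- s$i + 1L; s$queue[[s$i]] }
mock_end  <- function(s) "!"

out <- collect_stream(mock, mock_next, mock_end,
                      on_chunk = function(x) invisible(NULL))
```

With a real generation, state would come from llama_gen_begin() and the defaults for next_fn and end_fn would be used. Skipping empty chunks in on_chunk avoids redundant redraws in a user interface while still accumulating the full text.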
