| llama_memory_breakdown_print | R Documentation |
Prints a debug summary of how model weights are distributed across compute devices (CPU, GPU layers). Useful for diagnosing memory allocation with partial GPU offload.
llama_memory_breakdown_print(ctx)
| ctx | Context handle returned by [llama_new_context] |
No return value, called for side effects.
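A minimal sketch of how the call might be used when checking a partial GPU offload. The package name `llamaR`, the loader `llama_load_model()` with its `n_gpu_layers` argument, and the helper `print_breakdown_if_available()` are assumptions made for illustration and are not confirmed by this page. Only `llama_new_context` and `llama_memory_breakdown_print(ctx)` come from the documentation above. The helper does nothing unless both the package and the model file are present.

```r
# Hypothetical helper (not part of the package): print the memory breakdown
# only if the package is installed and the model file exists.
print_breakdown_if_available <- function(model_path, pkg = "llamaR") {
  # `pkg` is an assumed package name; adjust it to the package you use.
  if (!requireNamespace(pkg, quietly = TRUE) || !file.exists(model_path)) {
    return(invisible(FALSE))
  }
  ns <- asNamespace(pkg)

  # Assumed loader and argument name; check the package's own help pages.
  model <- ns$llama_load_model(model_path, n_gpu_layers = 10L)

  # Create the context handle that the documented function expects.
  ctx <- ns$llama_new_context(model)

  # Called for its side effect: prints the per-device summary.
  ns$llama_memory_breakdown_print(ctx)

  invisible(TRUE)
}

# Returns FALSE invisibly when the model file (or the package) is missing.
print_breakdown_if_available("/path/that/does/not/exist.gguf")
```

Because the function is called for its side effect, the helper reports only whether the call was attempted. Releasing the model and context afterwards is left to whatever cleanup functions the package provides.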