llama_load_model_hf: Load a model directly from Hugging Face
In llamaR: Interface for Large Language Models via 'llama.cpp'

Convenience function that downloads a GGUF model file from a Hugging Face repository (unless it is already cached) and then loads it with `llama_load_model`.
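The download-then-load behaviour described above can be sketched as two explicit steps. This is a rough illustration, not the package's actual implementation. The helper name `load_model_two_step` is hypothetical, and the sketch assumes that `llama_hf_download` returns the local path of the downloaded file and that `llama_load_model` takes that path as its first argument and accepts `n_gpu_layers`. Neither assumption is confirmed on this page.

```r
# Hypothetical helper, not part of llamaR: a two-step equivalent of
# llama_load_model_hf() under the assumptions stated above.
load_model_two_step <- function(repo_id, ..., n_gpu_layers = 0L) {
  stopifnot(is.character(repo_id), length(repo_id) == 1L)

  # Step 1: fetch the GGUF file, or reuse the cached copy.
  # Assumption: the return value is the local file path.
  path <- llamaR::llama_hf_download(repo_id, ...)

  # Step 2: load the file.
  # Assumption: llama_load_model() takes the path first and
  # accepts n_gpu_layers.
  llamaR::llama_load_model(path, n_gpu_layers = n_gpu_layers)
}
```

Defining the helper does not contact Hugging Face. A download happens only when the helper is called with a repository ID.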

llama_load_model_hf(repo_id, ..., n_gpu_layers = 0L)

`repo_id`	Character. Hugging Face repository in `"org/repo"` format.
`...`	Additional arguments passed to `llama_hf_download` (e.g. `pattern`, `cache_dir`, `force`).
`n_gpu_layers`	Integer. Number of layers to offload to GPU. Use `-1L` for all layers. Defaults to `0L` (CPU only).

An external pointer to the loaded model, as returned by llama_load_model.

## Not run: 
model <- llama_load_model_hf("TheBloke/Llama-2-7B-GGUF",
                              pattern = "*q2_k*")

## End(Not run)
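A further sketch shows how the documented `n_gpu_layers` values might be chosen at call time. The wrapper `load_q2k_model` and its `offload_all` argument are hypothetical names introduced for illustration. Only `llama_load_model_hf`, `pattern`, and the `-1L`/`0L` values come from this page. Whether GPU offloading actually works depends on how the package was built, which this sketch does not check.

```r
# Hypothetical wrapper, not part of llamaR. It only touches the network
# when it is called, not when it is defined.
load_q2k_model <- function(repo_id = "TheBloke/Llama-2-7B-GGUF",
                           offload_all = FALSE) {
  if (!requireNamespace("llamaR", quietly = TRUE)) {
    stop("Package 'llamaR' is required for this sketch.")
  }

  # -1L offloads all layers to the GPU; 0L keeps the model on the CPU.
  n_layers <- if (offload_all) -1L else 0L

  llamaR::llama_load_model_hf(repo_id,
                              pattern = "*q2_k*",
                              n_gpu_layers = n_layers)
}
```

Calling `load_q2k_model(offload_all = TRUE)` would request full GPU offload, while the default call stays on the CPU, matching the documented default of `0L`.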

llamaR documentation built on May 28, 2026, 1:06 a.m.
