| rag | R Documentation |
Performs retrieval-augmented generation (RAG) using llama-index.
Supports multiple local LLM backends via HuggingFace and llama-index.
rag(
text = NULL,
path = NULL,
transformer = c("TinyLLAMA", "Gemma3-1B", "Gemma3-4B", "Qwen3-1.7B", "Ministral-3B"),
prompt = "You are an expert at extracting themes across many texts",
query,
response_mode = c("accumulate", "compact", "no_text", "refine", "simple_summarize",
"tree_summarize"),
similarity_top_k = 5,
retriever = c("vector", "bm25"),
retriever_params = list(),
output = c("text", "json", "table", "csv"),
task = c("general", "emotion", "sentiment"),
labels_set = NULL,
max_labels = 5,
global_analysis = FALSE,
device = c("auto", "cpu", "cuda"),
temperature = NULL,
do_sample = NULL,
max_new_tokens = NULL,
top_p = NULL,
keep_in_env = TRUE,
envir = 1,
progress = TRUE
)
text |
Character vector or list.
Text in a vector or list data format.
|
path |
Character.
Path to PDF files (.pdf) stored locally on your computer.
Defaults to NULL |
transformer |
Character. Large language model to use for RAG. Available models include
"TinyLLAMA" (default), "Gemma3-1B", "Gemma3-4B", "Qwen3-1.7B", and "Ministral-3B" |
prompt |
Character (length = 1).
Prompt to feed into the selected model.
Defaults to "You are an expert at extracting themes across many texts" |
query |
Character.
The query you'd like to answer from the documents.
Required; no default |
response_mode |
Character (length = 1). How the model synthesizes its response: one of "accumulate", "compact", "no_text", "refine", "simple_summarize", or "tree_summarize". See the llama-index response synthesizer documentation for details. Defaults to "accumulate" |
similarity_top_k |
Numeric (length = 1).
Number of most representative texts to retrieve for the query.
Suitable values depend on the number and quality of texts; adjust as necessary.
Defaults to 5 |
retriever |
Character (length = 1).
Retrieval backend: one of "vector" (embedding-based similarity) or "bm25" (keyword-based ranking).
Defaults to "vector" |
retriever_params |
List.
Optional parameters passed to the selected retriever handler
(see the package documentation for reserved keys).
Defaults to list() |
output |
Character (length = 1).
Output format: one of "text", "json", "table", or "csv".
Defaults to "text" |
task |
Character (length = 1).
Task hint for structured extraction: one of "general", "emotion", or "sentiment".
Defaults to "general" |
labels_set |
Character vector.
Allowed labels for classification in structured outputs.
Defaults to NULL |
max_labels |
Integer (length = 1).
Maximum number of labels to return in structured outputs;
used to guide the model instruction.
Defaults to 5 |
global_analysis |
Boolean (length = 1).
Whether to perform analysis across all documents globally (TRUE; legacy behavior)
or per document (FALSE).
Defaults to FALSE |
device |
Character.
Whether to use CPU or GPU for inference ("auto" selects GPU when available).
Defaults to "auto" |
temperature |
Numeric or NULL. Overrides the LLM sampling temperature when using local HF models. Recommended: 0.0–0.2 for structured/classification; 0.3–0.7 for summaries. |
do_sample |
Logical or NULL.
If TRUE, enables sampling; if FALSE, uses greedy decoding.
Defaults to NULL (model default) |
max_new_tokens |
Integer or NULL. Maximum new tokens to generate. Suggested: 64–128 for label decisions; 256–512 for summaries. |
top_p |
Numeric or NULL.
Nucleus sampling parameter. Typical: 0.7–0.95. Use with do_sample = TRUE.
Defaults to NULL |
keep_in_env |
Boolean (length = 1).
Whether the model should be kept in your global environment.
Defaults to TRUE |
envir |
Numeric (length = 1). Environment in which the model is saved for repeated use. Defaults to 1, the global environment |
progress |
Boolean (length = 1).
Whether progress should be displayed.
Defaults to TRUE |
For output = "text", returns an object of class
"rag" with fields:
$response (character), $content (data.frame),
and $document_embeddings (matrix).
For output = "json", returns a JSON character(1)
string matching the enforced schema.
For output = "table", returns a data.frame
suitable for statistical analysis.
All processing is done locally with the downloaded model, and your text is never sent to any remote server or third-party.
Alexander P. Christensen <alexpaulchristensen@gmail.com>
# Load data
data(neo_ipip_extraversion)
# Example text
text <- neo_ipip_extraversion$friendliness[1:5]
## Not run:
rag(
text = text,
query = "What themes are prevalent across the text?",
response_mode = "tree_summarize",
similarity_top_k = 5
)
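# Inspect the returned object -- an illustrative sketch; field names
# follow the Value section above and assume output = "text"
result <- rag(
  text = text,
  query = "What themes are prevalent across the text?",
  response_mode = "tree_summarize",
  similarity_top_k = 5
)
result$response             # generated text response
result$content              # data.frame of retrieved content
result$document_embeddings  # matrix of document embeddings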
# Structured outputs
rag(text = text, query = "Extract emotions", output = "json")
rag(text = text, query = "Extract emotions", output = "table")
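# Constrain structured output to a fixed label set -- an illustrative
# sketch; the label and parameter values below are assumptions for
# demonstration, not package defaults
rag(
  text = text,
  query = "Classify the dominant emotion in each text",
  output = "table",
  task = "emotion",
  labels_set = c("joy", "sadness", "anger", "fear"),
  max_labels = 2,
  temperature = 0.1
)
# Use the BM25 keyword retriever instead of vector similarity (sketch)
rag(
  text = text,
  query = "Extract sentiment",
  retriever = "bm25",
  similarity_top_k = 3
)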
## End(Not run)