rag: Retrieval-augmented Generation (RAG)

View source: R/rag.R

rag {transforEmotion}    R Documentation

Retrieval-augmented Generation (RAG)

Description

Performs retrieval-augmented generation (RAG) on text using {llama-index}

Currently limited to the TinyLLAMA model

Usage

rag(
  text = NULL,
  path = NULL,
  transformer = c("LLAMA-2", "Mistral-7B", "OpenChat-3.5", "Orca-2", "Phi-2",
    "TinyLLAMA"),
  prompt = "You are an expert at extracting themes across many texts",
  query,
  response_mode = c("accumulate", "compact", "no_text", "refine", "simple_summarize",
    "tree_summarize"),
  similarity_top_k = 5,
  device = c("auto", "cpu", "cuda"),
  keep_in_env = TRUE,
  envir = 1,
  progress = TRUE
)

Arguments

text

Character vector or list. Text in a vector or list data format. path will override input into text. Defaults to NULL

path

Character. Path to PDF files (.pdf) stored locally on your computer. Defaults to NULL

transformer

Character. Large language model to use for RAG. Available models include:

"LLAMA-2"

The largest model available (13B parameters) but also the most challenging to get up and running on Mac and Windows. Linux operating systems run smoothly. The challenge lies in installing the {llama-cpp-python} module. Currently, we do not provide support for Mac and Windows users

"Mistral-7B"

Mistral's 7B parameter model that offers high-quality output but is more computationally expensive (more time consuming)

"OpenChat-3.5"

More documentation soon...

"Orca-2"

More documentation soon...

"Phi-2"

More documentation soon...

"TinyLLAMA"

Default. A smaller, 1B parameter version of LLAMA-2 that offers fast inference with reasonable quality

prompt

Character (length = 1). Prompt to feed into TinyLLAMA. Defaults to "You are an expert at extracting themes across many texts"

query

Character (length = 1). The question you would like answered from the documents. Defaults to prompt if not provided

response_mode

Character (length = 1). Style of response generated from the model. See the {llama-index} documentation on response modes for details

Defaults to "tree_summarize"

similarity_top_k

Numeric (length = 1). Number of most representative texts to retrieve for the query. Larger values will provide a more comprehensive response but at the cost of computational efficiency; smaller values will provide a more focused response at the cost of comprehensiveness. Defaults to 5.

Values will vary based on number of texts but some suggested values might be:

40-60

Comprehensive search across all texts

20-40

Exploratory search with a good trade-off between comprehensiveness and speed

5-15

Focused search that should generally give good results

These values depend on the number and quality of texts. Adjust as necessary
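As a sketch (not run), a comprehensive thematic sweep over a large corpus might raise this value; the object many_texts below is a hypothetical character vector of documents:

## Not run: 
rag(
 text = many_texts,  # hypothetical large character vector
 query = "What themes are prevalent across the text?",
 similarity_top_k = 50  # comprehensive search across all texts
)
## End(Not run)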

device

Character. Whether to use CPU or GPU for inference. Defaults to "auto", which will use GPU over CPU (if a CUDA-capable GPU is set up). Set to "cpu" to perform inference on the CPU

keep_in_env

Boolean (length = 1). Whether the model should be kept in your global environment. Defaults to TRUE. By keeping the model in your environment, you can skip re-loading it every time you run this function. TRUE is recommended

envir

Numeric (length = 1). Environment in which the model is saved for repeated use. Defaults to the global environment

progress

Boolean (length = 1). Whether progress should be displayed. Defaults to TRUE

Value

Returns the generated response from TinyLLAMA

Author(s)

Alexander P. Christensen <alexpaulchristensen@gmail.com>

Examples

# Load data
data(neo_ipip_extraversion)

# Example text
text <- neo_ipip_extraversion$friendliness[1:5]

## Not run: 
rag(
 text = text,
 query = "What themes are prevalent across the text?",
 response_mode = "tree_summarize",
 similarity_top_k = 5
)
## End(Not run)
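
A further sketch (not run) retrieving from local PDFs on CPU; the directory path below is hypothetical and should be replaced with a folder of your own PDF files:

## Not run: 
rag(
 path = "~/Documents/papers",  # hypothetical folder of PDFs
 query = "What methods are common across these documents?",
 response_mode = "compact",
 similarity_top_k = 10,
 device = "cpu"  # force CPU inference
)
## End(Not run)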


transforEmotion documentation built on Sept. 11, 2024, 9:26 p.m.