R/format_prompt.R

#' Format an LLM prompt
#'
#' @description
#' Format a text prompt for a large language model (LLM). Particularly useful for few-shot text classification tasks. If you plan to use one of OpenAI's chat models, such as ChatGPT or GPT-4, use the `format_chat()` function instead.
#'
#' @param text The text to be classified. Can be a character vector or a single string.
#' @param instructions Instructions to be included in the prompt (format them like you would format instructions to a human research assistant).
#' @param examples A data frame of "few-shot" examples. Must include one column called `text` with the example text(s) and another column called `label` with the correct label(s).
#' @param template The template for how examples and completions should be formatted, in `glue` syntax. If you are including few-shot examples in the prompt, this must contain the \{text\} and \{label\} placeholders.
#' @param prompt_template The template for the entire prompt. Defaults to instructions, followed by few-shot examples, followed by the input to be classified.
#' @param separator A string that separates examples. Defaults to two newline characters.
#'
#' @return Returns a formatted prompt that can be used as input for `complete_prompt()` or `openai::create_completion()`.
#' @export
#'
#' @examples
#' data(scotus_tweets_examples)
#'
#' format_prompt(text = "I am disappointed with this ruling.",
#'               instructions = "Decide if the sentiment of this statement is Positive or Negative.",
#'               examples = scotus_tweets_examples,
#'               template = "Statement: {text}\nSentiment: {label}")
#'
#' format_prompt(text = 'I am sad about the Supreme Court',
#'               examples = scotus_tweets_examples,
#'               template = '"{text}" is a {label} statement',
#'               separator = '\n')
format_prompt <- function(text,
                          instructions = '',
                          examples = data.frame(),
                          template = 'Text: {text}\nClassification: {label}',
                          prompt_template = '{instructions}{examples}{input}',
                          separator = '\n\n'){

  # convert examples dataframe to string
  if(nrow(examples) == 0){
    examples <- ''
  } else{
    examples <- examples |>
      dplyr::mutate(prompt_segment = glue::glue(template))

    examples <- examples$prompt_segment |>
      paste(collapse = separator) |>
      paste0(separator)
  }

  # add separator to instructions
  if(nchar(instructions) > 0){
    instructions <- paste0(instructions, separator)
  }

  # format input using template (removing the {label} tag and the rest of its line,
  # then trimming surrounding whitespace)
  input <- template |>
    stringr::str_replace('\\{label\\}.*', '') |>
    stringr::str_trim() |>
    glue::glue()

  # glue together the complete prompt template
  glue::glue(prompt_template)

}
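
# Illustrative sketch, not part of the package source: a base-R walk-through of
# the string that format_prompt() appears to assemble with its default
# `template`, `prompt_template`, and `separator`: instructions, then the
# few-shot examples, then the unlabeled input, joined by the separator.
# The data frame, instruction text, and `sketch_*` names are hypothetical.
sketch_examples <- data.frame(
  text  = c("Great decision.", "Terrible ruling."),
  label = c("Positive", "Negative")
)
sketch_separator <- "\n\n"

# Each example row filled into 'Text: {text}\nClassification: {label}'
sketch_segments <- sprintf("Text: %s\nClassification: %s",
                           sketch_examples$text, sketch_examples$label)

# The input reuses the template, cut off at {label} and trimmed
sketch_input <- sprintf("Text: %s\nClassification:",
                        "I am disappointed with this ruling.")

# Instructions and examples each carry a trailing separator; the input does not
sketch_prompt <- paste0(
  "Classify the sentiment.", sketch_separator,
  paste(sketch_segments, collapse = sketch_separator), sketch_separator,
  sketch_input
)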

promptr documentation built on Sept. 11, 2024, 8:15 p.m.