R/tabscreen_ollama.r
In AIscreenR: AI Screening Tools in R for Systematic Reviewing

#' @encoding UTF-8
#' @title Title and abstract screening with OLLAMA API models using function calls via the tools argument
#'
#' @name tabscreen_ollama
#' @aliases tabscreen_ollama
#'
#' @description
#' This function supports title and abstract screening with OLLAMA API models in R.
#' Specifically, it allows the user to draw on locally hosted OLLAMA models (e.g., Llama 3 / 3.1 variants, Mixtral/Mistral, Gemma, DeepSeek and Qwen).
#' For more information on how to install and use OLLAMA, see \url{https://docs.ollama.com/}.
#' Be aware that this function requires OLLAMA to be installed and running on your local machine.
#' The function lets you run title and abstract screening across multiple prompts and with
#' repeated questions to check for consistency across answers, all of which can be done in parallel.
#' The function draws on function calling, which is invoked via the
#' tools argument in the request body. Function calls ensure more reliable and consistent responses to one's
#' requests. See [Vembye, Christensen, Mølgaard, and Schytt (2025)](https://psycnet.apa.org/record/2026-37236-001)
#' for guidance on how to adequately conduct title and abstract screening with OLLAMA models.
#'
#' @references Vembye, M. H., Christensen, J., Mølgaard, A. B., & Schytt, F. L. W. (2025).
#'    Generative Pretrained Transformer Models Can Function as Highly Reliable Second Screeners of Titles
#'    and Abstracts in Systematic Reviews: A Proof of Concept and Common Guidelines. \emph{Psychological Methods}. 
#'    \doi{10.1037/met0000769}
#'
#'   Thomas, J. et al. (2024).
#'   Responsible AI in Evidence SynthEsis (RAISE): guidance and recommendations.
#'   \url{https://osf.io/cn7x4}
#'
#' Wickham H (2023).
#' \emph{httr2: Perform HTTP Requests and Process the Responses}.
#' \url{https://httr2.r-lib.org}, \url{https://github.com/r-lib/httr2}.
#'
#' @template common-arg
#' @param api_url Character string with the endpoint URL for OLLAMA's API. Default is `"http://127.0.0.1:11434/api/chat"`.
#' @param model Character string with the name of the OLLAMA model. Can take
#'   multiple OLLAMA models. Default = `"llama3.2:latest"`.
#'   Find available models at \url{https://ollama.com/library}.
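#' @param role Character string indicating the role of the user. Default is `"user"`.
#' @param tools List of function definitions for tool calling. If `NULL` (the default), the tools are chosen
#'   based on the `decision_description` and `overinclusive` arguments.
#'   For detailed responses, the function uses tools that include detailed description capabilities.
#' @param tool_choice Specification for which tool to use. If `NULL` (the default) and no custom `tools` are
#'   supplied, it is set based on the `decision_description` and `overinclusive` arguments:
#'   "inclusion_decision_simple" for simple responses and "inclusion_decision" for detailed responses
#'   (or "inclusion_decision_simple_binary" and "inclusion_decision_binary" when `overinclusive = FALSE`).
#'   If custom `tools` are supplied without a `tool_choice`, the name of the first tool is used.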
#' @param top_p An alternative to sampling with temperature, called nucleus sampling,
#'   where the model considers the results of the tokens with top_p probability mass.
#'   So 0.1 means only the tokens comprising the top 10% probability mass are considered.
#'   We generally recommend altering this or temperature but not both. Default is 1.
#' @param time_info Logical indicating whether the run time of each
#'   request/question should be included in the data. Default = `TRUE`.
#' @param max_tries,max_seconds Cap the maximum number of attempts with
#'  `max_tries` or the total elapsed time from the first request with
#'  `max_seconds`. Default for `max_tries` is 16. If `max_tries` is not supplied,
#'  [httr2::req_perform()] will not retry.
#' @param backoff A function that takes a single argument (the number of failed
#'   attempts so far) and returns the number of seconds to wait.
#' @param after A function that takes a single argument (the response) and
#'   returns either a number of seconds to wait or `NULL`, which indicates
#'   that a precise wait time is not available and that the `backoff` strategy
#'   should be used instead.
#' @param reps Numerical value indicating the number of times the same
#'   question should be sent to OLLAMA models. This can be useful to test consistency
#'   between answers. Default is `1`.
#' @param seed_par Numerical value for a seed to ensure that proper,
#'   parallel-safe random numbers are produced.
#' @param progress Logical indicating whether a progress line should be shown when running
#'   the title and abstract screening in parallel. Default is `TRUE`.
#' @param decision_description Logical indicating whether to include detailed descriptions
#'   of decisions. Default is `FALSE`. When conducting large-scale screening, we generally
#'   recommend not using this feature as it will substantially increase the time of the screening.
#' @param overinclusive Logical indicating whether uncertain decisions (`"1.1"`) should be
#'   allowed in the default function calling setup. Default is `TRUE`, which means that the
#'   default function calling setup will allow for uncertain decisions.
#'   If `FALSE`, the default function calling setup will not allow for uncertain decisions and
#'   will only return binary decisions (i.e., "1" or "0"). This argument only affects the default
#'   function calling setup.
#' @param messages Logical indicating whether to print messages embedded in the function.
#'   Default is `TRUE`.
#' @param incl_cutoff_upper Numerical value indicating the probability threshold
#'   above which a study should be included. Only used when `reps > 1`. Default is `NULL`,
#'   which is then set to 0.5, indicating that titles and abstracts that the OLLAMA model
#'   has included more than 50 percent of the time should be included.
#' @param incl_cutoff_lower Numerical value indicating the probability threshold
#'   above which studies should be checked by a human. Only used when `reps > 1`. Default is `NULL`,
#'   which is then set to `incl_cutoff_upper` if that argument is supplied, and to 0.4 otherwise.
#'   A value of 0.4 means that if you ask the OLLAMA model the same question 10 times and it includes the
#'   title and abstract 4 times, we suggest that the study should be checked by a human.
#' @param force Logical argument indicating whether to force the function to use more than
#'   10 iterations. This safeguard prevents accidentally running an erroneous or excessively large screening.
#'   Default is `FALSE`.
#' @param ... Further arguments to pass to the request body.
#'
#' @usage tabscreen_ollama(data, prompt, studyid, title, abstract, 
#' api_url = "http://127.0.0.1:11434/api/chat", ..., model, role = "user", 
#' tools = NULL, tool_choice = NULL, top_p = 1, time_info = TRUE, 
#' max_tries = 16, max_seconds = NULL, backoff = NULL, after = NULL, 
#' reps = 1, seed_par = NULL, progress = TRUE, decision_description = FALSE, 
#' overinclusive = TRUE, messages = TRUE, incl_cutoff_upper = NULL, 
#' incl_cutoff_lower = NULL, force = FALSE)
#' 
#' @return An object of class \code{"gpt"}. The object is a list containing the following
#' components:
#' \item{answer_data_aggregated}{dataset with the summarized, probabilistic inclusion decision
#' for each title and abstract across multiple repeated questions (only when reps > 1).}
#' \item{answer_data}{dataset with all individual answers.}
#' \item{error_data}{dataset with failed requests (only included if errors occurred).}
#' \item{run_date}{date when the screening was conducted.}
#'
#' @note The \code{answer_data_aggregated} data (only present when reps > 1) contains the following mandatory variables:
#' \tabular{lll}{
#'  \bold{studyid} \tab \code{integer} \tab indicating the study ID of the reference. \cr
#'  \bold{title} \tab \code{character} \tab indicating the title of the reference. \cr
#'  \bold{abstract} \tab \code{character}   \tab indicating the abstract of the reference. \cr
#'  \bold{promptid} \tab \code{integer} \tab indicating the prompt ID. \cr
#'  \bold{prompt} \tab \code{character} \tab indicating the prompt. \cr
#'  \bold{model} \tab \code{character}   \tab indicating the specific model used. \cr
#'  \bold{question} \tab \code{character} \tab indicating the final question sent to OLLAMA models. \cr
#'  \bold{top_p} \tab \code{numeric}  \tab indicating the applied top_p. \cr
#'  \bold{incl_p} \tab \code{numeric}  \tab indicating the probability of inclusion calculated across multiple repeated responses on the same title and abstract. \cr
#'  \bold{final_decision_gpt} \tab \code{character} \tab indicating the final decision reached by the model - either 'Include', 'Exclude', or 'Check'. \cr
#'  \bold{final_decision_gpt_num}  \tab \code{integer}  \tab indicating the final numeric decision reached by the model - either 1 or 0. \cr
#'  \bold{longest_answer}  \tab \code{character} \tab indicating the longest response obtained
#'  across multiple repeated responses on the same title and abstract. Only included if the detailed function
#'  is used. See 'Examples' below for how to use this function. \cr
#'  \bold{reps}  \tab \code{integer}  \tab indicating the number of times the same question has been sent to OLLAMA models. \cr
#'  \bold{n_mis_answers} \tab \code{integer} \tab indicating the number of missing responses. \cr
#' }
#' <br>
#' The \code{answer_data} data contains the following mandatory variables:
#' \tabular{lll}{
#'  \bold{studyid} \tab \code{integer} \tab indicating the study ID of the reference. \cr
#'  \bold{title} \tab \code{character} \tab indicating the title of the reference. \cr
#'  \bold{abstract} \tab \code{character}   \tab indicating the abstract of the reference. \cr
#'  \bold{promptid} \tab \code{integer} \tab indicating the prompt ID. \cr
#'  \bold{prompt} \tab \code{character} \tab indicating the prompt. \cr
#'  \bold{model} \tab \code{character}   \tab indicating the specific model used. \cr
#'  \bold{iterations} \tab \code{numeric} \tab indicating the number of times the same question has been sent to OLLAMA models. \cr
#'  \bold{question} \tab \code{character} \tab indicating the final question sent to OLLAMA models. \cr
#'  \bold{top_p}  \tab \code{numeric} \tab indicating the applied top_p. \cr
#'  \bold{decision_gpt}  \tab \code{character} \tab indicating the raw decision - either \code{"1", "0", "1.1"} for inclusion, exclusion, or uncertainty, respectively. \cr
#'  \bold{detailed_description}  \tab \code{character} \tab indicating a detailed description of the given decision made by OLLAMA models.
#'  Only included if the detailed function is used. See 'Examples' below for how to use this function. \cr
#'  \bold{decision_binary}  \tab \code{integer} \tab indicating the binary decision,
#'  that is 1 for inclusion and 0 for exclusion. 1.1 decisions are coded as 1 in this case. \cr
#'  \bold{run_time}  \tab \code{numeric} \tab indicating the time it took to obtain a response from the server for the given request. \cr
#'  \bold{n} \tab \code{integer} \tab indicating request ID.  \cr
#' }
#' <br>
#' If any requests failed to reach the server, the object contains an
#' error data set (`error_data`) having the same variables as `answer_data`
#' but with failed request references only.
#'
#' @importFrom stats df
#' @importFrom utils tail
#'
#' @export
#'
#' @examples
#' \dontrun{
#'
#' library(future) # provides plan(), multisession, and sequential
#'
#' prompt <- "Is this study about a Functional Family Therapy (FFT) intervention?"
#'
#' plan(multisession)
#'
#' tabscreen_ollama(
#'   data = filges2015_dat[1:2,],
#'   prompt = prompt,
#'   studyid = studyid,
#'   title = title,
#'   abstract = abstract,
#'   model = "llama3.2:latest",
#'   max_tries = 2
#'   )
#' plan(sequential)
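#'
#'  # A minimal sketch of repeated screening: each question is sent three
#'  # times to check consistency across answers. With reps > 1, the returned
#'  # object also contains answer_data_aggregated. The cutoffs shown are the
#'  # values the function falls back to when none are supplied.
#'  # 'res_reps' is an illustrative object name only.
#' plan(multisession)
#'
#' res_reps <- tabscreen_ollama(
#'   data = filges2015_dat[1:2,],
#'   prompt = prompt,
#'   studyid = studyid,
#'   title = title,
#'   abstract = abstract,
#'   model = "llama3.2:latest",
#'   reps = 3,
#'   incl_cutoff_upper = 0.5,
#'   incl_cutoff_lower = 0.4,
#'   max_tries = 2
#'   )
#' plan(sequential)
#'
#'  # Summarized inclusion decision per title and abstract
#' res_reps$answer_data_aggregated
#'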
#'
#'  # Get detailed descriptions of the decisions by using the
#'  # decision_description option.
#' plan(multisession)
#'
#'  tabscreen_ollama(
#'    data = filges2015_dat[1:2,],
#'    prompt = prompt,
#'    studyid = studyid,
#'    title = title,
#'    abstract = abstract,
#'    model = "llama3.2:latest",
#'    decision_description = TRUE,
#'    max_tries = 2
#'  )
#' plan(sequential)
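#'
#'  # A hedged sketch of passing a custom tool in shorthand form. The function
#'  # wraps a single named list into list(type = "function", "function" = ...)
#'  # and requires a 'decision_gpt' or 'decision' property. The tool name,
#'  # descriptions, and object names below are hypothetical illustrations,
#'  # and the exact schema your model handles best is an assumption to verify.
#' my_tool <- list(
#'   name = "my_inclusion_decision",
#'   description = "Decide whether a study should be included.",
#'   parameters = list(
#'     type = "object",
#'     properties = list(
#'       decision_gpt = list(
#'         type = "string",
#'         description = "Return '1' for inclusion and '0' for exclusion."
#'       )
#'     ),
#'     required = list("decision_gpt")
#'   )
#' )
#'
#' plan(multisession)
#'
#'  # tool_choice is left as NULL, so the name of the first tool is used.
#' res_custom <- tabscreen_ollama(
#'   data = filges2015_dat[1:2,],
#'   prompt = prompt,
#'   studyid = studyid,
#'   title = title,
#'   abstract = abstract,
#'   model = "llama3.2:latest",
#'   tools = my_tool,
#'   max_tries = 2
#' )
#' plan(sequential)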
#'}

tabscreen_ollama <- function(
  data,
  prompt,
  studyid,
  title,
  abstract,
  api_url = "http://127.0.0.1:11434/api/chat",
  ...,
  model,
  role = "user",
  tools = NULL,
  tool_choice = NULL,
  top_p = 1,
  time_info = TRUE,
  max_tries = 16,
  max_seconds = NULL,
  backoff = NULL,
  after = NULL,
  reps = 1,
  seed_par = NULL,
  progress = TRUE,
  decision_description = FALSE,
  overinclusive = TRUE,
  messages = TRUE,
  incl_cutoff_upper = NULL,
  incl_cutoff_lower = NULL,
  force = FALSE
  ){

  #.......................................
  # Handling inherited objects
  #.......................................
  if (is_gpt_tbl(data)) data <- data |> dplyr::select(-c(promptid:n)) |> tibble::as_tibble()
  if (is_gpt_agg_tbl(data)) data <- data |> dplyr::select(-c(promptid:n_mis_answers)) |> tibble::as_tibble()

  # Validate and normalize Ollama endpoint URL.
  if (!is.character(api_url) || length(api_url) != 1 || is.na(api_url) || trimws(api_url) == "") {
    stop("api_url must be a single non-empty character string.")
  }

  api_url <- trimws(api_url)

  if (!grepl("^https?://", api_url, ignore.case = TRUE)) {
    stop("api_url must start with 'http://' or 'https://'.")
  }

  if (!grepl("/chat/?$", api_url, ignore.case = TRUE)) {
    api_url_original <- api_url

    if (grepl("/api/?$", api_url, ignore.case = TRUE)) {
      api_url <- paste0(gsub("/+$", "", api_url), "/chat")
    } else if (grepl("^https?://[^/]+/?$", api_url, ignore.case = TRUE)) {
      api_url <- paste0(gsub("/+$", "", api_url), "/api/chat")
    }

    if (!identical(api_url, api_url_original) && messages) {
      message(
        paste0(
          "* 'api_url' was normalized from '", api_url_original, "' to '", api_url,
          "'. AIscreenR expects an Ollama chat endpoint."
        )
      )
    }

    if (identical(api_url, api_url_original) && messages) {
      message(
        paste0(
          "* 'api_url' does not end with '/chat'. AIscreenR expects an Ollama chat endpoint, ",
          "usually 'http://127.0.0.1:11434/api/chat'."
        )
      )
    }
  }

  #.......................................
  # Function call setup
  #.......................................
  if (!is.null(tools) && !is.list(tools)) stop("The tools argument must be a list.")

  if (is.null(tools) && !is.null(tool_choice)) stop("You must provide a tool or set 'tool_choice = NULL'.")

  # Support shorthand custom tools.
  if (!is.null(tools)) {

    # If a single tool is passed as a named list, wrap into a list of tools.
    if (!is.null(tools$name) || !is.null(tools$`function`)) {
      tools <- list(tools)
    }

    tools <- lapply(tools, function(tool_def) {
      if (!is.null(tool_def[["function"]])) return(tool_def)
      if (is.null(tool_def$name)) stop("Each custom tool must include a function name.")
      list(type = "function", "function" = tool_def)
    })

    first_tool_name <- tools[[1]][["function"]][["name"]]
    if (is.null(first_tool_name) || !is.character(first_tool_name) || length(first_tool_name) != 1 ||
        is.na(first_tool_name) || trimws(first_tool_name) == "") {
      stop("Custom tools must include a non-empty function name.")
    }

    # Parser expects tool calls to include one decision field.
    first_props <- tools[[1]][["function"]][["parameters"]][["properties"]]
    has_decision <- is.list(first_props) && ("decision_gpt" %in% names(first_props) || "decision" %in% names(first_props))
    if (!has_decision) {
      stop("Custom tools must define either 'decision_gpt' or 'decision' in parameters$properties.")
    }

    if (is.null(tool_choice)) tool_choice <- first_tool_name
  }

  # Default setting
  if (is.null(tools) && is.null(tool_choice)) {

    if (overinclusive) {

      if (!decision_description) {

        tools <- tools_simple
        tool_choice <- "inclusion_decision_simple"

      } else {

        tools <- tools_detailed
        tool_choice <- "inclusion_decision"

      }

    } else {

      if (!decision_description) {

        tools <- tools_simple_binary
        tool_choice <- "inclusion_decision_simple_binary"

      } else {

        tools <- tools_detailed_binary
        tool_choice <- "inclusion_decision_binary"

      }

    }

  }

  # Ensuring that model is provided
  if (missing(model) || is.null(model) || length(model) == 0 || !is.character(model) ||
      any(is.na(model)) || any(trimws(model) == "")) {
    stop("You must provide a model.")
  }

  #.......................................
  # Start up - Validation checks
  #.......................................

  # Validate model names
  tags_url <- sub("/chat/?$", "/tags", api_url)
  if (identical(tags_url, api_url)) tags_url <- paste0(gsub("/$", "", api_url), "/tags")
  available_models <- tryCatch({
    httr2::request(tags_url) |>
      httr2::req_method("GET") |>
      httr2::req_user_agent("AIscreenR (local-ollama)") |>
      httr2::req_perform() |>
      httr2::resp_body_json() |>
      (
        function(x) {
          if (!is.null(x$models)) vapply(x$models, function(m) m$name, character(1)) else character(0)
        }
      )()
  }, error = function(e) character(0))
  if (length(available_models) > 0) {
    if (!all(model %in% available_models)) {
      stop(paste(
        "Use one of the available models:",
        paste(available_models, collapse = ", ")
      ))
    }
  } else if (messages) {
    message(
      paste0(
        "* Could not retrieve models from Ollama at '", tags_url, "'. ",
        "Check that Ollama is running and 'api_url' is correct ",
        "(usually 'http://127.0.0.1:11434/api/chat')."
      )
    )
  }

  # Validate that each model supports tools via /api/show
  show_url <- sub("/chat/?$", "/show", api_url)
  if (identical(show_url, api_url)) show_url <- paste0(gsub("/$", "", api_url), "/show")
  model_unique <- unique(model)
  model_has_tools <- vapply(model_unique, function(model_name) {
    caps <- tryCatch({
      httr2::request(show_url) |>
        httr2::req_method("POST") |>
        httr2::req_user_agent("AIscreenR (local-ollama)") |>
        httr2::req_body_json(list(model = model_name)) |>
        httr2::req_perform() |>
        httr2::resp_body_json(simplifyVector = TRUE) |>
        (\(x) x$capabilities)()
    }, error = function(e) NULL)
    if (is.null(caps)) return(NA)
    if (is.list(caps)) {
      caps <- unlist(caps, use.names = FALSE)
    }
    any(tolower(as.character(caps)) == "tools")
  }, logical(1))

  if (length(model_has_tools) > 0 && all(is.na(model_has_tools)) && messages) {
    message(
      paste0(
        "* Could not verify model capabilities via '", show_url, "'. ",
        "If screening fails, confirm that Ollama is reachable and that your endpoint points to '/api/chat'."
      )
    )
  }

  unsupported_models <- names(model_has_tools)[!is.na(model_has_tools) & !model_has_tools]

  if (length(unsupported_models) > 0) {
    stop(
      paste0(
        "These model(s) do not support tools in Ollama: ",
        paste(unsupported_models, collapse = ", "),
        ". Please choose model(s) with 'tools' capability."
      )
    )
  }

  # Ensuring that users do not conduct wrong screening
  if (max(reps) > 10 && !force){
    max_reps_mes <- paste("* Are you sure you want to use", max(reps), "iterations? If so, set force = TRUE")
    stop(max_reps_mes)
  }

  # Ensuring that the reps argument fits to the corresponding model
  if (length(reps) > 1 && length(model) != length(reps)){
    stop("model and reps must be of the same length.")
  }

  # Default values for incl_cutoff_upper and incl_cutoff_lower when 'reps > 1'
  if (any(reps > 1)) {
    if(is.numeric(incl_cutoff_upper) && is.null(incl_cutoff_lower)) incl_cutoff_lower <- incl_cutoff_upper

    if (is.null(incl_cutoff_upper)) incl_cutoff_upper <- 0.5
    if (is.null(incl_cutoff_lower)) incl_cutoff_lower <- incl_cutoff_upper - 0.1
  }

  # Ensuring proper use of the incl_cutoff_upper and incl_cutoff_lower arguments
  if (is.numeric(incl_cutoff_upper) && is.numeric(incl_cutoff_lower) && incl_cutoff_upper < incl_cutoff_lower){
    stop("incl_cutoff_lower must not exceed incl_cutoff_upper")
  }

  # Ensuring that the same prompt is not supplied twice
  if (!missing(prompt)){
    if (dplyr::n_distinct(prompt) != length(prompt)) stop("Do not add the same prompt twice.")
  }

  # Ensuring that the same model is not called twice by the user
  if (dplyr::n_distinct(reps) == 1 && dplyr::n_distinct(model) != length(model)){
    model <- unique(model)
  }

  #.......................................
  # Collecting arguments for error handling
  #.......................................
  arg_list <-
    list(
      role = role,
      tools = tools,
      tool_choice = tool_choice,
      reps = reps,
      time_info = time_info,
      max_tries = max_tries,
      max_seconds = max_seconds,
      backoff = backoff,
      after = after,
      seed_par = seed_par,
      progress = progress,
      messages = messages,
      decision_description = decision_description,
      overinclusive = overinclusive,
      incl_cutoff_upper = incl_cutoff_upper,
      incl_cutoff_lower = incl_cutoff_lower,
      api_url = api_url,
      ...
    )

  #.......................................
  # Data manipulation
  #.......................................

  # Handle study ID creation
  study_id <- if (missing(studyid)) seq_len(nrow(data)) else data |> dplyr::pull({{ studyid }})

  dat <-
    data |>
    dplyr::mutate(
      studyid = study_id,
      studyid = factor(studyid, levels = unique(studyid))
    ) |>
    dplyr::relocate(studyid, .before = {{ title }}) |>
    dplyr::relocate({{ abstract }}, .after = {{ title }}) |>
    dplyr::relocate(c(studyid, {{ title }}, {{ abstract }}), .after = dplyr::last_col())

  # Factors used for slicing data and ensuring correct length of data
  mp_reps <- if (length(reps) > 1) 1 else length(model)

  model_length <- length(model)
  prompt_length <- length(prompt)
  studyid_length <- dplyr::n_distinct(dat$studyid)

  # Creating the question dataset
  question_dat <-
    dat |>
    dplyr::mutate(
      # Handle missing/empty abstracts or titles
      dplyr::across(c({{ title }}, {{ abstract }}), ~ dplyr::if_else(
        is.na(.x) | .x == "" | .x == " " | .x == "NA", "No information", .x, missing = "No information")
      )
    ) |>
    dplyr::slice(rep(seq_len(nrow(dat)), prompt_length)) |>
    dplyr::mutate(
      promptid = rep(1:prompt_length, each = studyid_length),
      prompt = rep(prompt, each = studyid_length)
    ) |>
    dplyr::slice(rep(seq_len(dplyr::n()), each = model_length)) |>
    dplyr::mutate(
      model = rep(model, studyid_length*prompt_length),
      iterations = rep(reps, studyid_length*prompt_length*mp_reps),
      question_raw = paste0(
        prompt,
        " Now, evaluate the following title and abstract for",
        " Study ", studyid, ":",
        " -Title: ", {{ title }},
        " -Abstract: ", {{ abstract }}
      ),
      question = iconv(question_raw, from = "UTF-8", to = "ASCII//TRANSLIT", sub = " "), # Transliterate to ASCII, sub unconvertible with space
      question = stringr::str_replace_all(question, "\\s+", " "), # Normalize all whitespace to a single space
      question = trimws(question) # Trim leading/trailing whitespace
    ) |>
    dplyr::select(-question_raw) |>
    dplyr::slice(rep(seq_len(dplyr::n()), each = length(top_p))) |>
    dplyr::mutate(
      topp = rep(top_p, studyid_length*prompt_length*model_length)
    ) |>
    dplyr::arrange(promptid, model, topp, iterations, studyid)

  #.......................................
  # Startup messages
  #.......................................
  if (messages){

    if (decision_description){
      message(
        paste0(
          "* Be aware that getting detailed responses ",
          "will substantially increase the time of the screening."
        )
      )
    }

    if ("No information" %in% unique(question_dat$abstract)) {
      message(
        paste0(
          "* Consider removing references that have no abstract ",
          "since these can distort the accuracy of the screening."
        )
      )
    }
  }

  #.......................................
  # RUNNING QUESTIONS
  #.......................................
  furrr_seed <- if (is.null(seed_par)) TRUE else NULL

  # Detailed system message that models must follow in order to ensure proper function calling
  forced_fn <- NULL

  if (is.character(tool_choice) && length(tool_choice) == 1 && !tool_choice %in% c("auto", "required")) {
    forced_fn <- tool_choice
  }

  if (is.list(tool_choice) && !is.null(tool_choice$type) && identical(tool_choice$type, "function") &&
      !is.null(tool_choice$`function`) && !is.null(tool_choice$`function`$name)) {
    forced_fn <- tool_choice$`function`$name
  }

  if (is.null(forced_fn) && is.list(tools) && length(tools) > 0 &&
      !is.null(tools[[1]][["function"]]) && !is.null(tools[[1]][["function"]][["name"]])) {
    forced_fn <- tools[[1]][["function"]][["name"]]
  }

  if (is.null(forced_fn)) {
    forced_fn <- if (decision_description) "inclusion_decision" else "inclusion_decision_simple"
  }

  tool_guard_msg <- paste0(
    "You are a function-calling agent. You must answer ONLY by calling the function '", forced_fn, "'. ",
    "Do NOT write any text, explanation, or reasoning outside the function call. ",
    "If you do not use the function call, your answer will be rejected. ",
    "Repeat: Only respond with a function call to '", forced_fn, "'."
  )

  params <- question_dat |>
    dplyr::select(question, model_gpt = model, topp, iterations)

  answer_dat_raw <-
    question_dat |>
    dplyr::mutate(
      res = furrr::future_pmap(
        .l = params,
        .f = .rep_ollama_engine,
        role_gpt = role,
        tool = tools,
        t_choice = tool_choice,
        system_guard_msg = tool_guard_msg,
        seeds = seed_par,
        time_inf = time_info,
        max_t = max_tries,
        max_s = max_seconds,
        back = backoff,
        aft = after,
        endpoint_url = api_url,
        ...,
        .options = furrr::furrr_options(seed = furrr_seed),
        .progress = progress
      )
    ) |>
    tidyr::unnest(res) |>
    dplyr::mutate(run_date = as.character(Sys.Date()))

  answer_dat <-
    answer_dat_raw |>
    tibble::new_tibble(class = c("gpt_tbl"))

  #.......................................
  # Catching errors
  #.......................................
  n_error <- answer_dat |> dplyr::filter(is.na(decision_binary)) |> nrow()

  if (messages){
    if (n_error == 1) message(paste("* NOTE: Requests failed for 1 title and abstract."))
    if (n_error > 1) message(paste("* NOTE: Requests failed", n_error, "times."))
  }

  if (n_error > 0) error_refs <- answer_dat |> dplyr::filter(is.na(decision_binary))

  #.......................................
  # Making aggregated data (for multiple reps)
  #.......................................
  if (any(reps > 1)) {
    answer_dat_sum <- .aggregate_res_ollama(answer_dat_raw, incl_cutoff_upper, incl_cutoff_lower)

    # Final data sum
    answer_dat_aggregated <-
      dplyr::left_join(question_dat, answer_dat_sum) |>
      suppressMessages() |>
      dplyr::select(-c(iterations)) |>
      dplyr::rename(top_p = topp) |>
      tibble::new_tibble(class = c("gpt_agg_tbl"))

    attr(answer_dat_aggregated, "incl_cutoff_upper") <- incl_cutoff_upper
    attr(answer_dat_aggregated, "incl_cutoff_lower") <- incl_cutoff_lower
  } else {
    answer_dat_aggregated <- NULL
  }

  #.......................................
  # Returned output
  #.......................................
  res <- list(
    answer_data = answer_dat,
    answer_data_aggregated = answer_dat_aggregated,
    error_data = if (n_error > 0) error_refs else NULL,
    run_date = Sys.Date()
  )

  # If no screening errors
  if (n_error == 0) res[["error_data"]] <- NULL

  # Remove aggregated data if not needed
  if (all(reps == 1)) res[["answer_data_aggregated"]] <- NULL

  # Attributing used arguments to res. Used for error handling
  attr(res, "arg_list") <- arg_list

  # Define class
  class(res) <- c("gpt", class(res))

  res
}