expect_llm_pass: Expect LLM Pass

View source: R/agent_evals.R

expect_llm_pass R Documentation

Expect LLM Pass

Description

A custom testthat expectation that checks whether an LLM response meets the specified criteria. An LLM judge scores the response, and the expectation passes when that score is at least threshold.

Usage

expect_llm_pass(response, criteria, model = NULL, threshold = 0.7, info = NULL)

Arguments

response

The LLM response to evaluate (text or GenerateResult object).

criteria

Character string describing what constitutes a passing response.

model

Model to use for judging. The default is NULL, which uses the same model as the response, or gpt-4o.

threshold

Minimum judge score, between 0 and 1, required to pass (default: 0.7).

info

Additional information to include in failure message.

Value

Invisibly returns the evaluation result.
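
The interaction between threshold, info, and the invisible return value can be illustrated with a minimal sketch. This is not the aisdk implementation: expect_score_pass and stub_judge are hypothetical names, the judge is a local stub rather than a model call, and the shape of the returned list is an assumption made only for this illustration.

```r
library(testthat)

# Hypothetical stand-in for an LLM judge: returns a score in [0, 1].
# A real judge would call a model; this stub only looks for the digit 4.
stub_judge <- function(response, criteria) {
  if (grepl("4", response, fixed = TRUE)) 0.9 else 0.1
}

# Sketch of a threshold-based expectation (an assumption, not aisdk's code).
expect_score_pass <- function(response, criteria, threshold = 0.7,
                              info = NULL, judge = stub_judge) {
  score <- judge(response, criteria)
  passed <- score >= threshold
  # testthat::expect() signals a failure when its first argument is FALSE;
  # info is appended to the failure message.
  expect(
    passed,
    failure_message = sprintf(
      "Judge score %.2f is below threshold %.2f", score, threshold
    ),
    info = info
  )
  # Return the evaluation invisibly so it can be captured but is not printed.
  invisible(list(score = score, passed = passed))
}

# Capture the invisible result to inspect it.
res <- expect_score_pass("2 + 2 = 4", "The response should contain the number 4")
str(res)
```

Returning the result invisibly keeps test output quiet while still allowing a test to assign the value and inspect it.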

Examples


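# Guarded with interactive() because the example calls a live model.
if (interactive()) {
  library(testthat)
  library(aisdk)

  test_that("agent answers math questions correctly", {
    # Generate the response under test.
    result <- generate_text(
      model = "openai:gpt-4o",
      prompt = "What is 2 + 2?"
    )
    # The judge scores the response against the criteria (default threshold 0.7).
    expect_llm_pass(result, "The response should contain the number 4")
  })
}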
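
A second example, offered as a sketch, passes threshold and info explicitly and captures the invisibly returned value. It uses only the arguments listed under Usage; the prompt, the criteria text, and the variable names are illustrative, and the structure of the returned object is not documented here, so it is only inspected with str().

```r
# Illustrative values; stricter than the documented default threshold of 0.7.
criteria <- "The response should name Paris as the capital of France"
strict_threshold <- 0.9

# Guarded with interactive() because the example calls a live model.
if (interactive()) {
  library(testthat)
  library(aisdk)

  test_that("agent names the capital of France", {
    result <- generate_text(
      model = "openai:gpt-4o",
      prompt = "What is the capital of France?"
    )
    # info is included in the failure message if the score is too low.
    evaluation <- expect_llm_pass(
      result,
      criteria,
      threshold = strict_threshold,
      info = "Prompt: What is the capital of France?"
    )
    # Inspect whatever the evaluation result contains.
    str(evaluation)
  })
}
```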


aisdk documentation built on May 29, 2026, 9:07 a.m.