generate | R Documentation |
generate() is the simplest possible solver one might use with vitals; it
just passes its inputs to the supplied model and returns its raw responses.
The inputs are evaluated in parallel, not in the sense of multiple R
sessions, but in the sense of multiple, asynchronous HTTP requests using
ellmer::parallel_chat(). generate()'s output can be passed directly to the
solver argument of Task's $new() method.
generate(solver_chat = NULL)
solver_chat |
An ellmer chat object, such as from |
The output of generate() is another function. That function takes in a
vector of inputs, as well as a solver chat argument named solver_chat,
which defaults to the chat supplied to generate() itself.
See the documentation for the solver argument in Task for more information
on the return type.
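The "function that returns a function" shape described above can be sketched in base R. This is only an illustration of the closure pattern, not the vitals implementation: toy_chat() and toy_generate() are hypothetical names, the "chat" here is a plain function rather than an ellmer chat object, and the inputs are processed in a simple loop rather than as parallel HTTP requests.

```r
# Hypothetical stand-in for a chat object: a plain function that "answers"
# a prompt. A real solver would hold an ellmer chat object instead.
toy_chat <- function(prompt) paste("echo:", prompt)

# A function factory with the same outward shape as generate(): it captures
# a default chat and returns a function of `inputs` and `solver_chat`.
toy_generate <- function(solver_chat = NULL) {
  default_chat <- solver_chat
  function(inputs, solver_chat = default_chat) {
    if (is.null(solver_chat)) {
      stop("`solver_chat` must be supplied.")
    }
    # Apply the chat to each input in turn (assumption: sequential here,
    # whereas the real solver issues asynchronous requests).
    vapply(inputs, solver_chat, character(1), USE.NAMES = FALSE)
  }
}

# The factory is called once; the function it returns is what gets reused.
solver <- toy_generate(toy_chat)
responses <- solver(c("What's 2+2?", "What's 2+3?"))

# The default chat can still be overridden per call.
shouted <- solver("hi", solver_chat = toupper)
```

Capturing the chat as a default argument means the returned function can be called with inputs alone, while a caller can still swap in a different chat for a single call.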
if (!identical(Sys.getenv("ANTHROPIC_API_KEY"), "")) {
  # set the log directory to a temporary directory
  withr::local_envvar(VITALS_LOG_DIR = withr::local_tempdir())

  library(vitals)
  library(ellmer)
  library(tibble)

  simple_addition <- tibble(
    input = c("What's 2+2?", "What's 2+3?"),
    target = c("4", "5")
  )

  # create a new Task
  tsk <- Task$new(
    dataset = simple_addition,
    solver = generate(chat_anthropic(model = "claude-3-7-sonnet-latest")),
    scorer = model_graded_qa()
  )

  # evaluate the task (run the solver and scorer) and open
  # the results in the Inspect log viewer (if interactive)
  tsk$eval()

  # $eval() is shorthand for:
  tsk$solve()
  tsk$score()
  tsk$measure()
  tsk$log()
  tsk$view()

  # get the evaluation results as a data frame
  tsk$get_samples()

  # view the task directory with $view() or vitals_view()
  vitals_view()
}