| answer_as_dataframe | R Documentation |
This function builds on answer_as_json() to extract a data frame from an
LLM response using structured output. The supplied schema should describe a
single row of the desired data frame, or an array of such rows. Internally,
answer_as_dataframe() standardizes the schema to a JSON object with a
rows field containing an array of row objects. This shape works well with
both text-based JSON extraction and native structured-output backends,
including 'ellmer', where arrays of objects are converted to data frames.
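To make the wrapped shape concrete, the sketch below parses a hand-written response of the form {"rows": [...]} into a data frame. It assumes the 'jsonlite' package is installed, and the response string (raw_response) is invented for illustration. It shows the shape only and is not necessarily how answer_as_dataframe() parses responses internally.

```r
library(jsonlite)

# Hypothetical LLM response in the wrapped shape: an object with a `rows` array
raw_response <- '{"rows": [
  {"name": "Alice", "age": 32, "city": "Berlin"},
  {"name": "Bob", "age": 28, "city": "Utrecht"}
]}'

# fromJSON() simplifies an array of flat objects into a data frame by default
parsed <- fromJSON(raw_response)
people <- parsed$rows

# Inspect the resulting columns and their types
str(people)
```

Because every row object carries the same fields, each field becomes one column, which is why a row-oriented schema is a convenient target for data frame extraction.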
answer_as_dataframe(
  prompt,
  schema,
  min_rows = NULL,
  max_rows = NULL,
  schema_strict = FALSE,
  schema_in_prompt_as = c("example", "schema"),
  type = c("auto", "text-based", "openai", "ollama", "openai_oo", "ollama_oo", "ellmer")
)
prompt |
A single string or a |
schema |
A JSON schema list or an 'ellmer' type definition describing a
single row, an array of rows, or a wrapper object containing a |
min_rows |
(optional) Minimum number of rows required in the returned data frame |
max_rows |
(optional) Maximum number of rows allowed in the returned data frame |
schema_strict |
If TRUE, the wrapped schema will be strictly enforced.
Passed through to |
schema_in_prompt_as |
Passed through to |
type |
Passed through to |
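The row bounds can be pictured as a simple check on the extracted data frame. The sketch below is a hypothetical base R helper (check_row_count() is not part of 'tidyprompt'), and it assumes that a violation is reported as a feedback message; how answer_as_dataframe() actually reacts to a violated bound is not described here.

```r
# Hypothetical helper (not part of 'tidyprompt'): returns NULL when the row
# count is acceptable, otherwise a message describing the violated bound.
check_row_count <- function(df, min_rows = NULL, max_rows = NULL) {
  n <- nrow(df)
  if (!is.null(min_rows) && n < min_rows) {
    return(sprintf("Expected at least %d row(s), but got %d.", min_rows, n))
  }
  if (!is.null(max_rows) && n > max_rows) {
    return(sprintf("Expected at most %d row(s), but got %d.", max_rows, n))
  }
  NULL
}

people <- data.frame(name = c("Alice", "Bob"), age = c(32L, 28L))

# Within bounds: no message
within_bounds <- check_row_count(people, min_rows = 1, max_rows = 5)

# Too few rows: a message naming the violated bound
too_few <- check_row_count(people, min_rows = 3)
```

Leaving both bounds as NULL, the defaults in the signature above, skips the check entirely in this sketch.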
Prefer supplying an 'ellmer' row schema created with
ellmer::type_object(...) when possible. This is usually the clearest way to
describe the columns you want, and it maps cleanly to native 'ellmer'
structured output. These 'ellmer' schema definitions can also be used with
non-'ellmer' LLM providers, because 'tidyprompt' converts between 'ellmer'
schema definitions and JSON-schema representations as needed.
answer_as_dataframe() accepts the following schema shapes:
A single row schema, such as ellmer::type_object(...) or a JSON schema
object whose properties describe the columns of one row.
An array-of-rows schema, such as ellmer::type_array(row_schema) or a JSON
schema with type = "array" and row objects under items.
A wrapper object whose only property is a rows field containing an
array of row objects (matching the shape produced internally by
answer_as_dataframe()). Schemas with additional sibling properties
alongside rows are treated as row schemas, not wrappers.
Regardless of which of these forms you supply, answer_as_dataframe()
normalizes it to a row-oriented structured-output schema before delegating to
answer_as_json().
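The normalization described above can be sketched for plain JSON-schema lists as follows. normalize_to_rows_schema() is a hypothetical helper written for illustration, not the 'tidyprompt' implementation, and it ignores 'ellmer' type definitions. It follows the stated rules: a wrapper is recognized only when rows is its sole property, an array schema contributes its items as the row schema, and anything else is treated as a row schema.

```r
# Hypothetical helper (not the 'tidyprompt' implementation): wraps a JSON
# schema list into an object with a single `rows` array property.
normalize_to_rows_schema <- function(schema) {
  props <- names(schema$properties)
  is_wrapper <- identical(schema$type, "object") &&
    identical(props, "rows") &&
    identical(schema$properties$rows$type, "array")
  if (is_wrapper) {
    return(schema)  # already in the target shape
  }
  # An array schema carries the row schema under `items`; otherwise the
  # schema itself is taken to describe one row.
  row_schema <- if (identical(schema$type, "array")) schema$items else schema
  list(
    type = "object",
    properties = list(rows = list(type = "array", items = row_schema)),
    required = list("rows"),
    additionalProperties = FALSE
  )
}

row_schema <- list(
  type = "object",
  properties = list(name = list(type = "string"), age = list(type = "integer")),
  required = list("name", "age")
)
array_schema <- list(type = "array", items = row_schema)

wrapped_from_row <- normalize_to_rows_schema(row_schema)
wrapped_from_array <- normalize_to_rows_schema(array_schema)

# Applying the helper to an already wrapped schema leaves it unchanged
wrapped_again <- normalize_to_rows_schema(wrapped_from_row)
```

In this sketch a schema with a rows property plus sibling properties fails the wrapper test and is wrapped as a row schema, mirroring the rule stated above.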
A tidyprompt() with an added prompt_wrap() that ensures the LLM
response is returned as a data frame.
Other pre_built_prompt_wraps:
add_image(),
add_text(),
answer_as_boolean(),
answer_as_category(),
answer_as_integer(),
answer_as_json(),
answer_as_list(),
answer_as_multi_category(),
answer_as_named_list(),
answer_as_numeric(),
answer_as_regex_match(),
answer_as_text(),
answer_by_chain_of_thought(),
answer_by_react(),
answer_using_r(),
answer_using_sql(),
answer_using_tools(),
prompt_wrap(),
quit_if(),
set_system_prompt()
Other answer_as_prompt_wraps:
answer_as_boolean(),
answer_as_category(),
answer_as_integer(),
answer_as_json(),
answer_as_list(),
answer_as_multi_category(),
answer_as_named_list(),
answer_as_numeric(),
answer_as_regex_match(),
answer_as_text()
# `answer_as_dataframe()` accepts multiple schema shapes.
# Prefer an ellmer row schema when possible, because it is concise and maps
# cleanly to native ellmer structured output.
# These ellmer schema definitions also work with non-ellmer LLM providers,
# because tidyprompt converts between ellmer schemas and JSON schemas for you.
if (requireNamespace("ellmer", quietly = TRUE)) {
  person_row_schema_ellmer <- ellmer::type_object(
    name = ellmer::type_string(),
    age = ellmer::type_integer(),
    city = ellmer::type_string()
  )
  # Also accepted: an array of row objects.
  person_array_schema_ellmer <- ellmer::type_array(person_row_schema_ellmer)
}
# Also accepted: a JSON schema describing one row.
person_row_schema_json <- list(
  type = "object",
  properties = list(
    name = list(type = "string"),
    age = list(type = "integer"),
    city = list(type = "string")
  ),
  required = c("name", "age", "city"),
  additionalProperties = FALSE
)
# Also accepted: a wrapper object with a `rows` array.
person_wrapper_schema_json <- list(
  type = "object",
  properties = list(
    rows = list(
      type = "array",
      items = person_row_schema_json
    )
  ),
  required = "rows",
  additionalProperties = FALSE
)
## Not run:
prompt <- paste(
  "Extract the people in the following notes as a table:",
  "Alice (32, Berlin), Bob (28, Utrecht)."
)
# Preferred: ellmer row schema.
# This works both with ellmer-backed providers and with regular tidyprompt
# providers, because tidyprompt converts the schema when needed.
if (requireNamespace("ellmer", quietly = TRUE)) {
  prompt |>
    answer_as_dataframe(person_row_schema_ellmer) |>
    send_prompt()
  #    name age    city
  # 1 Alice  32  Berlin
  # 2   Bob  28 Utrecht
  # Also works: ellmer array-of-rows schema.
  prompt |>
    answer_as_dataframe(person_array_schema_ellmer) |>
    send_prompt()
}
# Also works: JSON schema for one row.
prompt |>
  answer_as_dataframe(person_row_schema_json, type = "text-based") |>
  send_prompt()
# Also works: JSON wrapper schema with a `rows` array.
prompt |>
  answer_as_dataframe(person_wrapper_schema_json, type = "text-based") |>
  send_prompt()
## End(Not run)