llm_mutate (R Documentation)
Adds one or more columns to .data that are produced by a large language model.
llm_mutate(
.data,
output,
prompt = NULL,
.messages = NULL,
.config,
.system_prompt = NULL,
.before = NULL,
.after = NULL,
.return = c("columns", "text", "object"),
.structured = FALSE,
.schema = NULL,
.fields = NULL,
...
)
.data
  A data.frame / tibble.

output
  Unquoted name that becomes the new column (generative) or the prefix for embedding columns.

prompt
  Optional glue template string for a single user turn; reference any columns in .data with {col}.

.messages
  Optional named character vector of glue templates to build a multi-turn message, using roles given in the names (e.g., system, user, file).

.config
  An llm_config object (generative or embedding).

.system_prompt
  Optional system message sent with every request (when prompt is used).

.before, .after
  Standard dplyr::relocate helpers controlling where the generated column(s) are placed.

.return
  One of "columns", "text", or "object"; defaults to "columns".

.structured
  Logical. If TRUE, the response is parsed as structured JSON and expanded into columns (equivalent to llm_mutate_structured()).

.schema
  Optional JSON Schema (R list). When supplied with .structured = TRUE, it is sent with the request to constrain the output format.

.fields
  Optional character vector of fields to extract from parsed JSON. Supports nested paths.

...
  Passed to the underlying calls: call_llm_broadcast() (generative) or get_batched_embeddings() (embedding).
Multi-column injection: templating is NA-safe (NA -> empty string).
Multi-turn templating: supply .messages = c(system=..., user=..., file=...).
Duplicate role names are allowed (e.g., two user turns).
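A minimal base-R sketch of the NA-to-empty-string rule (illustrative only; na_safe is a hypothetical helper, not LLMR's internal templating):

```r
# Hypothetical helper mirroring the NA-safe rule: NA becomes "" before
# the value is spliced into the prompt template.
na_safe <- function(x) ifelse(is.na(x), "", as.character(x))

df <- data.frame(question = c("Capital of France?", NA))
sprintf("%s (hint: none)", na_safe(df$question))
#> [1] "Capital of France? (hint: none)" " (hint: none)"
```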
Generative mode: one request per row via call_llm_broadcast(). Parallel
execution follows the active future plan; see setup_llm_parallel().
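For example, parallel execution can be switched on by setting a future plan before the call (a sketch only; it assumes the future package is installed and an API key is configured, and the exact arguments of setup_llm_parallel() may differ, so consult its help page):

```r
# Run one request per row across 4 background R sessions.
library(future)
plan(multisession, workers = 4)

df |> llm_mutate(answer, prompt = "{question}", .config = cfg)

plan(sequential)  # restore the default sequential plan
```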
Embedding mode: the per-row text is embedded via get_batched_embeddings().
Result expands to numeric columns named paste0(<output>, 1:N). If all rows
fail to embed, a single <output>1 column of NA is returned.
Diagnostic columns use suffixes: _finish, _sent, _rec, _tot, _reason, _ok, _err, _id, _status, _ecode, _param, _t.
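The naming scheme for the expanded embedding columns can be previewed directly (pure illustration of the paste0(<output>, 1:N) convention stated above):

```r
# For output name `vec` and a 3-dimensional embedding, the new numeric
# columns are named:
paste0("vec", 1:3)
#> [1] "vec1" "vec2" "vec3"
```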
.data with the new column(s) appended.
You can supply the output column and prompt in one argument:
df |> llm_mutate(answer = "{question} (hint: {hint})", .config = cfg)
df |> llm_mutate(answer = c(system = "One word.", user = "{question}"), .config = cfg)
This is equivalent to:
df |> llm_mutate(answer, prompt = "{question} (hint: {hint})", .config = cfg)
df |> llm_mutate(answer, .messages = c(system = "One word.", user = "{question}"), .config = cfg)
## Not run:
library(dplyr)
df <- tibble::tibble(
id = 1:2,
question = c("Capital of France?", "Author of 1984?"),
hint = c("European city", "English novelist")
)
cfg <- llm_config("openai", "gpt-4o-mini", temperature = 0)
# Generative: single-turn with multi-column injection
df |>
llm_mutate(
answer,
prompt = "{question} (hint: {hint})",
.config = cfg,
.system_prompt = "Respond in one word."
)
# Generative: multi-turn via .messages (system + user)
df |>
llm_mutate(
advice,
.messages = c(
system = "You are a helpful zoologist. Keep answers short.",
user = "What is a key fact about this? {question} (hint: {hint})"
),
.config = cfg
)
# Multimodal: include an image path with role 'file'
pics <- tibble::tibble(
img = c("inst/extdata/cat.png", "inst/extdata/dog.jpg"),
prompt = c("Describe the image.", "Describe the image.")
)
pics |>
llm_mutate(
vision_desc,
.messages = c(user = "{prompt}", file = "{img}"),
.config = llm_config("openai", "gpt-4.1-mini")
)
# Embeddings: output name becomes the prefix of embedding columns
emb_cfg <- llm_config("voyage", "voyage-3.5-lite", embedding = TRUE)
df |>
llm_mutate(
vec,
prompt = "{question}",
.config = emb_cfg,
.after = id
)
# Structured output: using .structured = TRUE (equivalent to llm_mutate_structured)
schema <- list(
type = "object",
properties = list(
answer = list(type = "string"),
confidence = list(type = "number")
),
required = list("answer", "confidence")
)
df |>
llm_mutate(
result,
prompt = "{question}",
.config = cfg,
.structured = TRUE,
.schema = schema
)
# Structured with shorthand
df |>
llm_mutate(
result = "{question}",
.config = cfg,
.structured = TRUE,
.schema = schema
)
## End(Not run)