borg_assimilate: Assimilate Leaky Evaluation Pipelines
In BORG: Bounded Outcome Risk Guard for Model Evaluation

Description

borg_assimilate() attempts to automatically fix detected evaluation risks by restructuring the pipeline to eliminate information leakage.

Usage

borg_assimilate(workflow, risks = NULL, fix = "all")

Arguments

`workflow`	A list containing the evaluation workflow (same structure as expected by `borg_validate()`).
`risks`	Optional `BorgRisk` object from a previous inspection. If NULL, `borg_validate()` is called first.
`fix`	Character vector specifying which risk types to attempt to fix. Default: `"all"` attempts all rewritable violations. Other options: `"preprocessing"`, `"feature_engineering"`, `"thresholds"`.
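
As a hedged illustration of the `fix` argument, the sketch below asks for only two of the documented risk types to be rewritten. It relies solely on the signature shown under Usage and assumes that the minimal three-element workflow from the Examples is accepted. The call is guarded with `requireNamespace()` so the snippet also runs where BORG is not installed.

```r
# Minimal workflow, mirroring the structure used in the Examples section
set.seed(1)
workflow <- list(
  data = data.frame(x = rnorm(100), y = rnorm(100)),
  train_idx = 1:70,
  test_idx = 71:100
)

# Guarded call: skipped entirely when the BORG package is not available
if (requireNamespace("BORG", quietly = TRUE)) {
  # Request rewriting for two of the documented risk types only;
  # risks = NULL means borg_validate() is called first, per the Arguments
  result <- BORG::borg_assimilate(
    workflow,
    risks = NULL,
    fix = c("preprocessing", "thresholds")
  )
  print(result$fixed)
}
```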

Details

borg_assimilate() can automatically fix certain types of leakage:

Preprocessing on full data: Refits preprocessing objects using only training indices
Feature engineering leaks: Recomputes target encodings, embeddings, and derived features using train-only data
Threshold optimization: Moves threshold selection to training/validation data
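
The idea behind the first item can be sketched in base R. This is an illustration of train-only preprocessing in general, not BORG's internal implementation; the column name `x_scaled` and the choice of standardization as the preprocessing step are assumptions made for the example.

```r
set.seed(42)
data <- data.frame(x = rnorm(100, mean = 5, sd = 2), y = rnorm(100))
train_idx <- 1:70
test_idx <- 71:100

# Leaky version: the statistics are computed on all rows,
# so test rows influence the transformation
leaky_mean <- mean(data$x)
leaky_sd <- sd(data$x)

# Train-only version: the statistics come from the training rows
# and are then applied unchanged to every row, including the test rows
train_mean <- mean(data$x[train_idx])
train_sd <- sd(data$x[train_idx])
data$x_scaled <- (data$x - train_mean) / train_sd

# Training rows are standardized (mean 0, sd 1 up to rounding);
# test rows generally are not, which is the intended behavior
mean(data$x_scaled[train_idx])
sd(data$x_scaled[train_idx])
```

The same pattern (estimate on training rows, apply everywhere) carries over to target encodings and threshold selection.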

Some violations cannot be automatically fixed:

Train-test index overlap (requires new split)
Target leakage in original features (requires domain intervention)
Temporal look-ahead in features (requires feature re-engineering)
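
Of these, train-test index overlap is the easiest to check and repair by hand. The following base R sketch is illustrative only: the overlapping indices and the random reshuffle are assumptions for the example, not the check that BORG performs.

```r
# Deliberately overlapping split: rows 61 to 70 appear in both sets
train_idx <- 1:70
test_idx <- 61:100

overlap <- intersect(train_idx, test_idx)
length(overlap)  # 10 shared rows

# One way to obtain a new, disjoint split: shuffle the row numbers once
# and cut the shuffled vector into two non-overlapping pieces
set.seed(1)
n <- 100
shuffled <- sample(n)
new_train_idx <- shuffled[1:70]
new_test_idx <- shuffled[71:100]

length(intersect(new_train_idx, new_test_idx))  # 0 shared rows
```

A random reshuffle is only appropriate when rows are exchangeable; temporally ordered or grouped data would need a split that respects that structure.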

Value

A list containing:

workflow: The rewritten workflow (modified in place where possible)
fixed: Character vector of risk types that were successfully fixed
unfixable: Character vector of risk types that could not be fixed
report: BorgRisk object from post-rewrite validation

Examples

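
# Build a minimal workflow: a data frame plus disjoint train/test row indices
set.seed(42)  # make the simulated data reproducible
workflow <- list(
  data = data.frame(x = rnorm(100), y = rnorm(100)),
  train_idx = 1:70,
  test_idx = 71:100
)

# Attempt to fix all rewritable risks; with risks = NULL (the default),
# borg_validate() is called first
result <- borg_assimilate(workflow)

# Report any risk types that could not be rewritten automatically
if (length(result$unfixable) > 0) {
  message("Some risks require manual intervention:")
  print(result$unfixable)
}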

BORG documentation built on March 20, 2026, 5:09 p.m.