BORG-package: BORG: Bounded Outcome Risk Guard for Model Evaluation

BORG-package    R Documentation

BORG: Bounded Outcome Risk Guard for Model Evaluation

Description

Automatically detects and enforces valid model evaluation by identifying information reuse between training and evaluation data. Guards against data leakage, look-ahead bias, and invalid cross-validation schemes that inflate performance estimates. Supports temporal, spatial, and grouped evaluation structures. Based on evaluation principles described in Roberts et al. (2017) doi:10.1111/ecog.02881, Kaufman et al. (2012) doi:10.1145/2382577.2382579, and Kapoor & Narayanan (2023) doi:10.1016/j.patter.2023.100804.

Specifically, BORG guards against:

  • Data leakage through preprocessing (normalization, imputation, PCA)

  • Look-ahead bias in temporal evaluation

  • Spatial autocorrelation that violates independence between folds in block CV

  • Target leakage through features derived from outcomes

  • Train-test contamination through shared identifiers
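
The first item in the list above, leakage through preprocessing, can be illustrated with a minimal base R sketch. It does not call any BORG function; the data, the 70/30 split, and all variable names are assumptions made for illustration.

```r
# Base R illustration only; no BORG functions are used here.
set.seed(42)
x <- rnorm(100, mean = 10, sd = 3)   # hypothetical numeric feature
train_idx <- 1:70                    # assumed training rows
test_idx  <- 71:100                  # assumed test rows

# Leaky: statistics computed on all rows, so the test rows
# influence how the training data is transformed.
leaky_mean <- mean(x)
leaky_sd   <- sd(x)

# Valid: statistics estimated on the training rows only,
# then applied unchanged to the test rows.
train_mean <- mean(x[train_idx])
train_sd   <- sd(x[train_idx])
x_train_scaled <- (x[train_idx] - train_mean) / train_sd
x_test_scaled  <- (x[test_idx]  - train_mean) / train_sd
```

The same pattern applies to imputation and PCA: estimate the transformation on the training rows, then apply it to the held-out rows.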

Value

No return value. This is a package-level documentation page.

Main Functions

borg

Primary interface for guarding evaluation pipelines

borg_diagnose

Diagnose data dependency structure

borg_cv

Generate valid CV schemes based on diagnosis

borg_inspect

Inspect R objects for leakage signals

borg_validate

Validate a complete evaluation workflow

borg_assimilate

Assimilate leaky pipelines into compliance

Risk Classification

BORG classifies evaluation risks as:

hard_violation

The evaluation is fundamentally invalid and must be blocked. Examples: preprocessing on full data, train-test ID overlap, target leakage.

soft_inflation

Results are biased but bounded. Performance estimates are misleading, but model ranking may be preserved. Examples: insufficient spatial block size, post-hoc subgroup analysis.
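
Two of the conditions named on this page, train-test ID overlap and look-ahead in a temporal split, can be checked by hand in base R. The sketch below is not BORG's own detection logic; the data frame, its `id` and `date` columns, and the split indices are hypothetical.

```r
# Base R illustration only; column names and indices are assumptions.
dat <- data.frame(
  id   = c("a", "a", "b", "c", "c", "d"),
  date = as.Date("2024-01-01") + c(0, 10, 20, 30, 40, 50)
)
train_idx <- 1:4
test_idx  <- 5:6

# Identifiers that occur on both sides of the split.
shared_ids     <- intersect(dat$id[train_idx], dat$id[test_idx])
has_id_overlap <- length(shared_ids) > 0

# Look-ahead: a training observation dated at or after
# the earliest test observation.
has_look_ahead <- max(dat$date[train_idx]) >= min(dat$date[test_idx])
```

Here the split is ordered correctly in time, but subject "c" appears in both sets, which corresponds to the train-test ID overlap example given for hard_violation.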

Supported Frameworks

BORG integrates with:

  • caret: trainControl, train, preProcess

  • rsample: vfold_cv, initial_split, rolling_origin

  • recipes: recipe, prep, bake

  • mlr3: Task, Learner, Resampling

  • Base R: manual index-based splitting
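
For the base R case, a manual index-based split that keeps groups intact might look like the sketch below. It assigns whole groups, not individual rows, to folds. The grouping vector, the fold count, and the `folds` structure are assumptions for illustration and are not a format that this page says BORG requires.

```r
# Base R illustration only; no BORG functions are used here.
set.seed(1)
groups <- rep(paste0("site", 1:10), each = 5)   # 50 rows in 10 hypothetical groups
unique_groups <- unique(groups)
k <- 5

# Assign each group (not each row) to one of k folds.
fold_of_group <- sample(rep(seq_len(k), length.out = length(unique_groups)))
names(fold_of_group) <- unique_groups
row_fold <- unname(fold_of_group[groups])

# Build train/test row indices for every fold.
folds <- lapply(seq_len(k), function(i) {
  list(train = which(row_fold != i), test = which(row_fold == i))
})
```

Because fold membership is decided per group, no group contributes rows to both the training and the test side of any fold.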

Options

BORG respects the following options:

borg.auto_check

If TRUE, automatically validate splits when using supported frameworks. Default: FALSE.

borg.strict

If TRUE, throw errors on hard violations. If FALSE, issue warnings instead. Default: TRUE.

borg.verbose

If TRUE, print diagnostic messages. Default: FALSE.
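
These options can be set and read with base R's `options()` and `getOption()`. The sketch below uses only the option names and defaults listed above. This page does not say when BORG reads them, so treat the fallback values as an assumption that mirrors the documented defaults.

```r
# Set two of the documented options, keeping the previous values.
old <- options(borg.strict = FALSE, borg.verbose = TRUE)

# Read options, falling back to the documented default when unset.
strict <- getOption("borg.strict", default = TRUE)
auto   <- getOption("borg.auto_check", default = FALSE)

# Restore the previous settings.
options(old)
```

Saving the return value of `options()` and restoring it afterwards keeps the change local to one script or session block.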

Author(s)

Maintainer: Gilles Colling <gilles.colling051@gmail.com> [copyright holder]

BORG documentation built on March 20, 2026, 5:09 p.m.