| delta_lsi | R Documentation |
Compares a naive (potentially leaky) cross-validation pipeline against a guarded (leakage-protected) pipeline and quantifies leakage-induced performance inflation using the Leakage Sensitivity Index (LSI).
delta_lsi(
  fit_leaky,
  fit_guarded,
  metric = "auc",
  exchangeability = c("iid", "by_group", "within_batch", "blocked_time"),
  learner = NULL,
  higher_is_better = NULL,
  block_size = NULL,
  M_boot = 2000L,
  M_flip = 10000L,
  strict = FALSE,
  return_details = FALSE,
  seed = 42L,
  ...
)
fit_leaky |
A |
fit_guarded |
A |
metric |
Character. Performance metric to compare. Must appear in
|
exchangeability |
Character. Exchangeability assumption for the
sign-flip test. One of |
learner |
Optional character. Learner name to select from multi-learner
fits. If |
higher_is_better |
Logical or |
block_size |
Integer or |
M_boot |
Integer. Number of bootstrap samples for BCa CI (default 2000). |
M_flip |
Integer. Maximum Monte Carlo samples for sign-flip test when R_eff > 15 (default 10000). |
strict |
Logical. If |
return_details |
Logical. If |
seed |
Integer. Random seed for bootstrap and sign-flip test. |
... |
Unused. Reserved for deprecated aliases such as
|
For each fit, per-fold metric values are extracted from fit@metrics
(or recomputed from fit@predictions if necessary). Fold test-set
sizes are used as weights to aggregate fold metrics into per-repeat
estimates \mu_r. The repeat-level delta
\Delta_r = s \cdot (\mu_r^{\text{naive}} - \mu_r^{\text{guarded}})
captures leakage-induced performance inflation for each CV repeat, where
s = +1 for higher-is-better metrics (e.g., AUC) and s = -1
for lower-is-better metrics (e.g., RMSE), so that \Delta_r > 0
always indicates the naive pipeline is more optimistic than the guarded one.
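The aggregation described above can be sketched in base R. This is an illustration only, not the package's internal code: the data frame folds, its column names, and the helpers repeat_mean() and repeat_delta() are hypothetical, and the metric values are made up.

```r
# Hypothetical per-fold results for two repeats of 3-fold CV (made-up numbers).
folds <- data.frame(
  rep_id  = c(1, 1, 1, 2, 2, 2),
  n_test  = c(30, 30, 40, 35, 35, 30),            # fold test-set sizes
  naive   = c(0.90, 0.88, 0.92, 0.91, 0.87, 0.89),
  guarded = c(0.80, 0.79, 0.83, 0.82, 0.78, 0.81)
)

# Weighted mean of fold metrics within each repeat (weights = test-set sizes).
repeat_mean <- function(values, weights, rep_id) {
  vapply(
    split(seq_along(values), rep_id),
    function(i) weighted.mean(values[i], weights[i]),
    numeric(1)
  )
}

# Delta_r = s * (mu_r^naive - mu_r^guarded), with s = +1 or -1.
repeat_delta <- function(folds, higher_is_better = TRUE) {
  s <- if (higher_is_better) 1 else -1
  mu_naive   <- repeat_mean(folds$naive,   folds$n_test, folds$rep_id)
  mu_guarded <- repeat_mean(folds$guarded, folds$n_test, folds$rep_id)
  s * (mu_naive - mu_guarded)
}

delta_r <- repeat_delta(folds)   # one value per repeat
```

With higher_is_better = FALSE the same inputs give deltas of the opposite sign, which is how a lower-is-better metric such as RMSE is mapped onto the "positive means optimistic" convention.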
The delta_lsi point estimate is the Huber M-estimator (k = 1.345)
applied to \{\Delta_r\}, which is robust to occasional outlier
repeats. delta_metric is the arithmetic mean of \{\Delta_r\}
for easy interpretation in the original metric's units.
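As a rough illustration of how a Huber location estimate differs from the arithmetic mean, the sketch below uses iteratively reweighted averaging with k = 1.345. The function huber_location() is hypothetical, and the choice of a fixed MAD scale, the starting value, and the convergence rule are assumptions; the package may handle these differently.

```r
# Huber M-estimate of location via iteratively reweighted averaging.
# Assumptions (not taken from the package): scale is fixed at the MAD,
# the iteration starts at the median, and it stops on a small relative change.
huber_location <- function(x, k = 1.345, tol = 1e-8, max_iter = 100L) {
  mu <- median(x)
  scale <- mad(x)
  if (scale == 0) return(mu)           # degenerate case: no spread around the median
  for (iter in seq_len(max_iter)) {
    r <- abs(x - mu) / scale           # standardized absolute residuals
    w <- ifelse(r <= k, 1, k / r)      # Huber weights: 1 inside [-k, k], shrinking outside
    mu_new <- sum(w * x) / sum(w)
    converged <- abs(mu_new - mu) < tol * scale
    mu <- mu_new
    if (converged) break
  }
  mu
}

# Made-up repeat-level deltas with one outlier repeat.
delta_r <- c(0.05, 0.06, 0.04, 0.05, 0.07, 0.06, 0.05, 0.30)
robust_est <- huber_location(delta_r)  # stays near the bulk of the values
plain_mean <- mean(delta_r)            # pulled upward by the outlier
```

In this toy example the outlier repeat receives a small weight, so the robust estimate remains close to the typical delta while the mean does not.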
Pairing requires that fit_leaky and fit_guarded share
identical fold structures (same test-set membership per fold) in
addition to the same number of repeats. When repeat counts match but fold
structures differ, a warning is issued and the fits are treated as unpaired.
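The pairing requirement can be pictured with a small check. The representation used here is hypothetical: each resampling plan is a list of repeats, each repeat is a list of integer vectors of test-set row indices, and folds are assumed to appear in the same order in both plans. How the fit objects actually store this information is not specified in this page.

```r
# Returns TRUE when two resampling plans have the same number of repeats,
# the same number of folds per repeat, and identical test-set membership
# in every fold (order of indices within a fold is ignored).
folds_match <- function(plan_a, plan_b) {
  if (length(plan_a) != length(plan_b)) return(FALSE)   # repeat counts differ
  for (r in seq_along(plan_a)) {
    if (length(plan_a[[r]]) != length(plan_b[[r]])) return(FALSE)
    for (f in seq_along(plan_a[[r]])) {
      a <- sort(as.integer(plan_a[[r]][[f]]))
      b <- sort(as.integer(plan_b[[r]][[f]]))
      if (!identical(a, b)) return(FALSE)
    }
  }
  TRUE
}

plan_a <- list(list(1:3, 4:6))                        # one repeat, two folds
plan_b <- list(list(c(3L, 1L, 2L), 4:6))              # same membership, shuffled
plan_c <- list(list(c(1L, 2L, 4L), c(3L, 5L, 6L)))    # different membership

folds_match(plan_a, plan_b)
folds_match(plan_a, plan_c)
```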
When R_{\text{eff}} \geq 5 (equal, paired repeats), a sign-flip
randomization test (Phipson & Smyth, 2010) is performed: under
H_0 (no leakage) the sign of each \Delta_r is exchangeable.
All 2^R sign combinations are enumerated exactly for
R \leq 15 (no continuity correction); Monte Carlo sampling is used
for larger R with the Phipson & Smyth (2010) correction.
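A minimal sketch of the exact and Monte Carlo branches follows. The function sign_flip_test() is hypothetical, and two choices in it are assumptions rather than documented behaviour: the test statistic is the mean of the deltas, and the alternative is one-sided (naive more optimistic than guarded). It also ignores the exchangeability and block_size arguments, so it corresponds only to independent sign flips per repeat.

```r
# Sign-flip randomization test on repeat-level deltas.
# Assumptions: statistic = mean(delta), one-sided alternative (mean > 0),
# and each repeat's sign is flipped independently.
sign_flip_test <- function(delta, m_flip = 10000L, max_exact = 15L, seed = 42L) {
  R <- length(delta)
  observed <- mean(delta)
  tol <- sqrt(.Machine$double.eps) * max(1, abs(observed))  # guard against rounding

  if (R <= max_exact) {
    # Exact branch: enumerate all 2^R sign vectors, no correction.
    signs <- as.matrix(expand.grid(rep(list(c(-1, 1)), R)))
    null_stats <- as.vector(signs %*% delta) / R
    p <- mean(null_stats >= observed - tol)
    method <- "exact"
  } else {
    # Monte Carlo branch: (b + 1) / (m + 1) as in Phipson & Smyth (2010).
    set.seed(seed)  # note: this resets the global RNG state
    signs <- matrix(sample(c(-1, 1), m_flip * R, replace = TRUE), nrow = m_flip)
    null_stats <- as.vector(signs %*% delta) / R
    b <- sum(null_stats >= observed - tol)
    p <- (b + 1) / (m_flip + 1)
    method <- "monte_carlo"
  }
  list(p_value = p, method = method)
}

res_exact <- sign_flip_test(rep(0.1, 5))    # 2^5 = 32 sign vectors enumerated
res_mc    <- sign_flip_test(rep(0.1, 20))   # 20 > 15, so signs are sampled
```

With five repeats the smallest attainable one-sided p-value under these assumptions is 1/32, which gives some intuition for why small repeat counts limit what the test can show.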
BCa bootstrap confidence intervals (Efron, 1987) require
R_{\text{eff}} \geq 10.
"A_full_inference" (R_eff >= 20): point + BCa CI + sign-flip p-value; inference_ok = TRUE
"B_signflip_ci" (10 <= R_eff < 20): point + sign-flip p-value + BCa CI
"C_signflip" (5 <= R_eff < 10): point + sign-flip p-value (no CI)
"D_insufficient" (R_eff < 5 or unpaired): point estimate only
A LeakDeltaLSI object.
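The inference tiers listed above depend only on the effective number of paired repeats and on whether pairing succeeded. The helper select_tier() below is a hypothetical restatement of those thresholds, not a function exported by the package, and the name of the slot that stores the tier in the returned object is not given in this page.

```r
# Map the effective number of paired repeats to the documented tier label.
# select_tier() is illustrative only; thresholds are copied from the tier list.
select_tier <- function(r_eff, paired = TRUE) {
  if (!paired || r_eff < 5) return("D_insufficient")   # point estimate only
  if (r_eff < 10) return("C_signflip")                 # adds sign-flip p-value
  if (r_eff < 20) return("B_signflip_ci")              # adds BCa CI
  "A_full_inference"                                   # full inference
}

tiers <- vapply(c(3, 5, 10, 20), select_tier, character(1))
# tiers: "D_insufficient" "C_signflip" "B_signflip_ci" "A_full_inference"
```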
audit_leakage, fit_resample