- A pre-trained mlr3 learner is now supported in PerturbationImportance (PFI, CFI, RFI) and SAGE methods.
  - The Resampling must be instantiated and consist of a single iteration, i.e. there must be only 1 test set.
  - The rsmp_all_test(task) utility can be used to construct a single-iteration Resampling object from a given Task where all observations are assigned to the test set and the train set is empty. We will likely refine the API around this in the future.
  - A ResampleResult is constructed from the given learner, task, and resampling arguments, which is consistent with the previous default of calling resample() to get trained learners for each resampling iteration.
- ci_method = "lei" for WVIM/LOCO: distribution-free inference based on Lei et al. (2018), testing observation-wise loss differences. Defaults to Wilcoxon signed-rank test with median aggregation. Supports t-test, Fisher permutation, and binomial (sign) tests. Requires a decomposable measure (with $obs_loss()).
- p_adjust parameter in $importance() for multiplicity correction across all ci_methods that produce p-values ("raw", "nadeau_bengio", "cpi", "lei"). Accepts any method from stats::p.adjust.methods (e.g. "holm", "bonferroni", "BH"). Default is "none". With "bonferroni", confidence intervals are also adjusted (alpha/k). For other methods, only p-values are adjusted because sequential/adaptive procedures lack a clean per-comparison alpha for CI construction.
- The "raw" and "nadeau_bengio" ci_methods return se, statistic, p.value, conf_lower, and conf_upper columns. The "quantile" method returns only conf_lower and conf_upper (no se, statistic, or p.value).
- ci_methods support alternative = "greater" (one-sided) or alternative = "two.sided" (the default) to test H0: importance <= 0 vs H1: importance > 0, or H0: importance = 0 vs H1: importance != 0, respectively. For "quantile", alternative controls whether the interval is one-sided ("greater": finite lower bound, conf_upper = Inf) or two-sided (both bounds finite).
- FeatureImportanceMethod: documentation explaining how p-values and confidence intervals are calculated for each method.
- n_repeats defaults changed in favor of stability:
  - PerturbationImportance methods (PFI, CFI, RFI): n_repeats is now 30
  - LOCO and WVIM: n_repeats is now 30 as well.
  - n_repeats = 1, which is obviously too small.
- Replaced ranger with rpart in most tests where a flexible learner was unnecessary.
- expect_method_output(): expectation that validates all three main outputs ($importance(), $scores(), $obs_loss()) of a computed method.
- Removed test_basic_workflow, test_with_resampling, and test_custom_sampler and inlined their logic at call sites for better readability.
- Use ConditionalGaussianSampler instead of ConditionalARFSampler in tests that don't specifically test ARF functionality.
- n_repeats values in all tests (1L for functional, 5L for plausibility).

The major version bump is largely to mark the occasion that the package is now considered "released".
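As a rough illustration of the p_adjust behavior noted earlier for $importance(), the following base R sketch shows why only "bonferroni" carries over to confidence intervals. It does not call xplainfi; the p-values, estimate, standard error, and degrees of freedom are made-up numbers, and the t-based interval is an assumption chosen for simplicity.

```r
# Hypothetical raw p-values for k = 4 features (illustrative numbers only)
p_raw <- c(x1 = 0.001, x2 = 0.012, x3 = 0.030, x4 = 0.400)
k <- length(p_raw)

# Any method listed in stats::p.adjust.methods can be applied to p-values
p_holm <- stats::p.adjust(p_raw, method = "holm")
p_bonf <- stats::p.adjust(p_raw, method = "bonferroni")

# Bonferroni uses one fixed per-comparison level (alpha / k),
# so the same level can be reused when building each interval.
alpha <- 0.05
alpha_adj <- alpha / k

# Hypothetical estimate, standard error, and degrees of freedom for one feature
est <- 0.20
se <- 0.05
df <- 29

ci_unadj <- est + c(-1, 1) * stats::qt(1 - alpha / 2, df) * se
ci_bonf <- est + c(-1, 1) * stats::qt(1 - alpha_adj / 2, df) * se

print(rbind(raw = p_raw, holm = p_holm, bonferroni = p_bonf))
print(rbind(unadjusted = ci_unadj, bonferroni = ci_bonf))
```

Holm's step-down procedure multiplies each sorted p-value by a different factor, so there is no single alpha to plug into the interval formula, which matches the reasoning given in the changelog entry.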
- Removed the fippy comparison article, since a more comprehensive comparison is now available in xplainfi-benchmark.
- Increased the min_permutations default in SAGE methods to 10 rather than 3, since the previous value was found to lead to spurious early stopping.
- Fixed an issue in sim_dgp_ewald leading to erroneous variances when compared to their settings.
- Removed KnockoffSequentialSampler as the seqknockoff package is not available on CRAN or R-universe. KnockoffSampler with the corresponding knockoff_fun = seqknockoff::knockoffs_seq still works.
- sim_dgp_confounded: removed x2, which doesn't add anything interesting over x1.
- obs_loss() is computed (see https://github.com/mlr-org/mlr3/pull/1411).
- Allow measure to be unspecified, falling back to a task_type-specific default measure.
- $importance() gains ci_method parameter for variance estimation (#40):
  - "none" (default): Simple aggregation without confidence intervals
  - "raw": Uncorrected variance estimates (informative only, CIs too narrow)
  - "nadeau_bengio": Variance correction by Nadeau & Bengio (2003) as recommended by Molnar et al. (2023)
  - "quantile": Empirical quantile-based confidence intervals
  - "cpi": Conditional Predictive Impact for perturbation methods (PFI/CFI/RFI), supporting t-, Wilcoxon-, Fisher-, and binomial tests
    - PerturbationImportance methods only (not available for WVIM/LOCO or SAGE)
- $importance() gains standardize parameter to normalize scores to [-1, 1] range
- $importance() and $scores() gain relation parameter (default: "difference") to compute importances as difference or ratio of baseline and post-modification loss
- $compute() to avoid recomputing predictions/refits when changing aggregation method
- sim_dgp_independent(): Baseline with additive independent effects
- sim_dgp_correlated(): Highly correlated features (PFI fails, CFI succeeds)
- sim_dgp_mediated(): Mediation structure (total vs direct effects)
- sim_dgp_confounded(): Confounding structure
- sim_dgp_interactions(): Interaction effects between features
- $obs_loss() computes observation-wise importance scores when measure has a Measure$obs_loss() method
- $predictions field stores prediction objects for further analysis
- PerturbationImportance and WVIM methods support groups parameter for grouped feature importance:
  - groups = list(effects = c("x1", "x2", "x3"), noise = c("noise1", "noise2"))
  - feature column contains group names instead of individual features
- mlr3fselect for cleaner internals
- Parameter renamed: iters_refit → n_repeats for consistency
- Uses learner$predict_newdata_fast() for faster predictions (requires mlr3 >= 1.1.0)
- sampler$sample() calls
- New batch_size parameter to control memory usage with large datasets
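To make the difference between the "raw" and "nadeau_bengio" options above more concrete, here is a minimal base R sketch of the Nadeau & Bengio (2003) variance correction. It is not the package's implementation: the scores and the train/test sizes are made-up values, and the t-based interval is an assumption.

```r
# Hypothetical importance scores for one feature across 10 resampling
# iterations (illustrative numbers only)
scores <- c(0.21, 0.18, 0.25, 0.19, 0.22, 0.24, 0.17, 0.20, 0.23, 0.21)
n_iters <- length(scores)
n_train <- 200  # assumed training-set size per iteration
n_test <- 100   # assumed test-set size per iteration

est <- mean(scores)
s2 <- stats::var(scores)

# Uncorrected standard error: treats iterations as independent
se_raw <- sqrt(s2 / n_iters)

# Corrected standard error: the extra n_test / n_train term accounts for
# the overlap between training sets across iterations
se_nb <- sqrt((1 / n_iters + n_test / n_train) * s2)

alpha <- 0.05
t_crit <- stats::qt(1 - alpha / 2, df = n_iters - 1)
ci_raw <- est + c(-1, 1) * t_crit * se_raw
ci_nb <- est + c(-1, 1) * t_crit * se_nb

print(rbind(raw = ci_raw, nadeau_bengio = ci_nb))
```

The corrected interval is always wider than the uncorrected one, which is consistent with the note that "raw" intervals are too narrow.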
Parallelization support:
- mirai or future backends
- mirai::daemons() or future::plan()
- Parallelizes across features within each resampling iteration
Parameter renamed: iters_perm → n_repeats for consistency
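For the parallelization support listed above, a backend would typically be registered before calling $compute(). The following configuration sketch uses only mirai::daemons() and future::plan(), which are existing functions of those packages; the number of workers is arbitrary, and the assumption is that either backend is picked up once registered.

```r
# Option 1: future backend (assumes the future package is installed)
future::plan("multisession", workers = 2)
# ... construct the importance method and call $compute() here ...
future::plan("sequential")  # restore sequential processing afterwards

# Option 2: mirai backend (assumes the mirai package is installed)
mirai::daemons(2)
# ... construct the importance method and call $compute() here ...
mirai::daemons(0)  # shut the daemons down again
```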
- $sample(feature, row_ids): Samples from stored task using row IDs
- $sample_newdata(feature, newdata): Samples from external data
- PermutationSampler → MarginalPermutationSampler
- ARFSampler → ConditionalARFSampler
- GaussianConditionalSampler → ConditionalGaussianSampler
- KNNConditionalSampler → ConditionalKNNSampler
- CtreeConditionalSampler → ConditionalCtreeSampler
- Standardized parameter name: conditioning_set for features to condition on
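To illustrate what conditioning on a conditioning_set means for a Gaussian conditional sampler, here is a minimal base R sketch. The function sample_conditional_gaussian() is a hypothetical helper written for this example and is not part of xplainfi; it assumes numeric features and a multivariate normal distribution, and uses the standard conditional mean and variance formulas.

```r
# Hypothetical helper, not part of xplainfi: draw `feature` from its
# Gaussian conditional distribution given the columns in `conditioning_set`.
sample_conditional_gaussian <- function(data, feature, conditioning_set) {
  x <- as.matrix(data[, c(feature, conditioning_set), drop = FALSE])
  mu <- colMeans(x)
  sigma <- stats::cov(x)

  # Partition the mean vector and covariance matrix
  mu_f <- mu[1]
  mu_c <- mu[-1]
  s_ff <- sigma[1, 1]
  s_fc <- sigma[1, -1, drop = FALSE]
  s_cc <- sigma[-1, -1, drop = FALSE]

  # Regression coefficients of the feature on the conditioning set
  beta <- s_fc %*% solve(s_cc)
  centered <- sweep(x[, -1, drop = FALSE], 2, mu_c)
  cond_mean <- as.numeric(mu_f) + as.vector(centered %*% t(beta))
  cond_var <- as.numeric(s_ff - beta %*% t(s_fc))

  # One draw per row, centered on that row's conditional mean
  stats::rnorm(nrow(x), mean = cond_mean, sd = sqrt(max(cond_var, 0)))
}

set.seed(1)
n <- 500
x1 <- stats::rnorm(n)
x2 <- 0.8 * x1 + stats::rnorm(n, sd = 0.6)
dat <- data.frame(x1 = x1, x2 = x2)

x2_new <- sample_conditional_gaussian(dat, feature = "x2", conditioning_set = "x1")
print(c(original = stats::cor(x1, x2), resampled = stats::cor(x1, x2_new)))
```

Unlike a marginal permutation, the resampled values keep their association with the conditioning features, which is the property conditional methods rely on.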
New samplers:
- MarginalSampler: Base class for marginal sampling methods
- MarginalReferenceSampler: Samples complete rows from reference data (for SAGE)
- KnockoffSampler: Knockoff-based sampling (#16 via @mnwright)
  - KnockoffGaussianSampler, KnockoffSequentialSampler
  - row_ids-based sampling
  - iters parameter for multiple knockoff iterations
- Bug fix: ConditionalSAGE now properly uses conditional sampling (was accidentally using marginal sampling)
Performance improvements:
- learner$predict_newdata_fast() for faster predictions
- batch_size parameter controls memory usage for large coalitions
Convergence tracking (#29, #33):
- early_stopping = TRUE
- se_threshold (default: 0.01)
- min_permutations (default: 3)
- check_interval permutations (default: 1)
- $converged: Boolean indicating if convergence was reached
- $n_permutations_used: Actual permutations used (may be less than requested)
- $convergence_history: Per-feature importance and SE over permutations
- $plot_convergence(): Visualize convergence curves
- arf-powered conditional sampling)
- arf)
- fippy
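The SAGE convergence tracking options listed above (se_threshold, min_permutations, check_interval) can be pictured with the following base R sketch. The function run_until_converged() is a hypothetical helper, not part of xplainfi, and the stopping rule used here (the largest per-feature standard error falls below se_threshold) is an assumption; the package's exact criterion is not stated in these notes.

```r
# Hypothetical helper, not part of xplainfi.
# contributions: matrix with one row per permutation, one column per feature.
run_until_converged <- function(contributions, se_threshold = 0.01,
                                min_permutations = 10, check_interval = 1) {
  n_max <- nrow(contributions)
  history <- list()
  converged <- FALSE
  n_used <- n_max

  for (i in seq_len(n_max)) {
    # Only check after the minimum number of permutations, at the given
    # interval, and with at least 2 rows so that sd() is defined
    if (i < max(min_permutations, 2) || i %% check_interval != 0) next

    seen <- contributions[seq_len(i), , drop = FALSE]
    est <- colMeans(seen)
    se <- apply(seen, 2, stats::sd) / sqrt(i)

    history[[length(history) + 1]] <- data.frame(
      n_permutations = i, feature = colnames(seen),
      importance = est, se = se, row.names = NULL
    )

    # Assumed rule: stop once every feature's SE is below the threshold
    if (max(se) < se_threshold) {
      converged <- TRUE
      n_used <- i
      break
    }
  }

  list(converged = converged, n_permutations_used = n_used,
       convergence_history = do.call(rbind, history))
}

set.seed(42)
# Simulated per-permutation contributions for two features
contrib <- cbind(x1 = stats::rnorm(200, mean = 0.5, sd = 0.05),
                 x2 = stats::rnorm(200, mean = 0.1, sd = 0.05))
res <- run_until_converged(contrib)
print(res$n_permutations_used)
```

A small min_permutations makes the standard error estimate itself noisy, so the check can pass by chance after only a few permutations; this is the kind of spurious early stopping that motivated raising the default from 3 to 10 in a later release.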