MarginalReferenceSampler: Marginal Reference Sampler
In xplainfi: Feature Importance Methods for Global Explanations

Description

Replaces feature values by sampling complete observations (rows) from reference data. The replaced features follow their joint marginal distribution, so dependencies among features taken from the same reference row are preserved.

Details

This sampler implements what is called "marginal imputation" in the SAGE literature (Covert et al. 2020). For each observation, it samples a complete row from reference data and takes the specified feature values from that row. This approach:

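Samples from the marginal distribution P(X_S), where S is the set of features being replaced
Preserves dependencies among the features in S, because they are taken from the same reference row
Breaks dependencies between the sampled features and the remaining features of the test observation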
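
The properties listed above can be illustrated in base R without xplainfi. This is a minimal sketch, not the package's implementation: the data frame `reference`, the columns `x1`, `x2`, `x3`, and the helper `sample_reference_rows()` are hypothetical names introduced only for illustration.

```r
set.seed(1)

# Hypothetical reference data: x2 depends strongly on x1, x3 is independent.
n <- 500
x1 <- rnorm(n)
reference <- data.frame(x1 = x1, x2 = x1 + rnorm(n, sd = 0.1), x3 = rnorm(n))

# Hypothetical helper (not part of xplainfi): for each row of `data`, draw one
# complete reference row with replacement and copy the values of `features`.
sample_reference_rows <- function(data, reference, features) {
  idx <- sample(nrow(reference), size = nrow(data), replace = TRUE)
  out <- data
  out[features] <- reference[idx, features, drop = FALSE]
  out
}

# Here the reference data doubles as the "test" data.
sampled <- sample_reference_rows(reference, reference, c("x1", "x2"))

# x1 and x2 come from the same reference row, so their dependence survives.
cor_within <- cor(sampled$x1, sampled$x2)

# The sampled x1 is unrelated to the x1 originally stored in the same row.
cor_across <- cor(sampled$x1, reference$x1)
```

In this sketch `cor_within` stays close to the correlation in the reference data, `cor_across` is close to zero, and `x3` is left untouched because it is not in the set of sampled features.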

Terminology note: In the SAGE literature, this is called "marginal imputation" because features outside the coalition are "imputed" by sampling from their marginal distribution. We use the name MarginalReferenceSampler to avoid confusion with missing-data imputation and to make clear that values are drawn from reference data.

Comparison with other samplers:

MarginalPermutationSampler: Shuffles each feature independently, breaking all row structure
MarginalReferenceSampler: Samples complete rows, preserving within-row dependencies
ConditionalSampler: Samples from P(X_S | X_{-S}), conditioning on other features
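
The difference between the first two strategies in the comparison above can be sketched in base R. This assumes nothing about how xplainfi implements either sampler; the data frame `data` and its columns are hypothetical, and the code only contrasts "one shuffle per column" with "one shared row index for all columns".

```r
set.seed(42)

# Hypothetical data with two strongly dependent features.
n <- 1000
x1 <- rnorm(n)
data <- data.frame(x1 = x1, x2 = x1 + rnorm(n, sd = 0.1))

# Independent permutation: each column gets its own shuffle,
# so values that used to share a row are separated.
permuted <- data
permuted$x1 <- sample(data$x1)
permuted$x2 <- sample(data$x2)

# Complete-row sampling: a single row index is shared by both columns,
# so each sampled pair (x1, x2) is a pair that occurred in the data.
idx <- sample(nrow(data), replace = TRUE)
row_sampled <- data[idx, ]

cor_original <- cor(data$x1, data$x2)
cor_permuted <- cor(permuted$x1, permuted$x2)
cor_row_sampled <- cor(row_sampled$x1, row_sampled$x2)
```

Comparing the three correlations shows the trade-off: independent shuffling removes the dependence between `x1` and `x2`, while row sampling keeps it. Neither conditions on features outside the sampled set, which is what a conditional sampler adds.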

Use in SAGE:

This is the default approach for MarginalSAGE. For a test observation x and a set S of features to marginalize, it samples a reference row x_ref and creates a "hybrid" observation that keeps x's values for the coalition features (those not in S) and takes x_ref's values for the features in S.
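
The construction of a hybrid observation can be sketched in base R as follows. This is an illustration under assumed names, not xplainfi code: `reference`, `x`, `marginalize`, and the columns `a`, `b`, `c` are hypothetical.

```r
set.seed(7)

# Hypothetical reference data and a single test observation.
reference <- data.frame(a = c(1, 2, 3), b = c(10, 20, 30), c = c(100, 200, 300))
x <- data.frame(a = -1, b = -2, c = -3)

# Features to marginalize (S in the text); the remaining feature `a`
# forms the coalition and keeps the test observation's value.
marginalize <- c("b", "c")

# Draw one complete reference row and overwrite the marginalized features.
x_ref <- reference[sample(nrow(reference), 1), ]
hybrid <- x
hybrid[marginalize] <- x_ref[marginalize]
```

Because `b` and `c` are copied from the same reference row, the hybrid observation always contains a pair of values that occurs together in the reference data, while `a` is unchanged.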

Super classes

xplainfi::FeatureSampler -> xplainfi::MarginalSampler -> MarginalReferenceSampler

Public fields

reference_data: (data.table) Reference data to sample from for marginalization.

Methods

Method `new()`

Creates a new instance of the MarginalReferenceSampler class.

Usage

MarginalReferenceSampler$new(task, n_samples = NULL)

Arguments

task: (mlr3::Task) Task to sample from.
n_samples: (integer(1) | NULL) Number of reference samples to use. If NULL, uses all task data as reference.

Method `clone()`

The objects of this class are cloneable with this method.

Usage

MarginalReferenceSampler$clone(deep = FALSE)

Arguments

deep: Whether to make a deep clone.

References

Covert I, Lundberg S, Lee S (2020). “Understanding Global Feature Contributions With Additive Importance Measures.” In Advances in Neural Information Processing Systems, volume 33, 17212–17223. https://proceedings.neurips.cc/paper/2020/hash/c7bf0b7c1a86d5eb3be2c722cf2cf746-Abstract.html.

Examples

library(mlr3)
library(xplainfi)

# Simulated regression task with 100 observations
task = tgen("friedman1")$generate(n = 100)

# Default: uses all task data as reference
sampler = MarginalReferenceSampler$new(task)
# Replace "important1" in rows 1 to 10 with values from sampled reference rows
sampled = sampler$sample("important1", row_ids = 1:10)

# Subsample reference data to 50 rows
sampler_subsampled = MarginalReferenceSampler$new(task, n_samples = 50L)
sampled2 = sampler_subsampled$sample("important1", row_ids = 1:10)

xplainfi documentation built on Feb. 27, 2026, 1:08 a.m.
