engineerMetric: Engineer Metric

engineerMetric    R Documentation

Engineer Metric

Description

The function implements the L_q-engineer metric for comparing two multivariate distributions.

Usage

engineerMetric(X1, X2, type = "F", seed = 42)

Arguments

X1

First dataset as matrix or data.frame

X2

Second dataset as matrix or data.frame

type

Character specifying the type of L_q-norm to use. Reasonable options are "O", "o", or "1" for the L_1-norm; "I" or "i" for the L_\infty-norm; and "F", "f", "E", or "e" for the L_2-norm (Euclidean norm). The default is "F".

seed

Random seed (default: 42). The method is deterministic; the seed is only set for consistency with other methods.
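The codes accepted by type are the same as those accepted by the type argument of base R's norm(). Whether engineerMetric calls norm() internally is an assumption here, not something this page states. The sketch below only illustrates what each code computes when norm() is applied to a one-column matrix of hypothetical mean differences.

```r
# Hypothetical vector of mean differences, stored as a one-column matrix
d <- matrix(c(3, -4), ncol = 1)

norm(d, type = "O")  # L_1-norm: |3| + |-4| = 7
norm(d, type = "I")  # L_infinity-norm: max(|3|, |-4|) = 4
norm(d, type = "F")  # L_2-norm: sqrt(3^2 + 4^2) = 5
```

For a one-column matrix, the matrix one norm (maximum absolute column sum) reduces to the vector L_1-norm, and the matrix infinity norm (maximum absolute row sum) reduces to the vector L_\infty-norm.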

Details

The engineer metric is a primary probability metric that is defined as

\text{EN}(X_1, X_2; q) = \left[ \sum_{i = 1}^{p} \left| \text{E}\left(X_{1i}\right) - \text{E}\left(X_{2i}\right)\right|^q\right]^{\min(q, 1/q)} \text{ with } q> 0,

where X_{1i}, X_{2i} denote the ith component of the p-dimensional random vectors X_1\sim F_1 and X_2\sim F_2.

In the implementation, expectations are estimated by column means of the respective datasets.
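The plug-in estimate described above can be sketched in base R. The helper name engineer_sketch and its argument q are hypothetical and not part of the package. The sketch follows the formula as stated, treats q = Inf as the limiting maximum norm, and is not the package's actual implementation.

```r
# Hypothetical helper (not part of DataSimilarity): plug-in estimate of
# EN(X1, X2; q), with expectations estimated by column means.
engineer_sketch <- function(X1, X2, q = 2) {
  stopifnot(ncol(X1) == ncol(X2), q > 0)
  # Absolute differences of the estimated component-wise expectations
  d <- abs(colMeans(as.matrix(X1)) - colMeans(as.matrix(X2)))
  if (is.infinite(q)) {
    return(max(d))  # limiting case q = Inf: maximum norm
  }
  sum(d^q)^min(q, 1 / q)
}

X1 <- rbind(c(0, 0), c(2, 4))   # column means: 1 and 2
X2 <- rbind(c(4, 6), c(4, 6))   # column means: 4 and 6
engineer_sketch(X1, X2, q = 2)  # differences 3 and 4, so sqrt(9 + 16) = 5
engineer_sketch(X1, X2, q = 1)  # 3 + 4 = 7
```

Because only the column means enter the calculation, two datasets with equal means but different variances or dependence structures yield a value of zero.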

Value

An object of class htest with the following components:

method

Description of the test

statistic

Observed value of the test statistic

data.name

The dataset names

alternative

The alternative hypothesis

Applicability

Target variable?   Numeric?   Categorical?   K-sample?
No                 Yes        No             No

Note

The seed argument is only included for consistency with other methods. The result of the metric calculation is deterministic.

References

Rachev, S. T. (1991). Probability metrics and the stability of stochastic models. John Wiley & Sons, Chichester.

Stolte, M., Kappenberg, F., Rahnenführer, J., Bommert, A. (2024). Methods for quantifying dataset similarity: a review, taxonomy and comparison. Statist. Surv. 18, 163-298. doi:10.1214/24-SS149

See Also

Jeffreys

Examples

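library(DataSimilarity)
# Draw some data (fix the RNG state so the draws are reproducible)
set.seed(1234)
X1 <- matrix(rnorm(1000), ncol = 10)
X2 <- matrix(rnorm(1000, mean = 0.5), ncol = 10)
# Calculate engineer metric (default type = "F", i.e. the L_2-norm)
engineerMetric(X1, X2)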

DataSimilarity documentation built on April 3, 2025, 9:39 p.m.