Jeffreys: Jeffreys divergence

Jeffreys    R Documentation

Jeffreys divergence

Description

The function implements Jeffreys divergence using the KL divergence approximation of Sugiyama et al. (2013). By default, density ratios are estimated with the KLIEP method of the function densratio from the densratio package.

Usage

Jeffreys(X1, X2, method = "KLIEP", verbose = FALSE, seed = 42)

Arguments

X1

First dataset as matrix or data.frame

X2

Second dataset as matrix or data.frame

method

"KLIEP" (default), "uLSIF" or "RuLSIF"

verbose

logical (default: FALSE)

seed

Random seed (default: 42)

Details

Jeffreys divergence is the symmetrized Kullback-Leibler (KL) divergence: the sum of the two KL divergences obtained by using each dataset as the first dataset once,

J(F_1, F_2) = \text{KL}(F_1, F_2) + \text{KL}(F_2, F_1), \qquad \text{KL}(F_1, F_2) = \int \log\left(\frac{f_1}{f_2}\right) \text{d}F_1

As suggested by Sugiyama et al. (2013), the method KLIEP is used for density ratio estimation by default. Low values of Jeffreys divergence indicate similarity.
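To make the definition concrete, the following is a minimal sketch in base R for a case where the divergence is known in closed form: two univariate normal distributions. It does not use densratio or the Jeffreys function; the helper names kl_normal and jeffreys_normal are hypothetical and introduced only for illustration. The last lines check the KL integral by Monte Carlo using the true densities in place of an estimated density ratio.

```r
# Closed-form KL divergence between N(mu1, sd1^2) and N(mu2, sd2^2).
# Hypothetical helper, not part of DataSimilarity or densratio.
kl_normal <- function(mu1, sd1, mu2, sd2) {
  log(sd2 / sd1) + (sd1^2 + (mu1 - mu2)^2) / (2 * sd2^2) - 0.5
}

# Jeffreys divergence: each distribution is used as the first argument once.
jeffreys_normal <- function(mu1, sd1, mu2, sd2) {
  kl_normal(mu1, sd1, mu2, sd2) + kl_normal(mu2, sd2, mu1, sd1)
}

# N(0, 1) versus N(0.5, 1): each KL term is 0.125, so the sum is 0.25.
j_1d <- jeffreys_normal(0, 1, 0.5, 1)

# For 10 independent coordinates the divergences add up coordinate-wise.
j_10d <- 10 * j_1d

# Monte Carlo version of KL(F1, F2) = E_{F1}[log(f1 / f2)],
# using the true log densities instead of an estimated ratio.
set.seed(42)
x1 <- rnorm(1e5)
kl_mc <- mean(dnorm(x1, 0, 1, log = TRUE) - dnorm(x1, 0.5, 1, log = TRUE))

print(c(closed_form_1d = j_1d, closed_form_10d = j_10d, kl_monte_carlo = kl_mc))
```

Under these assumptions, 2.5 is the population Jeffreys divergence for data generated like the two 10-column matrices in the Examples section. The value returned by Jeffreys is an estimate based on density ratio estimation and need not match it closely.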

Value

An object of class htest with the following components:

statistic

Observed value of the test statistic

p.value

p value

method

Description of the test

data.name

The dataset names

alternative

The alternative hypothesis

Applicability

Target variable?    Numeric?    Categorical?    K-sample?
No                  Yes         No              No

References

Makiyama, K. (2019). densratio: Density Ratio Estimation. R package version 0.2.1, https://CRAN.R-project.org/package=densratio.

Sugiyama, M., Liu, S., Plessis, M., Yamanaka, M., Yamada, M., Suzuki, T., Kanamori, T. (2013). Direct Divergence Approximation between Probability Distributions and Its Applications in Machine Learning. Journal of Computing Science and Engineering, 7. doi:10.5626/JCSE.2013.7.2.99

Stolte, M., Kappenberg, F., Rahnenführer, J., Bommert, A. (2024). Methods for quantifying dataset similarity: a review, taxonomy and comparison. Statist. Surv. 18, 163-298. doi:10.1214/24-SS149

See Also

densratio

Examples

# Draw some data
X1 <- matrix(rnorm(1000), ncol = 10)
X2 <- matrix(rnorm(1000, mean = 0.5), ncol = 10)
# Calculate Jeffreys divergence 
if(requireNamespace("densratio", quietly = TRUE)) {
  Jeffreys(X1, X2)
}
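The components listed under Value can be read from the returned object by name. The following is a hedged sketch, assuming the DataSimilarity and densratio packages are installed (the guarded block is skipped otherwise). The object name res and the seed passed to set.seed are illustrative choices, and the inputs are passed as data.frames, which the Arguments section also allows.

```r
set.seed(1)
# Same data-generating setup as the documented example, stored as data.frames
X1 <- as.data.frame(matrix(rnorm(1000), ncol = 10))
X2 <- as.data.frame(matrix(rnorm(1000, mean = 0.5), ncol = 10))

res <- NULL
if (requireNamespace("DataSimilarity", quietly = TRUE) &&
    requireNamespace("densratio", quietly = TRUE)) {
  # Arguments spelled out explicitly; these are the documented defaults
  res <- DataSimilarity::Jeffreys(X1, X2, method = "KLIEP",
                                  verbose = FALSE, seed = 42)
  print(res$statistic)  # observed value of the test statistic
  print(res$method)     # description of the test
  print(res$data.name)  # the dataset names
}
```

Since low values indicate similarity, the statistic is most informative when compared across several pairs of datasets rather than read in isolation.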

DataSimilarity documentation built on April 3, 2025, 9:39 p.m.