DataSimilarity-package: Quantifying Similarity of Datasets and Multivariate Two- And...
In DataSimilarity: Quantifying Similarity of Datasets and Multivariate Two- And k-Sample Testing

DataSimilarity-package

R Documentation

Quantifying Similarity of Datasets and Multivariate Two- And k-Sample Testing

Description

A collection of methods for quantifying the similarity of two or more datasets, many of which can be used for two- or k-sample testing. It provides newly implemented methods as well as wrapper functions for existing methods that enable calling many different methods in a unified framework. The methods were selected from the review and comparison of Stolte et al. (2024) <doi:10.1214/24-SS149>. An empirical comparison of the methods was performed in Stolte et al. (2026) <doi:10.48550/arXiv.2604.11458> for categorical data and in Stolte et al. (2026) <doi:10.48550/arXiv.2604.12327> for numeric data.

Details

The package provides various methods for comparing two or more datasets or their underlying distributions. Often, a permutation or asymptotic test is performed for the null hypothesis of equal distributions, H_0: F_1 = F_2 in the two-sample case or H_0: F_1 = ... = F_k in the k-sample case.
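
As an illustration of the permutation approach mentioned above, the following is a minimal base-R sketch of a two-sample permutation test. It is not code from this package: the helper names `perm_test` and `mean_diff` are hypothetical, and the statistic (squared Euclidean distance between column means) is only a stand-in for whichever dataset similarity measure is of interest.

```r
# Generic two-sample permutation test in base R (illustration only; the
# function names below are hypothetical and not part of DataSimilarity).
perm_test <- function(X1, X2, statistic, n_perm = 999) {
  pooled <- rbind(X1, X2)
  n1 <- nrow(X1)
  n <- nrow(pooled)
  observed <- statistic(X1, X2)
  # Under H_0: F_1 = F_2 the group labels are exchangeable, so the rows of
  # the pooled sample are reassigned to the two groups at random.
  permuted <- replicate(n_perm, {
    idx <- sample.int(n)
    statistic(pooled[idx[seq_len(n1)], , drop = FALSE],
              pooled[idx[-seq_len(n1)], , drop = FALSE])
  })
  # Add-one correction keeps the p-value strictly positive.
  p_value <- (1 + sum(permuted >= observed)) / (n_perm + 1)
  list(statistic = observed, p.value = p_value)
}

# Stand-in statistic: squared Euclidean distance between column means.
mean_diff <- function(A, B) sum((colMeans(A) - colMeans(B))^2)

set.seed(1)
X1 <- matrix(rnorm(100), ncol = 2)            # 50 observations, 2 variables
X2 <- matrix(rnorm(100, mean = 1), ncol = 2)  # same shape, shifted mean
res <- perm_test(X1, X2, mean_diff)
res$p.value
```

Large values of the statistic count as evidence against H_0 here, which is why the p-value is the share of permuted statistics at least as large as the observed one. A statistic for which small values indicate dissimilarity would need the inequality reversed.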

Author(s)

Marieke Stolte [aut, cre, cph] (ORCID: <https://orcid.org/0009-0002-0711-6789>), Luca Sauer [aut] (ORCID: <https://orcid.org/0009-0000-1086-023X>), David Alvarez-Melis [ctb] (Original Python implementation of OTDD, <https://github.com/microsoft/otdd.git>), Nabarun Deb [ctb] (Original implementation of rank-based Energy test (DS), <https://github.com/NabarunD/MultiDistFree.git>), Bodhisattva Sen [ctb] (Original implementation of rank-based Energy test (DS), <https://github.com/NabarunD/MultiDistFree.git>)

Maintainer: Marieke Stolte <marieke.stolte@ibe.med.uni-muenchen.de>

References

Stolte, M., Kappenberg, F., Rahnenführer, J., Bommert, A. (2024). Methods for quantifying dataset similarity: a review, taxonomy and comparison. Statist. Surv. 18, 163-298. doi:10.1214/24-SS149

Stolte, M., Kappenberg, F., Rahnenführer, J., Bommert, A. (2024). A Comparison of Methods for Quantifying Dataset Similarity. https://shiny.statistik.tu-dortmund.de/data-similarity/

Stolte, M., Rahnenführer, J., Bommert, A. (2026). An Empirical Comparison of Methods for Quantifying the Similarity of Numeric Datasets. doi:10.48550/arXiv.2604.12327

Stolte, M., Rahnenführer, J., Bommert, A. (2026). An Empirical Comparison of Methods for Quantifying the Similarity of Categorical Datasets. doi:10.48550/arXiv.2604.11458

DataSimilarity documentation built on May 15, 2026, 9:07 a.m.
