DS | R Documentation |
Performs the multivariate rank-based two-sample test using measure transportation by Deb and Sen (2021).
DS(X1, X2, n.perm = 0, rand.gen = NULL, seed = 42)
X1 |
First dataset as matrix or data.frame |
X2 |
Second dataset as matrix or data.frame |
n.perm |
Number of permutations for the permutation test (default: 0, no permutation test performed) |
rand.gen |
Function that generates a grid of (random) numbers in |
seed |
Random seed (default: 42) |
The test proposed by Deb and Sen (2021) is a rank-based version of the Energy statistic (Székely and Rizzo, 2004) that does not rely on any moment assumptions. Its test statistic is the Energy statistic applied to the rank map of both samples. The multivariate ranks are computed using optimal transport with a multivariate uniform distribution as the reference distribution.
For the rank version of the Energy statistic it still holds that the value zero is attained if and only if the two distributions coincide. Therefore, low values of the empirical test statistic indicate similarity between the datasets and the null hypothesis of equal distributions is rejected for large values.
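The construction described above can be sketched in a few lines of R. This is only an illustration under stated assumptions, not the package's implementation: `rank_energy_sketch` is a hypothetical helper, the reference grid is drawn with `runif` rather than a user-supplied `rand.gen`, the transport cost is assumed to be squared Euclidean distance, and the energy statistic is taken in its V-statistic form. The assignment step uses `clue::solve_LSAP`.

```r
# Hypothetical helper illustrating the idea; NOT the code used by DS().
rank_energy_sketch <- function(X1, X2, grid = NULL) {
  X1 <- as.matrix(X1)
  X2 <- as.matrix(X2)
  X <- rbind(X1, X2)                      # pooled sample
  n1 <- nrow(X1)
  n <- nrow(X)
  # Assumption: uniform reference points on the unit cube drawn with runif()
  if (is.null(grid)) grid <- matrix(runif(n * ncol(X)), nrow = n)
  # Squared Euclidean cost between pooled points (rows) and grid points (columns)
  cost <- outer(rowSums(X^2), rowSums(grid^2), "+") - 2 * X %*% t(grid)
  cost <- cost - min(cost)                # solve_LSAP() needs nonnegative entries
  # Optimal assignment: pooled point i is mapped to grid point assignment[i]
  assignment <- as.integer(clue::solve_LSAP(cost))
  ranks <- grid[assignment, , drop = FALSE]
  # Energy statistic (V-statistic form) applied to the multivariate ranks
  D <- as.matrix(dist(ranks))
  i1 <- seq_len(n1)
  i2 <- (n1 + 1):n
  2 * mean(D[i1, i2]) - mean(D[i1, i1]) - mean(D[i2, i2])
}

if (requireNamespace("clue", quietly = TRUE)) {
  set.seed(1)
  A <- matrix(rnorm(100), ncol = 2)
  B <- matrix(rnorm(100, mean = 1), ncol = 2)
  stat <- rank_energy_sketch(A, B)
  print(stat)
}
```

Because the statistic depends on the data only through the assigned grid points, it does not require moment assumptions on the original samples; larger values point to a greater difference between the two samples.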
An object of class htest
with the following components:
statistic |
Observed value of the test statistic |
p.value |
Permutation p value |
alternative |
The alternative hypothesis |
method |
Description of the test |
data.name |
The dataset names |
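To illustrate what a permutation p value of this kind typically looks like, the following base R sketch permutes the pooled sample and compares the permuted statistics with the observed one. It rests on assumptions the documentation does not confirm: `perm_p_value` and `energy_stat` are hypothetical helpers, the p value uses the common `(1 + exceedances) / (n.perm + 1)` convention, and the plain (not rank-based) energy statistic stands in for the test statistic to keep the example self-contained.

```r
# Hypothetical helper; DS() may compute its p value differently.
perm_p_value <- function(X1, X2, stat_fun, n.perm = 100, seed = 42) {
  set.seed(seed)
  X1 <- as.matrix(X1)
  X2 <- as.matrix(X2)
  X <- rbind(X1, X2)
  n1 <- nrow(X1)
  n <- nrow(X)
  observed <- stat_fun(X1, X2)
  # Recompute the statistic after randomly reassigning rows to the two groups
  permuted <- replicate(n.perm, {
    idx <- sample(n, n1)
    stat_fun(X[idx, , drop = FALSE], X[-idx, , drop = FALSE])
  })
  # Large values of the statistic count as evidence against the null hypothesis
  (1 + sum(permuted >= observed)) / (n.perm + 1)
}

# Stand-in statistic: plain energy statistic in V-statistic form
energy_stat <- function(A, B) {
  D <- as.matrix(dist(rbind(A, B)))
  i <- seq_len(nrow(A))
  2 * mean(D[i, -i]) - mean(D[i, i]) - mean(D[-i, -i])
}

set.seed(1)
A <- matrix(rnorm(200), ncol = 4)
B <- matrix(rnorm(200, mean = 1), ncol = 4)
p <- perm_p_value(A, B, energy_stat, n.perm = 99)
print(p)
```

With this convention the smallest attainable p value is `1 / (n.perm + 1)`, so the number of permutations bounds the resolution of the test.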
Target variable? | Numeric? | Categorical? | K-sample? |
No | Yes | No | No |
The implementation is a modification of the code supplied by Deb and Sen (2021) for the simulation study presented in the original article. It generalizes the implementation and includes small modifications for computation speed.
Original implementation by Nabarun Deb, Bodhisattva Sen
Minor modifications by Marieke Stolte
Original implementation: https://github.com/NabarunD/MultiDistFree
Deb, N. and Sen, B. (2021). Multivariate Rank-Based Distribution-Free Nonparametric Testing Using Measure Transportation, Journal of the American Statistical Association. doi:10.1080/01621459.2021.1923508.
Stolte, M., Kappenberg, F., Rahnenführer, J., Bommert, A. (2024). Methods for quantifying dataset similarity: a review, taxonomy and comparison. Statist. Surv. 18, 163-298. doi:10.1214/24-SS149.
Energy
# Draw some data
X1 <- matrix(rnorm(1000), ncol = 10)
X2 <- matrix(rnorm(1000, mean = 0.5), ncol = 10)
# Perform Deb and Sen test
if(requireNamespace("randtoolbox", quietly = TRUE) &&
requireNamespace("clue", quietly = TRUE)) {
DS(X1, X2, n.perm = 100)
}