hom_test_polysph: Homogeneity test for several polyspherical samples

View source: R/tests.R

hom_test_polysph    R Documentation

Homogeneity test for several polyspherical samples

Description

Permutation tests for the equality of distributions of two or k samples of data on \mathcal{S}^{d_1} \times \cdots \times \mathcal{S}^{d_r}. The Jensen–Shannon distance is used to construct a test statistic measuring the discrepancy between the k kernel density estimators. Tests based on the means and scatter matrices are also available, but only for two samples (k = 2).

Usage

hom_test_polysph(X, d, labels, type = c("jsd", "mean", "scatter", "hd")[1],
  h = NULL, kernel = 1, kernel_type = 1, k = 10, B = 1000,
  M = 10000, plot_boot = FALSE, seed_jsd = NULL, cv_jsd = TRUE)

Arguments

X

a matrix of size c(n, sum(d) + r) with the sample.

d

vector of size r with dimensions.

labels

vector with k different levels indicating the group.

type

kind of test to be performed: "jsd" (default), a test comparing the kernel density estimators for k groups using the Jensen–Shannon distance; "mean", a simple test for the equality of two means (non-omnibus for testing homogeneity); "scatter", a simple test for the equality of two scatter matrices; "hd", a test comparing the kernel density estimators for two groups using the Hellinger distance.

h

vector of size r with bandwidths.

kernel

kernel employed: 1 for von Mises–Fisher (default); 2 for Epanechnikov; 3 for softplus.

kernel_type

type of kernel employed: 1 for product kernel (default); 2 for spherically symmetric kernel.

k

softplus kernel parameter (unrelated to the number of groups, which is also denoted by k in this page). Defaults to 10.0.

B

number of permutations to use. Defaults to 1e3.

M

number of Monte Carlo samples to use when approximating the Hellinger/Jensen–Shannon distance. Defaults to 1e4.

plot_boot

flag to display a graphical output of the test decision. Defaults to FALSE.

seed_jsd

seed for the Monte Carlo simulations used to estimate the integrals in the Jensen–Shannon distance in the original and permuted statistics. Defaults to NULL (no seed is fixed).

cv_jsd

whether to approximate the Jensen–Shannon distance by cross-validation, which does not require Monte Carlo simulation. Defaults to TRUE.
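As a rough illustration of the layout expected for X, d, and labels, the following base-R sketch builds a two-group sample on \mathcal{S}^1 \times \mathcal{S}^2 (so d = c(1, 2) and r = 2). The helper r_unif_sph is hypothetical and not part of polykde; it only produces unit-norm rows by normalizing Gaussian vectors.

```r
# Hypothetical helper (not from polykde): uniform sample on S^d,
# obtained by normalizing standard Gaussian vectors in R^(d + 1)
r_unif_sph <- function(n, d) {
  Z <- matrix(rnorm(n * (d + 1)), nrow = n, ncol = d + 1)
  Z / sqrt(rowSums(Z^2))
}

set.seed(1)
d <- c(1, 2)        # dimensions of the spheres S^1 and S^2
r <- length(d)      # number of spheres
n <- c(30, 40)      # sample sizes of the two groups

# Each sphere S^(d_j) contributes d_j + 1 columns, hence sum(d) + r in total
X <- do.call(cbind, lapply(d, function(dj) r_unif_sph(n = sum(n), d = dj)))
labels <- rep(1:2, times = n)

dim(X)          # c(sum(n), sum(d) + r)
table(labels)   # sample sizes per group
```

A matrix built this way could then be passed as X, together with d and labels, and with h given as a vector of length r.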

Details

Only type = "jsd" is able to deal with k > 2.

The "jsd" statistic is the Jensen–Shannon divergence, which is bounded in [0, 1]. The "mean" statistic measures the maximum (chordal) distance between the estimated group means and is also bounded in [0, 1]. The "scatter" statistic measures the maximum affine-invariant Riemannian metric between the estimated scatter matrices. The "hd" statistic computes a monotonic transformation of the Hellinger distance, namely the Bhattacharyya divergence (or coefficient).
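The [0, 1] bound of the Jensen–Shannon divergence can be checked on a toy discrete example. The sketch below assumes base-2 logarithms and two equally weighted distributions; it is only an illustration and not the kernel-density-based statistic computed by the function. The names kl2 and jsd2 are hypothetical.

```r
# Toy Jensen-Shannon divergence between two discrete distributions,
# using base-2 logarithms (an assumption made here so that the bound is 1)
kl2 <- function(p, q) {
  pos <- p > 0           # terms with p = 0 contribute 0 by convention
  sum(p[pos] * log2(p[pos] / q[pos]))
}
jsd2 <- function(p, q) {
  m <- (p + q) / 2       # equally weighted mixture of p and q
  0.5 * kl2(p, m) + 0.5 * kl2(q, m)
}

p <- c(0.5, 0.5, 0)
q <- c(0, 0.5, 0.5)
jsd2(p, p)                # identical distributions: 0
jsd2(p, q)                # partially overlapping supports: 0.5
jsd2(c(1, 0), c(0, 1))    # disjoint supports: 1
```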

Value

An object of class "htest" with the following fields:

statistic

the value of the test statistic.

p.value

the p-value of the test.

statistic_perm

the B permuted statistics.

n

a table with the sample sizes per group.

h

bandwidths used.

B

number of permutations.

alternative

a character string describing the alternative hypothesis.

method

the kind of test performed.

data.name

a character string giving the name of the data.
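How statistic, statistic_perm, and p.value relate to each other can be pictured with a generic permutation test. The following base-R sketch is not the implementation in polykde: perm_test and mean_dir_dist are hypothetical names, the statistic is a simplified stand-in for the "mean" statistic, and the p-value rule (proportion of permuted statistics at least as large as the observed one) is an assumption.

```r
# Generic permutation test sketch in base R (not polykde's implementation)
perm_test <- function(X, labels, stat, B = 1000) {
  obs <- stat(X, labels)
  # Recompute the statistic after randomly reassigning the group labels
  perm <- replicate(B, stat(X, sample(labels)))
  # Assumed p-value: proportion of permuted statistics >= observed one
  list(statistic = obs, p.value = mean(perm >= obs),
       statistic_perm = perm, n = table(labels), B = B)
}

# Toy statistic: half the chordal distance between the two mean directions,
# which lies in [0, 1] because unit vectors are at most 2 apart
mean_dir_dist <- function(X, labels) {
  dirs <- sapply(split(as.data.frame(X), labels), function(g) {
    m <- colMeans(g)
    m / sqrt(sum(m^2))
  })
  sqrt(sum((dirs[, 1] - dirs[, 2])^2)) / 2
}

set.seed(42)
n <- c(50, 100)
Z <- matrix(rnorm(sum(n) * 3), ncol = 3)
g1 <- seq_len(n[1])
Z[g1, 3] <- Z[g1, 3] + 3      # group 1 concentrated around (0, 0, 1)
Z[-g1, 2] <- Z[-g1, 2] + 3    # group 2 concentrated around (0, 1, 0)
X <- Z / sqrt(rowSums(Z^2))   # project the rows onto S^2
labels <- rep(1:2, times = n)

res <- perm_test(X, labels, stat = mean_dir_dist, B = 200)
res$statistic
res$p.value
```

Since the two groups are concentrated around different directions, the observed statistic should sit in the upper tail of statistic_perm, giving a small p-value.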

Examples

library(polykde)

## Two-sample case

# H0 holds
n <- c(50, 100)
X1 <- rotasym::r_vMF(n = n[1], mu = c(0, 0, 1), kappa = 1)
X2 <- rotasym::r_vMF(n = n[2], mu = c(0, 0, 1), kappa = 1)
hom_test_polysph(X = rbind(X1, X2), labels = rep(1:2, times = n),
                 d = 2, type = "jsd", h = 0.5)

# H0 does not hold
X2 <- rotasym::r_vMF(n = n[2], mu = c(0, 1, 0), kappa = 2)
hom_test_polysph(X = rbind(X1, X2), labels = rep(1:2, times = n),
                 d = 2, type = "jsd", h = 0.5)

## k-sample case

# H0 holds
n <- c(50, 100, 50)
X1 <- rotasym::r_vMF(n = n[1], mu = c(0, 0, 1), kappa = 1)
X2 <- rotasym::r_vMF(n = n[2], mu = c(0, 0, 1), kappa = 1)
X3 <- rotasym::r_vMF(n = n[3], mu = c(0, 0, 1), kappa = 1)
hom_test_polysph(X = rbind(X1, X2, X3), labels = rep(1:3, times = n),
                 d = 2, type = "jsd", h = 0.5)

# H0 does not hold
X3 <- rotasym::r_vMF(n = n[3], mu = c(0, 1, 0), kappa = 2)
hom_test_polysph(X = rbind(X1, X2, X3), labels = rep(1:3, times = n),
                 d = 2, type = "jsd", h = 0.5)


polykde documentation built on April 16, 2025, 1:11 a.m.