test_rotasym: Tests of rotational symmetry for hyperspherical data


test_rotasym    R Documentation


Description

Tests for assessing the rotational symmetry of a unit-norm random vector \mathbf{X} in S^{p-1}:=\{\mathbf{x}\in R^p:||\mathbf{x}||=1\}, p \ge 2, about a location \boldsymbol{\theta}\in S^{p-1}, from a hyperspherical sample \mathbf{X}_1,\ldots,\mathbf{X}_n\in S^{p-1}.

The vector \mathbf{X} is said to be rotationally symmetric about \boldsymbol{\theta} if the distributions of \mathbf{OX} and \mathbf{X} coincide, where \mathbf{O} is any p\times p rotation matrix that fixes \boldsymbol{\theta}, i.e., \mathbf{O}\boldsymbol{\theta}=\boldsymbol{\theta}.
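
As a small illustration of this definition, the following base R sketch builds one rotation matrix that fixes a location and checks the defining properties. The choices p = 3, theta = (0, 0, 1), the angle alpha, and the point x are assumptions made only for the example.

```r
# Illustration only: a rotation of R^3 that fixes theta = (0, 0, 1)
theta <- c(0, 0, 1)
alpha <- pi / 3  # arbitrary rotation angle (hypothetical choice)

# Rotation about the third axis: it rotates the plane orthogonal to theta
O <- rbind(c(cos(alpha), -sin(alpha), 0),
           c(sin(alpha),  cos(alpha), 0),
           c(0,           0,          1))

# O is a rotation: orthogonal with determinant 1
stopifnot(isTRUE(all.equal(crossprod(O), diag(3))))
stopifnot(isTRUE(all.equal(det(O), 1)))

# O fixes theta
fixed <- drop(O %*% theta)

# A point on the sphere stays on the sphere and keeps its cosine with theta
x <- c(1, 2, 2) / 3
Ox <- drop(O %*% x)
c(norm_Ox = sqrt(sum(Ox^2)), cos_x = sum(x * theta), cos_Ox = sum(Ox * theta))
```

Rotational symmetry about theta requires the distribution of the data to be unchanged under every such rotation, not only the single one constructed here.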

Usage

test_rotasym(data, theta = spherical_mean, type = c("sc", "loc", "loc_vMF",
  "hyb", "hyb_vMF")[5], Fisher = FALSE, U = NULL, V = NULL)

Arguments

data

hyperspherical data, a matrix of size c(n, p) with unit norm rows. Normalized internally if any row does not have unit norm (with a warning message). NAs are ignored.

theta

either a unit norm vector of size p giving the axis of rotational symmetry (for the specified-\boldsymbol{\theta} case) or a function that implements an estimator \hat{\boldsymbol{\theta}} of \boldsymbol{\theta} (for the unspecified-\boldsymbol{\theta} case). The default calls the spherical_mean function. See examples.

type

a character string (case insensitive) indicating the type of test to conduct:

  • "sc": "scatter" test based on the statistic Q_{\boldsymbol{\theta}}^{\mathrm{sc}}. Evaluates if the covariance matrix of the multivariate signs is isotropic.

  • "loc": "location" test based on the statistic Q_{\boldsymbol{\theta}}^{\mathrm{loc}}. Evaluates if the expectation of the multivariate signs is zero.

  • "loc_vMF": adapted "location" test, based on the statistic Q_{\mathrm{vMF}}^{\mathrm{loc}}.

  • "hyb": "hybrid" test based on the statistics Q_{\boldsymbol{\theta}}^{\mathrm{sc}} and Q_{\boldsymbol{\theta}}^{\mathrm{loc}}.

  • "hyb_vMF" (default): adapted "hybrid" test based on the statistics Q_{\boldsymbol{\theta}}^{\mathrm{sc}} and Q_{\mathrm{vMF}}^{\mathrm{loc}}.

See the details below for further explanations of the tests.

Fisher

if TRUE, then Fisher's method is employed to aggregate the scatter and location tests in the hybrid test (see details below). Otherwise, the hybrid statistic is the sum of the scatter and location statistics. Defaults to FALSE.

U

multivariate signs of data, a matrix of size c(n, p - 1). Computed if NULL (the default).

V

cosines of data, a vector of size n. Computed if NULL (the default).
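
To make the function form of the theta argument concrete, here is a minimal base R sketch of a user-defined estimator. The name my_spherical_mean and the toy data are hypothetical; the sketch only assumes that an estimator takes the data matrix and returns a unit-norm vector of size p, as the description of theta states.

```r
# Hypothetical estimator: normalized sample mean of the rows of `data`
my_spherical_mean <- function(data) {
  m <- colMeans(data)
  m / sqrt(sum(m^2))
}

# Toy data on S^2: Gaussian points shifted along the third axis,
# then projected onto the unit sphere (illustration only)
set.seed(1)
X <- matrix(rnorm(150), ncol = 3)
X[, 3] <- X[, 3] + 3
X <- X / sqrt(rowSums(X^2))

# Estimated location: a unit-norm vector of size p = 3
theta_hat <- my_spherical_mean(X)
theta_hat

# With the rotasym package loaded, the function itself (not its value)
# would be passed, e.g.:
# test_rotasym(data = X, theta = my_spherical_mean, type = "sc")
```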

Details

Descriptions of the tests:

  • The "scatter" test is locally and asymptotically optimal against tangent elliptical alternatives to rotational symmetry. However, it is not consistent against tangent von Mises–Fisher (vMF) alternatives. The asymptotic null distribution of Q_{\boldsymbol{\theta}}^{\mathrm{sc}} is unaffected if \boldsymbol{\theta} is estimated, that is, the asymptotic null distributions of Q_{\boldsymbol{\theta}}^{\mathrm{sc}} and Q_{\hat{\boldsymbol{\theta}}}^{\mathrm{sc}} are the same.

  • The "location" test is locally and asymptotically most powerful against vMF alternatives to rotational symmetry. However, it is not consistent against tangent elliptical alternatives. The asymptotic null distribution of Q_{\boldsymbol{\theta}}^{\mathrm{loc}} for known \boldsymbol{\theta} (the one implemented in test_rotasym) does change if \boldsymbol{\theta} is estimated by \hat{\boldsymbol{\theta}}. Therefore, if the test is performed with an estimated \boldsymbol{\theta} (that is, if theta is a function), Q_{\hat{\boldsymbol{\theta}}}^{\mathrm{loc}} will not be properly calibrated. test_rotasym gives a warning in such a case.

  • The "vMF location" test is a modification of the "location" test designed to make its asymptotic null distribution invariant to the estimation of \boldsymbol{\theta} (as is the case for the "scatter" test). The test is optimal against tangent vMF alternatives with a specific, vMF-based, angular function g_{\mathrm{vMF}}. Despite not being optimal against all tangent vMF alternatives, it is consistent against all of them. Like the "location" test, it is not consistent against tangent elliptical alternatives.

  • The "hybrid" test combines the "scatter" and "location" tests (the combination is described below). The test is optimal against neither tangent elliptical nor tangent vMF alternatives, but it is consistent against both. Since it is based on the "location" test, the test statistic will not be properly calibrated if computed with an estimator \hat{\boldsymbol{\theta}}. test_rotasym gives a warning in such a case.

  • The "vMF hybrid" test is the analogue of the "hybrid" test, with the "location" test replaced by the "vMF location" test.
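
The ideas behind the "location" test (is the mean of the signs zero?) and the "scatter" test (is the covariance of the signs isotropic?) can be sketched in base R with the classical Rayleigh-type and Bingham-type sign statistics. The function names loc_stat and sc_stat are hypothetical, and that these formulas match the package's Q^{loc} and Q^{sc} in their exact normalization is an assumption, not something stated on this page.

```r
# Sketch only: U is an n x q matrix of unit-norm signs, with q = p - 1 >= 2

# Rayleigh-type statistic: large when the mean of the signs is far from zero
loc_stat <- function(U) {
  n <- nrow(U)
  q <- ncol(U)
  n * q * sum(colMeans(U)^2)
}

# Bingham-type statistic: large when the second-moment matrix of the signs
# is far from the isotropic matrix I_q / q
sc_stat <- function(U) {
  n <- nrow(U)
  q <- ncol(U)
  S <- crossprod(U) / n  # q x q second-moment matrix, trace 1
  n * q * (q + 2) / 2 * (sum(S^2) - 1 / q)  # sum(S^2) = tr(S^2), S symmetric
}

# Perfectly balanced signs on the circle: both statistics vanish
U_bal <- rbind(c(1, 0), c(-1, 0), c(0, 1), c(0, -1))
c(loc = loc_stat(U_bal), sc = sc_stat(U_bal))

# Signs concentrated at one point: both statistics are large
U_one <- matrix(c(1, 0), nrow = 4, ncol = 2, byrow = TRUE)
c(loc = loc_stat(U_one), sc = sc_stat(U_one))

# Classical chi-square calibration of these two statistics (assumed here):
# q degrees of freedom for location, (q - 1) * (q + 2) / 2 for scatter
q <- ncol(U_one)
p_loc <- pchisq(loc_stat(U_one), df = q, lower.tail = FALSE)
p_sc <- pchisq(sc_stat(U_one), df = (q - 1) * (q + 2) / 2, lower.tail = FALSE)
```

The toy matrices have only four rows, so the p-values here carry no inferential meaning; they only show how the statistics would be referred to a chi-square distribution.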

The combination of the scatter and location tests in the hybrid tests is done in two different ways:

  • If Fisher = FALSE, then the scatter and location test statistics are added to give the hybrid test statistic

    Q_{\boldsymbol{\theta}}^{\mathrm{hyb}}:=Q_{\boldsymbol{\theta}}^{\mathrm{sc}}+Q_{\boldsymbol{\theta}}^{\mathrm{loc}}.

  • If Fisher = TRUE, then Fisher's method for aggregating independent tests is used (the two test statistics are independent under rotational symmetry), resulting in the hybrid test statistic:

    Q_{\boldsymbol{\theta}}^{\mathrm{hyb}} :=-2(\log(p_{\mathrm{sc}})+\log(p_{\mathrm{loc}}))

    where p_{\mathrm{sc}} and p_{\mathrm{loc}} are the p-values of the scatter and location tests, respectively.

The hybrid test statistic Q_{\mathrm{vMF}}^{\mathrm{hyb}} follows analogously to Q_{\boldsymbol{\theta}}^{\mathrm{hyb}} by replacing Q_{\boldsymbol{\theta}}^{\mathrm{loc}} with Q_{\mathrm{vMF}}^{\mathrm{loc}}.
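
The two combination rules can be compared numerically with a short base R sketch. The statistic values and degrees of freedom below are hypothetical inputs chosen for illustration. The sketch relies on two standard facts: a sum of independent chi-square variables is chi-square with the summed degrees of freedom, and Fisher's statistic for two independent p-values is chi-square with 4 degrees of freedom under the null.

```r
# Hypothetical scatter and location statistics with their chi-square
# degrees of freedom (illustrative values, not package output)
q_sc <- 50.3
df_sc <- 44
q_loc <- 12.1
df_loc <- 9

# Analogue of Fisher = FALSE: add the statistics and the degrees of freedom
q_sum <- q_sc + q_loc
p_sum <- pchisq(q_sum, df = df_sc + df_loc, lower.tail = FALSE)

# Analogue of Fisher = TRUE: combine the two individual p-values
p_sc <- pchisq(q_sc, df = df_sc, lower.tail = FALSE)
p_loc <- pchisq(q_loc, df = df_loc, lower.tail = FALSE)
q_fisher <- -2 * (log(p_sc) + log(p_loc))
p_fisher <- pchisq(q_fisher, df = 4, lower.tail = FALSE)

c(p_sum = p_sum, p_fisher = p_fisher)
```

The two rules generally give different p-values, since the sum weights each statistic by its own scale, whereas Fisher's method weights the two tests equally through their p-values.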

Finally, recall that the tests are designed to test implications of rotational symmetry. Therefore, the tests are not consistent against all types of alternatives to rotational symmetry.

Value

An object of the htest class with the following elements:

  • statistic: test statistic.

  • parameter: degrees of freedom of the chi-square distribution appearing in all the null asymptotic distributions.

  • p.value: p-value of the test.

  • method: information on the type of test performed.

  • data.name: name of the value of data.

  • U: multivariate signs of data.

  • V: cosines of data.
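
The U and V elements can be mimicked in base R along the following lines. This is a sketch under assumptions: the cosines are taken as the projections of the data onto theta, and the signs as the normalized projections onto an orthonormal basis of the orthogonal complement of theta. The helper signs_cosines is hypothetical, and the basis it picks (via a QR decomposition) need not coincide with the one used by the package, so U may differ by an orthogonal transformation.

```r
# Hypothetical helper: cosines and multivariate signs of X about theta
signs_cosines <- function(X, theta) {
  # Orthonormal basis of the orthogonal complement of theta, p x (p - 1)
  Gamma <- qr.Q(qr(theta), complete = TRUE)[, -1, drop = FALSE]
  V <- drop(X %*% theta)       # cosines, a vector of size n
  W <- X %*% Gamma             # coordinates in the complement, n x (p - 1)
  U <- W / sqrt(rowSums(W^2))  # signs with unit-norm rows
  list(U = U, V = V)
}

# Toy data on S^2 (illustration only)
set.seed(42)
X <- matrix(rnorm(30), ncol = 3)
X <- X / sqrt(rowSums(X^2))
theta <- c(0, 0, 1)

res <- signs_cosines(X, theta)
dim(res$U)      # n rows and p - 1 columns
length(res$V)   # n
```

Rows of X equal to theta or -theta have no well-defined sign, since their projection onto the complement is zero; the sketch does not handle that case.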

Author(s)

Eduardo García-Portugués, Davy Paindaveine, and Thomas Verdebout.

References

García-Portugués, E., Paindaveine, D., Verdebout, T. (2020) On optimal tests for rotational symmetry against new classes of hyperspherical distributions. Journal of the American Statistical Association, 115(532):1873–1887. doi:10.1080/01621459.2019.1665527

See Also

tangent-elliptical, tangent-vMF, spherical_mean.

Examples

# Load the package providing test_rotasym and the sampling functions
library(rotasym)

## Rotational symmetry holds

# Sample data from a vMF (rotationally symmetric distribution about mu = theta)
n <- 200
p <- 10
theta <- c(1, rep(0, p - 1))
set.seed(123456789)
data_0 <- r_vMF(n = n, mu = theta, kappa = 1)

# theta known
test_rotasym(data = data_0, theta = theta, type = "sc")
test_rotasym(data = data_0, theta = theta, type = "loc")
test_rotasym(data = data_0, theta = theta, type = "loc_vMF")
test_rotasym(data = data_0, theta = theta, type = "hyb")
test_rotasym(data = data_0, theta = theta, type = "hyb", Fisher = TRUE)
test_rotasym(data = data_0, theta = theta, type = "hyb_vMF")
test_rotasym(data = data_0, theta = theta, type = "hyb_vMF", Fisher = TRUE)

# theta unknown (employs the spherical mean as estimator)
test_rotasym(data = data_0, type = "sc")
test_rotasym(data = data_0, type = "loc") # Warning
test_rotasym(data = data_0, type = "loc_vMF")
test_rotasym(data = data_0, type = "hyb") # Warning
test_rotasym(data = data_0, type = "hyb", Fisher = TRUE) # Warning
test_rotasym(data = data_0, type = "hyb_vMF")
test_rotasym(data = data_0, type = "hyb_vMF", Fisher = TRUE)

## Rotational symmetry does not hold

# Sample non-rotationally symmetric data from a tangent-vMF distribution
# The scatter test is blind to these deviations, while the location tests
# are optimal
n <- 200
p <- 10
theta <- c(1, rep(0, p - 1))
mu <- c(rep(0, p - 2), 1)
kappa <- 2
set.seed(123456789)
r_V <- function(n) {
  r_g_vMF(n = n, p = p, kappa = 1)
}
data_1 <- r_TM(n = n, r_V = r_V, theta = theta, mu = mu, kappa = kappa)

# theta known
test_rotasym(data = data_1, theta = theta, type = "sc")
test_rotasym(data = data_1, theta = theta, type = "loc")
test_rotasym(data = data_1, theta = theta, type = "loc_vMF")
test_rotasym(data = data_1, theta = theta, type = "hyb")
test_rotasym(data = data_1, theta = theta, type = "hyb", Fisher = TRUE)
test_rotasym(data = data_1, theta = theta, type = "hyb_vMF")
test_rotasym(data = data_1, theta = theta, type = "hyb_vMF", Fisher = TRUE)

# theta unknown (employs the spherical mean as estimator)
test_rotasym(data = data_1, type = "sc")
test_rotasym(data = data_1, type = "loc") # Warning
test_rotasym(data = data_1, type = "loc_vMF")
test_rotasym(data = data_1, type = "hyb") # Warning
test_rotasym(data = data_1, type = "hyb", Fisher = TRUE) # Warning
test_rotasym(data = data_1, type = "hyb_vMF")
test_rotasym(data = data_1, type = "hyb_vMF", Fisher = TRUE)

# Sample non-rotationally symmetric data from a tangent-elliptical distribution
# The location tests are blind to these deviations, while the
# scatter test is optimal
n <- 200
p <- 10
theta <- c(1, rep(0, p - 1))
Lambda <- matrix(0.5, nrow = p - 1, ncol = p - 1)
diag(Lambda) <- 1
set.seed(123456789)
r_V <- function(n) {
  r_g_vMF(n = n, p = p, kappa = 1)
}
data_2 <- r_TE(n = n, r_V = r_V, theta = theta, Lambda = Lambda)

# theta known
test_rotasym(data = data_2, theta = theta, type = "sc")
test_rotasym(data = data_2, theta = theta, type = "loc")
test_rotasym(data = data_2, theta = theta, type = "loc_vMF")
test_rotasym(data = data_2, theta = theta, type = "hyb")
test_rotasym(data = data_2, theta = theta, type = "hyb", Fisher = TRUE)
test_rotasym(data = data_2, theta = theta, type = "hyb_vMF")
test_rotasym(data = data_2, theta = theta, type = "hyb_vMF", Fisher = TRUE)

# theta unknown (employs the spherical mean as estimator)
test_rotasym(data = data_2, type = "sc")
test_rotasym(data = data_2, type = "loc") # Warning
test_rotasym(data = data_2, type = "loc_vMF")
test_rotasym(data = data_2, type = "hyb") # Warning
test_rotasym(data = data_2, type = "hyb", Fisher = TRUE) # Warning
test_rotasym(data = data_2, type = "hyb_vMF")
test_rotasym(data = data_2, type = "hyb_vMF", Fisher = TRUE)

## Sunspots births data

# Load data
data("sunspots_births")
sunspots_births$X <-
  cbind(cos(sunspots_births$phi) * cos(sunspots_births$theta),
        cos(sunspots_births$phi) * sin(sunspots_births$theta),
        sin(sunspots_births$phi))

# Test rotational symmetry for the 23rd cycle, specified theta
sunspots_23 <- subset(sunspots_births, cycle == 23)
test_rotasym(data = sunspots_23$X, type = "sc", theta = c(0, 0, 1))
test_rotasym(data = sunspots_23$X, type = "loc", theta = c(0, 0, 1))
test_rotasym(data = sunspots_23$X, type = "hyb", theta = c(0, 0, 1))

# Test rotational symmetry for the 23rd cycle, unspecified theta
spherical_loc_PCA(sunspots_23$X)
test_rotasym(data = sunspots_23$X, type = "sc", theta = spherical_loc_PCA)
test_rotasym(data = sunspots_23$X, type = "loc_vMF",
             theta = spherical_loc_PCA)
test_rotasym(data = sunspots_23$X, type = "hyb_vMF",
             theta = spherical_loc_PCA)

# Test rotational symmetry for the 22nd cycle, specified theta
sunspots_22 <- subset(sunspots_births, cycle == 22)
test_rotasym(data = sunspots_22$X, type = "sc", theta = c(0, 0, 1))
test_rotasym(data = sunspots_22$X, type = "loc", theta = c(0, 0, 1))
test_rotasym(data = sunspots_22$X, type = "hyb", theta = c(0, 0, 1))

# Test rotational symmetry for the 22nd cycle, unspecified theta
spherical_loc_PCA(sunspots_22$X)
test_rotasym(data = sunspots_22$X, type = "sc", theta = spherical_loc_PCA)
test_rotasym(data = sunspots_22$X, type = "loc_vMF",
             theta = spherical_loc_PCA)
test_rotasym(data = sunspots_22$X, type = "hyb_vMF",
             theta = spherical_loc_PCA)

rotasym documentation built on Aug. 19, 2023, 9:06 a.m.