unif_stat_distr: Null distributions for circular and (hyper)spherical...

View source: R/unif_stat_distr.R

unif_stat_distrR Documentation

Null distributions for circular and (hyper)spherical uniformity statistics

Description

Approximate computation of the null distributions of several statistics for assessing uniformity on the (hyper)sphere S^{p-1}:=\{{\bf x}\in R^p:||{\bf x}||=1\}, p\ge 2. The approximation is done either by means of the asymptotic distribution or by Monte Carlo.

Usage

unif_stat_distr(x, type, p, n, approx = "asymp", M = 10000,
  stats_MC = NULL, Rothman_t = 1/3, Pycke_q = 0.5, Riesz_s = 1,
  CCF09_dirs = NULL, CJ12_reg = 3, CJ12_beta = 0, Stephens = FALSE,
  K_Kuiper = 25, K_Watson = 25, K_Watson_1976 = 5, K_Ajne = 500,
  K_CCF09 = 25, K_max = 10000, ...)

Arguments

x

evaluation points for the null distribution(s). Either a vector of size nx, if the evaluation points are common for the tests in type, or a matrix of size c(nx, length(type)) with columns containing the evaluation points for each test. Must not contain NA's.

type

type of test to be applied. A character vector containing any of the following types of tests, depending on the dimension p:

  • Circular data: any of the names available at object avail_cir_tests.

  • (Hyper)spherical data: any of the names available at object avail_sph_tests.

If type = "all" (default), then type is set as avail_cir_tests or avail_sph_tests, depending on the value of p.

p

integer giving the dimension of the ambient space R^p that contains S^{p-1}.

n

sample size employed for computing the statistic.

approx

type of approximation to the null distribution, either "asymp" (default) for employing the asymptotic null distribution, if available, or "MC", for employing the Monte Carlo approximation of the exact null distribution.

M

number of Monte Carlo replications for approximating the null distribution when approx = "MC". Also, number of Monte Carlo samples for approximating the asymptotic distributions based on weighted sums of chi squared random variables. Defaults to 1e4.

stats_MC

a data frame of size c(M, length(type)), with column names containing the character vector type, that results from extracting $stats_MC from a call to unif_stat_MC. If provided, the computation of Monte Carlo statistics when approx = "MC" is skipped. stats_MC is checked internally to see if it is sorted. Internally computed if NULL (default).

Rothman_t

t parameter for the Rothman test, a real in (0, 1). Defaults to 1 / 3.

Pycke_q

q parameter for the Pycke "q-test", a real in (0, 1). Defaults to 1 / 2.

Riesz_s

s parameter for the s-Riesz test, a real in (0, 2). Defaults to 1.

CCF09_dirs

a matrix of size c(n_proj, p) containing n_proj random directions (in Cartesian coordinates) on S^{p-1} to perform the CCF09 test. If NULL (default), a sample of size n_proj = 50 directions is computed internally.

CJ12_reg

type of asymptotic regime for CJ12 test, either 1 (sub-exponential regime), 2 (exponential), or 3 (super-exponential; default).

CJ12_beta

\beta parameter in the exponential regime of CJ12 test, a positive real.

Stephens

compute Stephens (1970) modification so that the null distribution of the is less dependent on the sample size? The modification does not alter the test decision.

K_Kuiper, K_Watson, K_Watson_1976, K_Ajne

integer giving the truncation of the series present in the null asymptotic distributions. For the Kolmogorov-Smirnov-related series defaults to 25.

K_CCF09

integer giving the truncation of the series present in the asymptotic distribution of the Kolmogorov-Smirnov statistic. Defaults to 5e2.

K_max

integer giving the truncation of the series that compute the asymptotic p-value of a Sobolev test. Defaults to 1e4.

...

if approx = "MC", optional performance parameters to be passed to
unif_stat_MC: chunks, cores, and seed.

Details

When approx = "asymp", statistics that do not have an implemented or known asymptotic are omitted, and a warning is generated.

For Sobolev tests, K_max = 1e4 produces probabilities uniformly accurate with three digits for the "PCvM", "PAD", and "PRt" tests, for dimensions p \le 11. With K_max = 5e4, these probabilities are uniformly accurate in the fourth digit. With K_max = 1e3, only two-digit uniform accuracy is obtained. Uniform accuracy deteriorates when p increases, e.g., a digit accuracy is lost when p = 51.

Descriptions and references on most of the asymptotic distributions are available in García-Portugués and Verdebout (2018).

Value

A data frame of size c(nx, length(type)), with column names given by type, that contains the values of the null distributions of the statistics evaluated at x.

References

García-Portugués, E. and Verdebout, T. (2018) An overview of uniformity tests on the hypersphere. arXiv:1804.00286. https://arxiv.org/abs/1804.00286.

Examples

## Asymptotic distribution

# Circular statistics
x <- seq(0, 1, l = 5)
unif_stat_distr(x = x, type = "Kuiper", p = 2, n = 10)
unif_stat_distr(x = x, type = c("Ajne", "Kuiper"), p = 2, n = 10)
unif_stat_distr(x = x, type = c("Ajne", "Kuiper"), p = 2, n = 10, K_Ajne = 5)

# All circular statistics
unif_stat_distr(x = x, type = avail_cir_tests, p = 2, n = 10, K_max = 1e3)

# Spherical statistics
unif_stat_distr(x = cbind(x, x + 1), type = c("Rayleigh", "Bingham"),
                p = 3, n = 10)
unif_stat_distr(x = cbind(x, x + 1), type = c("Rayleigh", "Bingham"),
                p = 3, n = 10, M = 100)

# All spherical statistics
unif_stat_distr(x = x, type = avail_sph_tests, p = 3, n = 10, K_max = 1e3)

## Monte Carlo distribution

# Circular statistics
x <- seq(0, 5, l = 10)
unif_stat_distr(x = x, type = avail_cir_tests, p = 2, n = 10, approx = "MC")
unif_stat_distr(x = x, type = "Kuiper", p = 2, n = 10, approx = "MC")
unif_stat_distr(x = x, type = c("Ajne", "Kuiper"), p = 2, n = 10,
                approx = "MC")

# Spherical statistics
unif_stat_distr(x = x, type = avail_sph_tests, p = 3, n = 10,
                approx = "MC")
unif_stat_distr(x = cbind(x, x + 1), type = c("Rayleigh", "Bingham"),
                p = 3, n = 10, approx = "MC")
unif_stat_distr(x = cbind(x, x + 1), type = c("Rayleigh", "Bingham"),
                p = 3, n = 10, approx = "MC")

## Specific arguments

# Rothman
unif_stat_distr(x = x, type = "Rothman", p = 2, n = 10, Rothman_t = 0.5,
                approx = "MC")

# CCF09
dirs <- r_unif_sph(n = 5, p = 3, M = 1)[, , 1]
x <- seq(0, 1, l = 10)
unif_stat_distr(x = x, type = "CCF09", p = 3, n = 10, approx = "MC",
                CCF09_dirs = dirs)
unif_stat_distr(x = x, type = "CCF09", p = 3, n = 10, approx = "MC")

# CJ12
unif_stat_distr(x = x, type = "CJ12", p = 3, n = 100, CJ12_reg = 3)
unif_stat_distr(x = x, type = "CJ12", p = 3, n = 100, CJ12_reg = 2,
               CJ12_beta = 0.01)
unif_stat_distr(x = x, type = "CJ12", p = 3, n = 100, CJ12_reg = 1)


sphunif documentation built on Aug. 21, 2023, 9:11 a.m.