unif_stat_distr: Null distributions for circular and (hyper)spherical...
In sphunif: Uniformity Tests on the Circle, Sphere, and Hypersphere

unif_stat_distr

R Documentation

Null distributions for circular and (hyper)spherical uniformity statistics

Description

Approximate computation of the null distributions of several statistics for assessing uniformity on the (hyper)sphere S^{p-1}:=\{{\bf x}\in R^p:||{\bf x}||=1\}, p\ge 2. The approximation is done either by means of the asymptotic distribution or by Monte Carlo.

Usage

unif_stat_distr(x, type, p, n, approx = "asymp", M = 10000,
  stats_MC = NULL, K_max = 10000, method = "I", Stephens = FALSE,
  CCF09_dirs = NULL, CJ12_beta = 0, CJ12_reg = 3, cov_a = 2 * pi,
  Cressie_t = 1/3, K_Ajne = 500, K_CCF09 = 25, K_Kuiper = 25,
  K_Watson = 25, K_Watson_1976 = 5, Poisson_rho = 0.5, Pycke_q = 0.5,
  Rayleigh_m = 1, Riesz_s = 1, Rothman_t = 1/3, Sobolev_vk2 = c(0, 0,
  1), Softmax_kappa = 1, Stereo_a = 0, ...)

Arguments

`x`	evaluation points for the null distribution(s). Either a vector of size `nx`, if the evaluation points are common for the tests in `type`, or a matrix of size `c(nx, length(type))` with columns containing the evaluation points for each test. Must not contain `NA`'s.
`type`	type of test to be applied. A character vector containing any of the following types of tests, depending on the dimension `p`: Circular data: any of the names available at object `avail_cir_tests`. (Hyper)spherical data: any of the names available at object `avail_sph_tests`. If `type = "all"` (default), then `type` is set as `avail_cir_tests` or `avail_sph_tests`, depending on the value of `p`.
`p`	integer giving the dimension of the ambient space `R^p` that contains `S^{p-1}`.
`n`	sample size employed for computing the statistic.
`approx`	type of approximation to the null distribution, either `"asymp"` (default) for employing the asymptotic null distribution, if available, or `"MC"`, for employing the Monte Carlo approximation of the exact null distribution.
`M`	number of Monte Carlo replications for approximating the null distribution when `approx = "MC"`. Also, number of Monte Carlo samples for approximating the asymptotic distributions based on weighted sums of chi squared random variables. Defaults to `1e4`.
`stats_MC`	a data frame of size `c(M, length(type))`, with column names containing the character vector `type`, that results from extracting `$stats_MC` from a call to `unif_stat_MC`. If provided, the computation of Monte Carlo statistics when `approx = "MC"` is skipped. `stats_MC` is checked internally to see if it is sorted. Internally computed if `NULL` (default).
`K_max`	integer giving the truncation of the series that compute the asymptotic p-value of a Sobolev test. Defaults to `1e4`.
`method`	method for approximating the density, distribution, or quantile function of the weighted sum of chi squared random variables. Must be `"I"` (Imhof), `"SW"` (Satterthwaite–Welch), `"HBE"` (Hall–Buckley–Eagleson), or `"MC"` (Monte Carlo; only for distribution or quantile functions). Defaults to `"I"`.
`Stephens`	compute Stephens (1970) modification so that the null distribution of the is less dependent on the sample size? The modification does not alter the test decision.
`CCF09_dirs`	a matrix of size `c(n_proj, p)` containing `n_proj` random directions (in Cartesian coordinates) on `S^{p-1}` to perform the CCF09 test. If `NULL` (default), a sample of size `n_proj = 50` directions is computed internally.
`CJ12_beta`	`\beta` parameter in the exponential regime of CJ12 test, a positive real.
`CJ12_reg`	type of asymptotic regime for CJ12 test, either `1` (sub-exponential regime), `2` (exponential), or `3` (super-exponential; default).
`cov_a`	`a_n = a / n` parameter used in the length of the arcs of the coverage-based tests. Must be positive. Defaults to `2 * pi`.
`Cressie_t`	`t` parameter for the Cressie test, a real in `(0, 1)`. Defaults to `1 / 3`.
`K_CCF09`	integer giving the truncation of the series present in the asymptotic distribution of the Kolmogorov-Smirnov statistic. Defaults to `25`.
`K_Kuiper`, `K_Watson`, `K_Watson_1976`, `K_Ajne`	integer giving the truncation of the series present in the null asymptotic distributions. For the Kolmogorov-Smirnov-related series defaults to `25`.
`Poisson_rho`	`\rho` parameter for the Poisson test, a real in `[0, 1)`. Defaults to `0.5`.
`Pycke_q`	`q` parameter for the Pycke "`q`-test", a real in `(0, 1)`. Defaults to `1 / 2`.
`Rayleigh_m`	integer `m` for the `m`-modal Rayleigh test. Defaults to `m = 1` (the standard Rayleigh test).
`Riesz_s`	`s` parameter for the `s`-Riesz test, a real in `(0, 2)`. Defaults to `1`.
`Rothman_t`	`t` parameter for the Rothman test, a real in `(0, 1)`. Defaults to `1 / 3`.
`Sobolev_vk2`	weights for the finite Sobolev test. A non-negative vector or matrix. Defaults to `c(0, 0, 1)`.
`Softmax_kappa`	`\kappa` parameter for the Softmax test, a non-negative real. Defaults to `1`.
`Stereo_a`	`a` parameter for the Stereo test, a real in `[-1, 1]`. Defaults to `0`.
`...`	if `approx = "MC"`, optional performance parameters to be passed to `unif_stat_MC`: `chunks`, `cores`, and `seed`.

Details

When approx = "asymp", statistics that do not have an implemented or known asymptotic are omitted, and a warning is generated.

For Sobolev tests, K_max = 1e4 produces probabilities uniformly accurate with three digits for the "PCvM", "PAD", and "PRt" tests, for dimensions p \le 11. With K_max = 5e4, these probabilities are uniformly accurate in the fourth digit. With K_max = 1e3, only two-digit uniform accuracy is obtained. Uniform accuracy deteriorates when p increases, e.g., a digit accuracy is lost when p = 51.

Descriptions and references on most of the asymptotic distributions are available in García-Portugués and Verdebout (2018).

Value

A data frame of size c(nx, length(type)), with column names given by type, that contains the values of the null distributions of the statistics evaluated at x.

References

García-Portugués, E. and Verdebout, T. (2018) An overview of uniformity tests on the hypersphere. arXiv:1804.00286. \Sexpr[results=rd]{tools:::Rd_expr_doi("10.48550/arXiv.1804.00286")}.

Examples

## Asymptotic distribution

# Circular statistics
x <- seq(0, 1, l = 5)
unif_stat_distr(x = x, type = "Kuiper", p = 2, n = 10)
unif_stat_distr(x = x, type = c("Ajne", "Kuiper"), p = 2, n = 10)
unif_stat_distr(x = x, type = c("Ajne", "Kuiper"), p = 2, n = 10, K_Ajne = 5)

# All circular statistics
unif_stat_distr(x = x, type = avail_cir_tests, p = 2, n = 10, K_max = 1e3)

# Spherical statistics
unif_stat_distr(x = cbind(x, x + 1), type = c("Rayleigh", "Bingham"),
                p = 3, n = 10)
unif_stat_distr(x = cbind(x, x + 1), type = c("Rayleigh", "Bingham"),
                p = 3, n = 10, M = 100)

# All spherical statistics
unif_stat_distr(x = x, type = avail_sph_tests, p = 3, n = 10, K_max = 1e3)

## Monte Carlo distribution

# Circular statistics
x <- seq(0, 5, l = 10)
unif_stat_distr(x = x, type = avail_cir_tests, p = 2, n = 10, approx = "MC")
unif_stat_distr(x = x, type = "Kuiper", p = 2, n = 10, approx = "MC")
unif_stat_distr(x = x, type = c("Ajne", "Kuiper"), p = 2, n = 10,
                approx = "MC")

# Spherical statistics
unif_stat_distr(x = x, type = avail_sph_tests, p = 3, n = 10,
                approx = "MC")
unif_stat_distr(x = cbind(x, x + 1), type = c("Rayleigh", "Bingham"),
                p = 3, n = 10, approx = "MC")
unif_stat_distr(x = cbind(x, x + 1), type = c("Rayleigh", "Bingham"),
                p = 3, n = 10, approx = "MC")

## Specific arguments

# Rothman
unif_stat_distr(x = x, type = "Rothman", p = 2, n = 10, Rothman_t = 0.5,
                approx = "MC")

# CCF09
dirs <- r_unif_sph(n = 5, p = 3, M = 1)[, , 1]
x <- seq(0, 1, l = 10)
unif_stat_distr(x = x, type = "CCF09", p = 3, n = 10, approx = "MC",
                CCF09_dirs = dirs)
unif_stat_distr(x = x, type = "CCF09", p = 3, n = 10, approx = "MC")

# CJ12
unif_stat_distr(x = x, type = "CJ12", p = 3, n = 100, CJ12_reg = 3)
unif_stat_distr(x = x, type = "CJ12", p = 3, n = 100, CJ12_reg = 2,
               CJ12_beta = 0.01)
unif_stat_distr(x = x, type = "CJ12", p = 3, n = 100, CJ12_reg = 1)

## Sobolev

x <- seq(0, 1, l = 10)
vk2 <- diag(1, nrow = 3)
unif_stat_distr(x = x, type = "Sobolev", approx = "asymp", p = 3, n = 100,
                Sobolev_vk2 = vk2)
sapply(1:3, function(i)
  unif_stat_distr(x = x, type = "Sobolev", approx = "asymp", p = 3, n = 100,
                  Sobolev_vk2 = vk2[i, ])$Sobolev)
sapply(1:3, function(i)
  unif_stat_distr(x = x, type = "Sobolev", approx = "MC", p = 3, n = 100,
                  Sobolev_vk2 = vk2[i, ], M = 1e3)$Sobolev)
unif_stat_distr(x = x, type = "Sobolev", approx = "MC", p = 3, n = 100,
                Sobolev_vk2 = vk2, M = 1e3)

sphunif documentation built on May 29, 2024, 4:19 a.m.