unif_stat_distr: Null distributions for circular and (hyper)spherical...

View source: R/unif_stat_distr.R

unif_stat_distrR Documentation

Null distributions for circular and (hyper)spherical uniformity statistics

Description

Approximate computation of the null distributions of several statistics for assessing uniformity on the (hyper)sphere S^{p-1}:=\{{\bf x}\in R^p:||{\bf x}||=1\}, p\ge 2. The approximation is done either by means of the asymptotic distribution or by Monte Carlo.

Usage

unif_stat_distr(x, type, p, n, approx = "asymp", M = 10000,
  stats_MC = NULL, K_max = 10000, method = "I", Stephens = FALSE,
  CCF09_dirs = NULL, CJ12_beta = 0, CJ12_reg = 3, cov_a = 2 * pi,
  Cressie_t = 1/3, K_Ajne = 500, K_CCF09 = 25, K_Kuiper = 25,
  K_Watson = 25, K_Watson_1976 = 5, Poisson_rho = 0.5, Pycke_q = 0.5,
  Rayleigh_m = 1, Riesz_s = 1, Rothman_t = 1/3, Sobolev_vk2 = c(0, 0,
  1), Softmax_kappa = 1, Stereo_a = 0, ...)

Arguments

x

evaluation points for the null distribution(s). Either a vector of size nx, if the evaluation points are common for the tests in type, or a matrix of size c(nx, length(type)) with columns containing the evaluation points for each test. Must not contain NA's.

type

type of test to be applied. A character vector containing any of the following types of tests, depending on the dimension p:

  • Circular data: any of the names available at object avail_cir_tests.

  • (Hyper)spherical data: any of the names available at object avail_sph_tests.

If type = "all" (default), then type is set as avail_cir_tests or avail_sph_tests, depending on the value of p.

p

integer giving the dimension of the ambient space R^p that contains S^{p-1}.

n

sample size employed for computing the statistic.

approx

type of approximation to the null distribution, either "asymp" (default) for employing the asymptotic null distribution, if available, or "MC", for employing the Monte Carlo approximation of the exact null distribution.

M

number of Monte Carlo replications for approximating the null distribution when approx = "MC". Also, number of Monte Carlo samples for approximating the asymptotic distributions based on weighted sums of chi squared random variables. Defaults to 1e4.

stats_MC

a data frame of size c(M, length(type)), with column names containing the character vector type, that results from extracting $stats_MC from a call to unif_stat_MC. If provided, the computation of Monte Carlo statistics when approx = "MC" is skipped. stats_MC is checked internally to see if it is sorted. Internally computed if NULL (default).

K_max

integer giving the truncation of the series that compute the asymptotic p-value of a Sobolev test. Defaults to 1e4.

method

method for approximating the density, distribution, or quantile function of the weighted sum of chi squared random variables. Must be "I" (Imhof), "SW" (Satterthwaite–Welch), "HBE" (Hall–Buckley–Eagleson), or "MC" (Monte Carlo; only for distribution or quantile functions). Defaults to "I".

Stephens

compute Stephens (1970) modification so that the null distribution of the is less dependent on the sample size? The modification does not alter the test decision.

CCF09_dirs

a matrix of size c(n_proj, p) containing n_proj random directions (in Cartesian coordinates) on S^{p-1} to perform the CCF09 test. If NULL (default), a sample of size n_proj = 50 directions is computed internally.

CJ12_beta

\beta parameter in the exponential regime of CJ12 test, a positive real.

CJ12_reg

type of asymptotic regime for CJ12 test, either 1 (sub-exponential regime), 2 (exponential), or 3 (super-exponential; default).

cov_a

a_n = a / n parameter used in the length of the arcs of the coverage-based tests. Must be positive. Defaults to 2 * pi.

Cressie_t

t parameter for the Cressie test, a real in (0, 1). Defaults to 1 / 3.

K_CCF09

integer giving the truncation of the series present in the asymptotic distribution of the Kolmogorov-Smirnov statistic. Defaults to 25.

K_Kuiper, K_Watson, K_Watson_1976, K_Ajne

integer giving the truncation of the series present in the null asymptotic distributions. For the Kolmogorov-Smirnov-related series defaults to 25.

Poisson_rho

\rho parameter for the Poisson test, a real in [0, 1). Defaults to 0.5.

Pycke_q

q parameter for the Pycke "q-test", a real in (0, 1). Defaults to 1 / 2.

Rayleigh_m

integer m for the m-modal Rayleigh test. Defaults to m = 1 (the standard Rayleigh test).

Riesz_s

s parameter for the s-Riesz test, a real in (0, 2). Defaults to 1.

Rothman_t

t parameter for the Rothman test, a real in (0, 1). Defaults to 1 / 3.

Sobolev_vk2

weights for the finite Sobolev test. A non-negative vector or matrix. Defaults to c(0, 0, 1).

Softmax_kappa

\kappa parameter for the Softmax test, a non-negative real. Defaults to 1.

Stereo_a

a parameter for the Stereo test, a real in [-1, 1]. Defaults to 0.

...

if approx = "MC", optional performance parameters to be passed to
unif_stat_MC: chunks, cores, and seed.

Details

When approx = "asymp", statistics that do not have an implemented or known asymptotic are omitted, and a warning is generated.

For Sobolev tests, K_max = 1e4 produces probabilities uniformly accurate with three digits for the "PCvM", "PAD", and "PRt" tests, for dimensions p \le 11. With K_max = 5e4, these probabilities are uniformly accurate in the fourth digit. With K_max = 1e3, only two-digit uniform accuracy is obtained. Uniform accuracy deteriorates when p increases, e.g., a digit accuracy is lost when p = 51.

Descriptions and references on most of the asymptotic distributions are available in García-Portugués and Verdebout (2018).

Value

A data frame of size c(nx, length(type)), with column names given by type, that contains the values of the null distributions of the statistics evaluated at x.

References

García-Portugués, E. and Verdebout, T. (2018) An overview of uniformity tests on the hypersphere. arXiv:1804.00286. \Sexpr[results=rd]{tools:::Rd_expr_doi("10.48550/arXiv.1804.00286")}.

Examples

## Asymptotic distribution

# Circular statistics
x <- seq(0, 1, l = 5)
unif_stat_distr(x = x, type = "Kuiper", p = 2, n = 10)
unif_stat_distr(x = x, type = c("Ajne", "Kuiper"), p = 2, n = 10)
unif_stat_distr(x = x, type = c("Ajne", "Kuiper"), p = 2, n = 10, K_Ajne = 5)

# All circular statistics
unif_stat_distr(x = x, type = avail_cir_tests, p = 2, n = 10, K_max = 1e3)

# Spherical statistics
unif_stat_distr(x = cbind(x, x + 1), type = c("Rayleigh", "Bingham"),
                p = 3, n = 10)
unif_stat_distr(x = cbind(x, x + 1), type = c("Rayleigh", "Bingham"),
                p = 3, n = 10, M = 100)

# All spherical statistics
unif_stat_distr(x = x, type = avail_sph_tests, p = 3, n = 10, K_max = 1e3)

## Monte Carlo distribution

# Circular statistics
x <- seq(0, 5, l = 10)
unif_stat_distr(x = x, type = avail_cir_tests, p = 2, n = 10, approx = "MC")
unif_stat_distr(x = x, type = "Kuiper", p = 2, n = 10, approx = "MC")
unif_stat_distr(x = x, type = c("Ajne", "Kuiper"), p = 2, n = 10,
                approx = "MC")

# Spherical statistics
unif_stat_distr(x = x, type = avail_sph_tests, p = 3, n = 10,
                approx = "MC")
unif_stat_distr(x = cbind(x, x + 1), type = c("Rayleigh", "Bingham"),
                p = 3, n = 10, approx = "MC")
unif_stat_distr(x = cbind(x, x + 1), type = c("Rayleigh", "Bingham"),
                p = 3, n = 10, approx = "MC")

## Specific arguments

# Rothman
unif_stat_distr(x = x, type = "Rothman", p = 2, n = 10, Rothman_t = 0.5,
                approx = "MC")

# CCF09
dirs <- r_unif_sph(n = 5, p = 3, M = 1)[, , 1]
x <- seq(0, 1, l = 10)
unif_stat_distr(x = x, type = "CCF09", p = 3, n = 10, approx = "MC",
                CCF09_dirs = dirs)
unif_stat_distr(x = x, type = "CCF09", p = 3, n = 10, approx = "MC")

# CJ12
unif_stat_distr(x = x, type = "CJ12", p = 3, n = 100, CJ12_reg = 3)
unif_stat_distr(x = x, type = "CJ12", p = 3, n = 100, CJ12_reg = 2,
               CJ12_beta = 0.01)
unif_stat_distr(x = x, type = "CJ12", p = 3, n = 100, CJ12_reg = 1)

## Sobolev

x <- seq(0, 1, l = 10)
vk2 <- diag(1, nrow = 3)
unif_stat_distr(x = x, type = "Sobolev", approx = "asymp", p = 3, n = 100,
                Sobolev_vk2 = vk2)
sapply(1:3, function(i)
  unif_stat_distr(x = x, type = "Sobolev", approx = "asymp", p = 3, n = 100,
                  Sobolev_vk2 = vk2[i, ])$Sobolev)
sapply(1:3, function(i)
  unif_stat_distr(x = x, type = "Sobolev", approx = "MC", p = 3, n = 100,
                  Sobolev_vk2 = vk2[i, ], M = 1e3)$Sobolev)
unif_stat_distr(x = x, type = "Sobolev", approx = "MC", p = 3, n = 100,
                Sobolev_vk2 = vk2, M = 1e3)


sphunif documentation built on May 29, 2024, 4:19 a.m.