cir_stat: Statistics for testing circular uniformity

cir_stat_Kuiper R Documentation

Description

Low-level implementation of several statistics for assessing circular uniformity on [0, 2\pi) or, equivalently, S^1 := \{\mathbf{x} \in \mathbb{R}^2 : \|\mathbf{x}\| = 1\}.

Usage

cir_stat_Kuiper(Theta, sorted = FALSE, KS = FALSE, Stephens = FALSE)

cir_stat_Watson(Theta, sorted = FALSE, CvM = FALSE, Stephens = FALSE)

cir_stat_Watson_1976(Theta, sorted = FALSE, minus = FALSE)

cir_stat_Range(Theta, sorted = FALSE, gaps_in_Theta = FALSE,
  max_gap = TRUE)

cir_stat_Rao(Theta, sorted = FALSE, gaps_in_Theta = FALSE)

cir_stat_Greenwood(Theta, sorted = FALSE, gaps_in_Theta = FALSE)

cir_stat_Log_gaps(Theta, sorted = FALSE, gaps_in_Theta = FALSE,
  abs_val = TRUE)

cir_stat_Vacancy(Theta, a = 2 * pi, sorted = FALSE,
  gaps_in_Theta = FALSE)

cir_stat_Max_uncover(Theta, a = 2 * pi, sorted = FALSE,
  gaps_in_Theta = FALSE)

cir_stat_Num_uncover(Theta, a = 2 * pi, sorted = FALSE,
  gaps_in_Theta = FALSE, minus_val = TRUE)

cir_stat_Gini(Theta, sorted = FALSE, gaps_in_Theta = FALSE)

cir_stat_Gini_squared(Theta, sorted = FALSE, gaps_in_Theta = FALSE)

cir_stat_Ajne(Theta, Psi_in_Theta = FALSE)

cir_stat_Rothman(Theta, t = 1/3, Psi_in_Theta = FALSE)

cir_stat_Hodges_Ajne(Theta, asymp_std = FALSE, sorted = FALSE,
  use_Cressie = TRUE)

cir_stat_Cressie(Theta, t = 1/3, sorted = FALSE)

cir_stat_FG01(Theta, sorted = FALSE)

cir_stat_Rayleigh(Theta, m = 1L)

cir_stat_Bingham(Theta)

cir_stat_Hermans_Rasson(Theta, Psi_in_Theta = FALSE)

cir_stat_Gine_Gn(Theta, Psi_in_Theta = FALSE)

cir_stat_Gine_Fn(Theta, Psi_in_Theta = FALSE)

cir_stat_Pycke(Theta, Psi_in_Theta = FALSE)

cir_stat_Pycke_q(Theta, Psi_in_Theta = FALSE, q = 0.5)

cir_stat_Bakshaev(Theta, Psi_in_Theta = FALSE)

cir_stat_Riesz(Theta, Psi_in_Theta = FALSE, s = 1)

cir_stat_PCvM(Theta, Psi_in_Theta = FALSE)

cir_stat_PRt(Theta, t = 1/3, Psi_in_Theta = FALSE)

cir_stat_PAD(Theta, Psi_in_Theta = FALSE, AD = FALSE, sorted = FALSE)

cir_stat_CCF09(Theta, dirs, K_CCF09 = 25L, original = FALSE)

Arguments

Theta

a matrix of size c(n, M) with M samples of size n of circular data on [0, 2\pi). Must not contain NA's.

sorted

are the columns of Theta sorted increasingly? If TRUE, performance is improved. If FALSE (default), each column of Theta is sorted internally.
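
As a minimal sketch of the expected input, assuming only base R (the package's own sampler appears in the Examples), a matrix with M samples in its columns could be built and pre-sorted as follows. The object names Theta_sorted and all_sorted are illustrative.

```r
# Illustrative input construction using base R only
set.seed(1)
n <- 50  # sample size
M <- 3   # number of samples

# Each column is one sample of n angles in [0, 2 * pi)
Theta <- matrix(runif(n * M, min = 0, max = 2 * pi), nrow = n, ncol = M)

# Sort each column increasingly, as required when sorted = TRUE
Theta_sorted <- apply(Theta, 2, sort)

# Verify that no column is unsorted
all_sorted <- !any(apply(Theta_sorted, 2, is.unsorted))
```

Sorting once and reusing Theta_sorted avoids repeating the sort when several statistics are computed on the same samples.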

KS

compute the Kolmogorov-Smirnov statistic (which is not invariant under origin shifts) instead of the Kuiper statistic? Defaults to FALSE.
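
The difference between both statistics can be illustrated with a base R sketch of their classical, unscaled forms. The helper kuiper_sketch is hypothetical and is not the package's implementation, which may apply a different normalization (for example, a factor depending on n).

```r
# Hypothetical helper: classical Kuiper (V_n = D_n^+ + D_n^-) and
# Kolmogorov-Smirnov (max(D_n^+, D_n^-)) statistics for one sample
kuiper_sketch <- function(theta, KS = FALSE) {
  n <- length(theta)
  U <- sort(theta) / (2 * pi)               # map angles to [0, 1)
  D_plus <- max(seq_len(n) / n - U)         # largest excess of the ecdf
  D_minus <- max(U - (seq_len(n) - 1) / n)  # largest deficit of the ecdf
  if (KS) max(D_plus, D_minus) else D_plus + D_minus
}

set.seed(1)
theta <- runif(30, 0, 2 * pi)
rotated <- (theta + 1.234) %% (2 * pi)  # shift the origin

# Kuiper: equal up to rounding error
c(kuiper_sketch(theta), kuiper_sketch(rotated))

# Kolmogorov-Smirnov: generally changes with the origin
c(kuiper_sketch(theta, KS = TRUE), kuiper_sketch(rotated, KS = TRUE))
```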

Stephens

compute the Stephens (1970) modification so that the null distribution of the statistic is less dependent on the sample size? The modification does not alter the test decision. Defaults to FALSE.

CvM

compute the Cramér-von Mises statistic (which is not invariant under origin shifts) instead of the Watson statistic? Defaults to FALSE.
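
A base R sketch of the classical forms of both statistics shows the origin invariance of the Watson statistic. The helper watson_sketch is hypothetical and is not the package's implementation, whose scaling may differ.

```r
# Hypothetical helper: classical Watson U^2 and Cramér-von Mises W^2
# statistics for one sample
watson_sketch <- function(theta, CvM = FALSE) {
  n <- length(theta)
  U <- sort(theta) / (2 * pi)  # map angles to [0, 1)
  # Cramér-von Mises statistic against the uniform distribution on [0, 1)
  W2 <- sum((U - (2 * seq_len(n) - 1) / (2 * n))^2) + 1 / (12 * n)
  # Watson subtracts the squared deviation of the mean from 1 / 2
  if (CvM) W2 else W2 - n * (mean(U) - 0.5)^2
}

set.seed(1)
theta <- runif(30, 0, 2 * pi)
rotated <- (theta + 1.234) %% (2 * pi)  # shift the origin

# Watson: equal up to rounding error
c(watson_sketch(theta), watson_sketch(rotated))

# Cramér-von Mises: generally changes with the origin
c(watson_sketch(theta, CvM = TRUE), watson_sketch(rotated, CvM = TRUE))
```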

minus

compute the invariant D_n^- instead of D_n^+? Defaults to FALSE.

gaps_in_Theta

does Theta contain the matrix of circular gaps that is obtained with cir_gaps(Theta)? If FALSE (default), the circular gaps are computed internally.
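
For intuition, the circular gaps of a sample are the spacings between consecutive sorted angles, including the wrap-around spacing that closes the circle. The base R helper below, circular_gaps, is a hypothetical sketch of that definition. It is not the package's cir_gaps, and the row ordering of its output is an assumption.

```r
# Hypothetical helper (not part of the package): circular gaps per column
circular_gaps <- function(Theta) {
  Theta <- apply(Theta, 2, sort)  # sort each sample increasingly
  n <- nrow(Theta)
  # Consecutive differences, then the wrap-around gap from the largest
  # angle back to the smallest one
  rbind(
    diff(Theta),
    2 * pi - Theta[n, , drop = FALSE] + Theta[1, , drop = FALSE]
  )
}

set.seed(1)
Theta <- matrix(runif(20 * 2, 0, 2 * pi), nrow = 20, ncol = 2)
gaps <- circular_gaps(Theta)

# The gaps of each sample add up to the full circle, 2 * pi
colSums(gaps)
```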

max_gap

compute the maximum gap for the range statistic? If TRUE (default), rejection happens for large values of the statistic, which is consistent with the rest of the tests. Otherwise, the minimum gap is computed and rejection happens for low values.

abs_val

return the absolute value of Darling's log gaps statistic? If TRUE (default), rejection happens for large values of the statistic, which is consistent with the rest of the tests. Otherwise, the signed statistic is computed and rejection happens for large absolute values.

a

a_n = a / n parameter used in the length of the arcs of the coverage-based tests. Must be positive. Defaults to 2 * pi.

minus_val

return the negative value of the (standardized) number of uncovered spacings? If TRUE (default), rejection happens for large values of the statistic, which is consistent with the rest of the tests. Otherwise, rejection happens for low values.

Psi_in_Theta

does Theta contain the shortest angles matrix \boldsymbol\Psi that is obtained with Psi_mat(array(Theta, dim = c(n, 1, M)))? If FALSE (default), \boldsymbol\Psi is computed internally.
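
The shortest angle between two points with angles \theta_i and \theta_j is \arccos(\cos(\theta_i - \theta_j)), a value in [0, \pi]. The base R helper below, shortest_angles, is a hypothetical sketch of that pairwise quantity for a single sample. It returns a full n x n matrix for readability and does not reproduce the storage layout of the output of Psi_mat.

```r
# Hypothetical helper (not part of the package): pairwise shortest angles
shortest_angles <- function(theta) {
  d <- abs(outer(theta, theta, "-"))  # |theta_i - theta_j| in [0, 2 * pi)
  pi - abs(pi - d)                    # fold onto [0, pi]
}

theta <- c(0.1, 3, 6.2)
Psi_full <- shortest_angles(theta)

# The pair (0.1, 6.2) is close across the origin: 2 * pi - 6.1
Psi_full[1, 3]
```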

t

t parameter for the Rothman and Cressie tests, a real in (0, 1). Defaults to 1 / 3.

asymp_std

normalize the Hodges-Ajne statistic in terms of its asymptotic distribution? Defaults to FALSE.

use_Cressie

compute the Hodges-Ajne statistic as a particular case of the Cressie statistic? Defaults to TRUE as it is more efficient. If FALSE, the geometric construction in Ajne (1968) is employed.

m

integer m for the m-modal Rayleigh test. Defaults to m = 1 (the standard Rayleigh test).
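
The role of m can be seen in a base R sketch of the classical m-modal Rayleigh statistic, 2n times the squared length of the mean of (\cos(m\theta), \sin(m\theta)). The helper rayleigh_sketch is hypothetical, and the scaling used by the package is assumed rather than confirmed here.

```r
# Hypothetical helper: classical m-modal Rayleigh statistic per column
rayleigh_sketch <- function(Theta, m = 1L) {
  n <- nrow(Theta)
  C <- colMeans(cos(m * Theta))  # mean cosine of the m-th harmonic
  S <- colMeans(sin(m * Theta))  # mean sine of the m-th harmonic
  2 * n * (C^2 + S^2)
}

# A sample concentrated on two antipodal points: 0 and pi
Theta <- matrix(rep(c(0, pi), each = 10), ncol = 1)

rayleigh_sketch(Theta, m = 1)  # essentially zero: no unimodal signal
rayleigh_sketch(Theta, m = 2)  # 2 * n = 40: strong bimodal signal
```

With m = 2, antipodal points are mapped to the same direction, so the statistic is sensitive to axial (bimodal) departures from uniformity that the standard test misses.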

q

q parameter for the Pycke "q-test", a real in (0, 1). Defaults to 1 / 2.

s

s parameter for the s-Riesz test, a real in (0, 2). Defaults to 1.

AD

compute the Anderson-Darling statistic (which is not invariant under origin shifts) instead of the Projected Anderson-Darling statistic? Defaults to FALSE.

dirs

a matrix of size c(n_proj, 2) containing n_proj random directions (in Cartesian coordinates) on S^1 to perform the CCF09 test.
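
A matrix of this shape can be sketched in base R by drawing uniform angles and mapping them to Cartesian coordinates. This is an illustrative construction under that assumption; the Examples use the package's own sampler instead.

```r
# Illustrative construction of random unit directions on S^1 (base R only)
set.seed(1)
n_proj <- 3
phi <- runif(n_proj, 0, 2 * pi)    # uniform angles
dirs <- cbind(cos(phi), sin(phi))  # one direction per row

# Every row has unit norm, as the test requires
rowSums(dirs^2)
```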

K_CCF09

integer giving the truncation of the series present in the asymptotic distribution of the Kolmogorov-Smirnov statistic. Defaults to 25.

original

return the CCF09 statistic as originally defined? If FALSE (default), a faster and equivalent statistic is computed, and rejection happens for large values of the statistic, which is consistent with the rest of the tests. Otherwise, rejection happens for low values.

Details

Descriptions and references for most of the statistics are available in García-Portugués and Verdebout (2018).

The statistics cir_stat_PCvM and cir_stat_PRt are provided for the sake of completeness, but they equal the more efficiently implemented statistics 2 * cir_stat_Watson and cir_stat_Rothman, respectively.

Value

A matrix of size c(M, 1) containing the statistics for each of the M samples.

Warning

Avoid the following incorrect usages of the functions, which will produce spurious results:

  • The entries of Theta are not in [0, 2\pi).

  • Theta does not contain the circular gaps when gaps_in_Theta = TRUE.

  • Theta is not sorted increasingly when sorted = TRUE.

  • Theta does not contain Psi_mat(array(Theta, dim = c(n, 1, M))) when
    Psi_in_Theta = TRUE.

  • The directions in dirs do not have unit norm.
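
Some of these conditions can be checked before calling the low-level functions. The base R helper below, check_cir_input, is a hypothetical sketch and not part of the package. It covers only the range, sortedness, and unit-norm conditions, and it assumes Theta holds angles (not gaps or shortest angles).

```r
# Hypothetical helper (not part of the package): flag some bad usages
check_cir_input <- function(Theta, sorted = FALSE, dirs = NULL, tol = 1e-8) {
  stopifnot(is.matrix(Theta), is.numeric(Theta), !anyNA(Theta))
  problems <- character(0)
  # Angles must lie in [0, 2 * pi)
  if (any(Theta < 0 | Theta >= 2 * pi)) {
    problems <- c(problems, "entries of Theta outside [0, 2 * pi)")
  }
  # Columns must be increasing when sorted = TRUE is going to be passed
  if (sorted && any(apply(Theta, 2, is.unsorted))) {
    problems <- c(problems, "columns of Theta not sorted increasingly")
  }
  # Directions must have unit norm (up to the tolerance tol)
  if (!is.null(dirs) && any(abs(rowSums(dirs^2) - 1) > tol)) {
    problems <- c(problems, "rows of dirs without unit norm")
  }
  problems
}

# An input violating the three checked conditions
Theta_bad <- matrix(c(0.5, 0.2, 7), ncol = 1)
check_cir_input(Theta_bad, sorted = TRUE, dirs = rbind(c(1, 1)))
```

An empty character vector means that none of the checked conditions was violated.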

References

García-Portugués, E. and Verdebout, T. (2018) An overview of uniformity tests on the hypersphere. arXiv:1804.00286. https://arxiv.org/abs/1804.00286.

Examples

## Sample uniform circular data

M <- 2
n <- 100
set.seed(987202226)
Theta <- r_unif_cir(n = n, M = M)

## Tests based on the empirical cumulative distribution function

# Kuiper
cir_stat_Kuiper(Theta)
cir_stat_Kuiper(Theta, Stephens = TRUE)

# Watson
cir_stat_Watson(Theta)
cir_stat_Watson(Theta, Stephens = TRUE)

# Watson (1976)
cir_stat_Watson_1976(Theta)

## Partition-based tests

# Ajne
Theta_array <- Theta
dim(Theta_array) <- c(nrow(Theta), 1, ncol(Theta))
Psi <- Psi_mat(Theta_array)
cir_stat_Ajne(Theta)
cir_stat_Ajne(Psi, Psi_in_Theta = TRUE)

# Rothman
cir_stat_Rothman(Theta, t = 0.5)
cir_stat_Rothman(Theta)
cir_stat_Rothman(Psi, Psi_in_Theta = TRUE)

# Hodges-Ajne
cir_stat_Hodges_Ajne(Theta)
cir_stat_Hodges_Ajne(Theta, use_Cressie = FALSE)

# Cressie
cir_stat_Cressie(Theta, t = 0.5)
cir_stat_Cressie(Theta)

# FG01
cir_stat_FG01(Theta)

## Spacings-based tests

# Range
cir_stat_Range(Theta)

# Rao
cir_stat_Rao(Theta)

# Greenwood
cir_stat_Greenwood(Theta)

# Log gaps
cir_stat_Log_gaps(Theta)

# Vacancy
cir_stat_Vacancy(Theta)

# Maximum uncovered spacing
cir_stat_Max_uncover(Theta)

# Number of uncovered spacings
cir_stat_Num_uncover(Theta)

# Gini mean difference
cir_stat_Gini(Theta)

# Gini mean squared difference
cir_stat_Gini_squared(Theta)

## Sobolev tests

# Rayleigh
cir_stat_Rayleigh(Theta)
cir_stat_Rayleigh(Theta, m = 2)

# Bingham
cir_stat_Bingham(Theta)

# Hermans-Rasson
cir_stat_Hermans_Rasson(Theta)
cir_stat_Hermans_Rasson(Psi, Psi_in_Theta = TRUE)

# Gine Fn
cir_stat_Gine_Fn(Theta)
cir_stat_Gine_Fn(Psi, Psi_in_Theta = TRUE)

# Gine Gn
cir_stat_Gine_Gn(Theta)
cir_stat_Gine_Gn(Psi, Psi_in_Theta = TRUE)

# Pycke
cir_stat_Pycke(Theta)
cir_stat_Pycke(Psi, Psi_in_Theta = TRUE)

# Pycke q
cir_stat_Pycke_q(Theta)
cir_stat_Pycke_q(Psi, Psi_in_Theta = TRUE)

# Bakshaev
cir_stat_Bakshaev(Theta)
cir_stat_Bakshaev(Psi, Psi_in_Theta = TRUE)

# Riesz
cir_stat_Riesz(Theta, s = 1)
cir_stat_Riesz(Psi, Psi_in_Theta = TRUE, s = 1)

# Projected Cramér-von Mises
cir_stat_PCvM(Theta)
cir_stat_PCvM(Psi, Psi_in_Theta = TRUE)

# Projected Rothman
cir_stat_PRt(Theta, t = 0.5)
cir_stat_PRt(Theta)
cir_stat_PRt(Psi, Psi_in_Theta = TRUE)

# Projected Anderson-Darling
cir_stat_PAD(Theta)
cir_stat_PAD(Psi, Psi_in_Theta = TRUE)

## Other tests

# CCF09
dirs <- r_unif_sph(n = 3, p = 2, M = 1)[, , 1]
cir_stat_CCF09(Theta, dirs = dirs)

## Connection of Kuiper and Watson statistics with KS and CvM, respectively

# Rotate sample for KS and CvM
alpha <- seq(0, 2 * pi, length.out = 1e4)
KS_alpha <- sapply(alpha, function(a) {
  cir_stat_Kuiper((Theta[, 2, drop = FALSE] + a) %% (2 * pi), KS = TRUE)
})
CvM_alpha <- sapply(alpha, function(a) {
  cir_stat_Watson((Theta[, 2, drop = FALSE] + a) %% (2 * pi), CvM = TRUE)
})
AD_alpha <- sapply(alpha, function(a) {
  cir_stat_PAD((Theta[, 2, drop = FALSE] + a) %% (2 * pi), AD = TRUE)
})

# Kuiper is the maximum rotated KS
plot(alpha, KS_alpha, type = "l")
abline(h = cir_stat_Kuiper(Theta[, 2, drop = FALSE]), col = 2)
points(alpha[which.max(KS_alpha)], max(KS_alpha), col = 2, pch = 16)

# Watson is the minimum rotated CvM
plot(alpha, CvM_alpha, type = "l")
abline(h = cir_stat_Watson(Theta[, 2, drop = FALSE]), col = 2)
points(alpha[which.min(CvM_alpha)], min(CvM_alpha), col = 2, pch = 16)

# Is the Projected Anderson-Darling the average rotated AD?
plot(alpha, AD_alpha, type = "l")
abline(h = cir_stat_PAD(Theta[, 2, drop = FALSE]), col = 2)
abline(h = mean(AD_alpha), col = 3)

sphunif documentation built on Aug. 21, 2023, 9:11 a.m.