cir_stat: Statistics for testing circular uniformity
In sphunif: Uniformity Tests on the Circle, Sphere, and Hypersphere

Description

Low-level implementation of several statistics for assessing circular uniformity on [0, 2\pi) or, equivalently, \mathbb{S}^1 := \{\mathbf{x} \in \mathbb{R}^2 : \|\mathbf{x}\| = 1\}.
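The equivalence between angles on [0, 2\pi) and points on S^1 can be illustrated with a minimal base-R sketch. It relies only on the standard polar parametrization (cos(theta), sin(theta)) and does not call any function of the package; the variable names are illustrative.

```r
# Map angles in [0, 2 * pi) to unit vectors on S^1 and back (base R only)
theta <- c(0, pi / 2, pi, 3 * pi / 2)

# Cartesian coordinates: one row per observation, each row has unit norm
x <- cbind(cos(theta), sin(theta))

# atan2() returns angles in (-pi, pi], so wrap them back to [0, 2 * pi)
theta_back <- atan2(x[, 2], x[, 1]) %% (2 * pi)
```

Here `x` holds the points on the circle and `theta_back` recovers the original angles up to floating-point error.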

Usage

cir_stat_Kuiper(Theta, sorted = FALSE, KS = FALSE, Stephens = FALSE)

cir_stat_Watson(Theta, sorted = FALSE, CvM = FALSE, Stephens = FALSE)

cir_stat_Watson_1976(Theta, sorted = FALSE, minus = FALSE)

cir_stat_Range(Theta, sorted = FALSE, gaps_in_Theta = FALSE,
  max_gap = TRUE)

cir_stat_Rao(Theta, sorted = FALSE, gaps_in_Theta = FALSE)

cir_stat_Greenwood(Theta, sorted = FALSE, gaps_in_Theta = FALSE)

cir_stat_Log_gaps(Theta, sorted = FALSE, gaps_in_Theta = FALSE,
  abs_val = TRUE)

cir_stat_Vacancy(Theta, a = 2 * pi, sorted = FALSE,
  gaps_in_Theta = FALSE)

cir_stat_Max_uncover(Theta, a = 2 * pi, sorted = FALSE,
  gaps_in_Theta = FALSE)

cir_stat_Num_uncover(Theta, a = 2 * pi, sorted = FALSE,
  gaps_in_Theta = FALSE, minus_val = TRUE)

cir_stat_Gini(Theta, sorted = FALSE, gaps_in_Theta = FALSE)

cir_stat_Gini_squared(Theta, sorted = FALSE, gaps_in_Theta = FALSE)

cir_stat_Ajne(Theta, Psi_in_Theta = FALSE)

cir_stat_Rothman(Theta, Psi_in_Theta = FALSE, t = 1/3)

cir_stat_Hodges_Ajne(Theta, asymp_std = FALSE, sorted = FALSE,
  use_Cressie = TRUE)

cir_stat_Cressie(Theta, t = 1/3, sorted = FALSE)

cir_stat_FG01(Theta, sorted = FALSE)

cir_stat_Rayleigh(Theta, m = 1L)

cir_stat_Bingham(Theta)

cir_stat_Hermans_Rasson(Theta, Psi_in_Theta = FALSE)

cir_stat_Gine_Gn(Theta, Psi_in_Theta = FALSE)

cir_stat_Gine_Fn(Theta, Psi_in_Theta = FALSE)

cir_stat_Pycke(Theta, Psi_in_Theta = FALSE)

cir_stat_Pycke_q(Theta, Psi_in_Theta = FALSE, q = 0.5)

cir_stat_Bakshaev(Theta, Psi_in_Theta = FALSE)

cir_stat_Riesz(Theta, Psi_in_Theta = FALSE, s = 1)

cir_stat_PCvM(Theta, Psi_in_Theta = FALSE)

cir_stat_PRt(Theta, Psi_in_Theta = FALSE, t = 1/3)

cir_stat_PAD(Theta, Psi_in_Theta = FALSE, AD = FALSE, sorted = FALSE)

cir_stat_Poisson(Theta, Psi_in_Theta = FALSE, rho = 0.5)

cir_stat_Softmax(Theta, Psi_in_Theta = FALSE, kappa = 1)

cir_stat_CCF09(Theta, dirs, K_CCF09 = 25L, original = FALSE)

Arguments

`Theta`	a matrix of size `c(n, M)` with `M` samples of size `n` of circular data on `[0, 2\pi)`. Must not contain `NA`s.
`sorted`	are the columns of `Theta` sorted increasingly? If `TRUE`, performance is improved. If `FALSE` (default), each column of `Theta` is sorted internally.
`KS`	compute the Kolmogorov-Smirnov statistic (which is not invariant under origin shifts) instead of the Kuiper statistic? Defaults to `FALSE`.
`Stephens`	compute the Stephens (1970) modification so that the null distribution of the statistic is less dependent on the sample size? The modification does not alter the test decision. Defaults to `FALSE`.
`CvM`	compute the Cramér-von Mises statistic (which is not invariant under origin shifts) instead of the Watson statistic? Defaults to `FALSE`.
`minus`	compute the invariant `D_n^-` instead of `D_n^+`? Defaults to `FALSE`.
`gaps_in_Theta`	does `Theta` contain the matrix of circular gaps that is obtained with `cir_gaps(Theta)`? If `FALSE` (default), the circular gaps are computed internally.
`max_gap`	compute the maximum gap for the range statistic? If `TRUE` (default), rejection happens for large values of the statistic, which is consistent with the rest of the tests. Otherwise, the minimum gap is computed and rejection happens for low values.
`abs_val`	return the absolute value of Darling's log gaps statistic? If `TRUE` (default), rejection happens for large values of the statistic, which is consistent with the rest of the tests. Otherwise, the signed statistic is computed and rejection happens for large absolute values.
`a`	either: (i) the `a_n = a / n` parameter used in the length of the arcs of the coverage-based tests, which must be positive and defaults to `2 * pi`; or (ii) the `a` parameter for the Stereo test, a real in `[-1, 1]` that defaults to `0`.
`minus_val`	return the negative value of the (standardized) number of uncovered spacings? If `TRUE` (default), rejection happens for large values of the statistic, which is consistent with the rest of the tests. Otherwise, rejection happens for low values.
`Psi_in_Theta`	does `Theta` contain the shortest angles matrix `\boldsymbol\Psi` that is obtained with `Psi_mat(array(Theta, dim = c(n, 1, M)))`? If `FALSE` (default), `\boldsymbol\Psi` is computed internally.
`t`	`t` parameter for the Rothman and Cressie tests, a real in `(0, 1)`. Defaults to `1 / 3`.
`asymp_std`	normalize the Hodges-Ajne statistic in terms of its asymptotic distribution? Defaults to `FALSE`.
`use_Cressie`	compute the Hodges-Ajne statistic as a particular case of the Cressie statistic? Defaults to `TRUE` as it is more efficient. If `FALSE`, the geometric construction in Ajne (1968) is employed.
`m`	integer `m` for the `m`-modal Rayleigh test. Defaults to `m = 1` (the standard Rayleigh test).
`q`	`q` parameter for the Pycke "`q`-test", a real in `(0, 1)`. Defaults to `1 / 2`.
`s`	`s` parameter for the `s`-Riesz test, a real in `(0, 2)`. Defaults to `1`.
`AD`	compute the Anderson-Darling statistic (which is not invariant under origin shifts) instead of the Projected Anderson-Darling statistic? Defaults to `FALSE`.
`rho`	`\rho` parameter for the Poisson test, a real in `[0, 1)`. Defaults to `0.5`.
`kappa`	`\kappa` parameter for the Softmax test, a non-negative real. Defaults to `1`.
`dirs`	a matrix of size `c(n_proj, 2)` containing `n_proj` random directions (in Cartesian coordinates) on `S^1` to perform the CCF09 test.
`K_CCF09`	integer giving the truncation of the series present in the asymptotic distribution of the Kolmogorov-Smirnov statistic. Defaults to `25`.
`original`	return the CCF09 statistic as originally defined? If `FALSE` (default), a faster and equivalent statistic is computed, and rejection happens for large values of the statistic, which is consistent with the rest of the tests. Otherwise, rejection happens for low values.
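The `KS` argument above notes that the Kolmogorov-Smirnov statistic is not invariant under origin shifts, whereas the Kuiper statistic is. The following base-R sketch illustrates this with the textbook definitions of both statistics. The helper `kuiper_ks_sketch` is hypothetical and not part of the package, and the sqrt(n) scaling is an assumption that may differ from the normalization used by `cir_stat_Kuiper`.

```r
# Hypothetical helper (not part of sphunif): textbook Kuiper and KS
# statistics for one sample of angles in [0, 2 * pi), scaled by sqrt(n)
# (the scaling is an assumption)
kuiper_ks_sketch <- function(theta) {
  n <- length(theta)
  u <- sort(theta) / (2 * pi)               # map to [0, 1) and sort
  d_plus <- max(seq_len(n) / n - u)         # sup of F_n - F
  d_minus <- max(u - (seq_len(n) - 1) / n)  # sup of F - F_n
  c(Kuiper = sqrt(n) * (d_plus + d_minus),
    KS = sqrt(n) * max(d_plus, d_minus))
}

set.seed(1)
theta <- runif(50, 0, 2 * pi)

# Shift the origin by 1 radian and wrap back to [0, 2 * pi)
shifted <- (theta + 1) %% (2 * pi)

orig_stats <- kuiper_ks_sketch(theta)
shift_stats <- kuiper_ks_sketch(shifted)
```

Comparing `orig_stats` with `shift_stats` shows that the Kuiper entry agrees up to floating-point error, while the KS entry generally changes with the shift.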

Details

Descriptions and references for most of the statistics are available in García-Portugués and Verdebout (2018).

The statistics cir_stat_PCvM and cir_stat_PRt are provided for the sake of completeness, but they equal the more efficiently implemented statistics 2 * cir_stat_Watson and cir_stat_Rothman, respectively.

Value

A matrix of size c(M, 1) containing the statistics for each of the M samples.
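To make the return shape concrete, the sketch below computes one statistic per column of `Theta` and returns a matrix of size c(M, 1). The helper `rayleigh_sketch` is hypothetical: it uses 2 * n * Rbar^2, one common form of the Rayleigh statistic, which is an assumption and not necessarily the normalization of `cir_stat_Rayleigh`.

```r
# Hypothetical helper (not part of sphunif): one statistic per column of
# Theta, returned as a matrix of size c(M, 1)
rayleigh_sketch <- function(Theta) {
  n <- nrow(Theta)
  # Squared mean resultant length of each sample (column)
  r_bar_sq <- colMeans(cos(Theta))^2 + colMeans(sin(Theta))^2
  matrix(2 * n * r_bar_sq, ncol = 1)
}

# Two samples of size 10: a fully concentrated one and an equispaced one
Theta <- cbind(rep(0, 10), seq(0, 2 * pi, length.out = 11)[-11])
res <- rayleigh_sketch(Theta)

# drop() turns the c(M, 1) matrix into a plain vector of length M
res_vec <- drop(res)
```

The concentrated sample gives the largest possible value, 2 * n, and the equispaced sample gives a value that is zero up to floating-point error.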

Warning

Take care to avoid the following misuses of the functions, which produce spurious results:

The entries of Theta are not in [0, 2\pi).
Theta does not contain the circular gaps when gaps_in_Theta = TRUE.
Theta is not sorted increasingly when sorted = TRUE.
Theta does not contain Psi_mat(array(Theta, dim = c(n, 1, M))) when Psi_in_Theta = TRUE.
The directions in dirs do not have unit norm.
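Some of the conditions above can be checked before calling the statistics. The following is a minimal base-R sketch; `check_cir_inputs` and its `tol` argument are hypothetical and not part of the package. It assumes that the circular gaps of each sample are non-negative and add up to 2 * pi, and it does not attempt to validate a shortest-angles matrix.

```r
# Hypothetical helper (not part of sphunif): basic sanity checks on the
# inputs; stops with an error when a check fails
check_cir_inputs <- function(Theta, sorted = FALSE, gaps_in_Theta = FALSE,
                             dirs = NULL, tol = 1e-8) {
  stopifnot(is.matrix(Theta), !anyNA(Theta))
  if (gaps_in_Theta) {
    # Assumption: gaps are non-negative and sum to 2 * pi in each column
    stopifnot(all(Theta >= 0), all(abs(colSums(Theta) - 2 * pi) < tol))
  } else {
    # Angles must lie in [0, 2 * pi)
    stopifnot(all(Theta >= 0 & Theta < 2 * pi))
    if (sorted) {
      # Each column must be sorted increasingly
      stopifnot(all(apply(Theta, 2, function(x) !is.unsorted(x))))
    }
  }
  if (!is.null(dirs)) {
    # Each row of dirs must have unit Euclidean norm
    stopifnot(all(abs(rowSums(dirs^2) - 1) < tol))
  }
  invisible(TRUE)
}

set.seed(1)
Theta <- matrix(runif(20, 0, 2 * pi), nrow = 10)

# Passes: columns sorted and within [0, 2 * pi)
ok <- check_cir_inputs(apply(Theta, 2, sort), sorted = TRUE)

# Fails: angles shifted outside [0, 2 * pi)
bad <- tryCatch(check_cir_inputs(Theta + 2 * pi), error = function(e) FALSE)
```

Running such checks once, outside any simulation loop, keeps the low-level functions fast while still guarding against the misuses listed above.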

References

García-Portugués, E. and Verdebout, T. (2018) An overview of uniformity tests on the hypersphere. arXiv:1804.00286. doi:10.48550/arXiv.1804.00286.

Examples

## Sample uniform circular data

M <- 2
n <- 100
set.seed(987202226)
Theta <- r_unif_cir(n = n, M = M)

## Tests based on the empirical cumulative distribution function

# Kuiper
cir_stat_Kuiper(Theta)
cir_stat_Kuiper(Theta, Stephens = TRUE)

# Watson
cir_stat_Watson(Theta)
cir_stat_Watson(Theta, Stephens = TRUE)

# Watson (1976)
cir_stat_Watson_1976(Theta)

## Partition-based tests

# Ajne
Theta_array <- Theta
dim(Theta_array) <- c(nrow(Theta), 1, ncol(Theta))
Psi <- Psi_mat(Theta_array)
cir_stat_Ajne(Theta)
cir_stat_Ajne(Psi, Psi_in_Theta = TRUE)

# Rothman
cir_stat_Rothman(Theta, t = 0.5)
cir_stat_Rothman(Theta)
cir_stat_Rothman(Psi, Psi_in_Theta = TRUE)

# Hodges-Ajne
cir_stat_Hodges_Ajne(Theta)
cir_stat_Hodges_Ajne(Theta, use_Cressie = FALSE)

# Cressie
cir_stat_Cressie(Theta, t = 0.5)
cir_stat_Cressie(Theta)

# FG01
cir_stat_FG01(Theta)

## Spacings-based tests

# Range
cir_stat_Range(Theta)

# Rao
cir_stat_Rao(Theta)

# Greenwood
cir_stat_Greenwood(Theta)

# Log gaps
cir_stat_Log_gaps(Theta)

# Vacancy
cir_stat_Vacancy(Theta)

# Maximum uncovered spacing
cir_stat_Max_uncover(Theta)

# Number of uncovered spacings
cir_stat_Num_uncover(Theta)

# Gini mean difference
cir_stat_Gini(Theta)

# Gini mean squared difference
cir_stat_Gini_squared(Theta)

## Sobolev tests

# Rayleigh
cir_stat_Rayleigh(Theta)
cir_stat_Rayleigh(Theta, m = 2)

# Bingham
cir_stat_Bingham(Theta)

# Hermans-Rasson
cir_stat_Hermans_Rasson(Theta)
cir_stat_Hermans_Rasson(Psi, Psi_in_Theta = TRUE)

# Gine Fn
cir_stat_Gine_Fn(Theta)
cir_stat_Gine_Fn(Psi, Psi_in_Theta = TRUE)

# Gine Gn
cir_stat_Gine_Gn(Theta)
cir_stat_Gine_Gn(Psi, Psi_in_Theta = TRUE)

# Pycke
cir_stat_Pycke(Theta)
cir_stat_Pycke(Psi, Psi_in_Theta = TRUE)

# Pycke q
cir_stat_Pycke_q(Theta)
cir_stat_Pycke_q(Psi, Psi_in_Theta = TRUE)

# Bakshaev
cir_stat_Bakshaev(Theta)
cir_stat_Bakshaev(Psi, Psi_in_Theta = TRUE)

# Riesz
cir_stat_Riesz(Theta, s = 1)
cir_stat_Riesz(Psi, Psi_in_Theta = TRUE, s = 1)

# Projected Cramér-von Mises
cir_stat_PCvM(Theta)
cir_stat_PCvM(Psi, Psi_in_Theta = TRUE)

# Projected Rothman
cir_stat_PRt(Theta, t = 0.5)
cir_stat_PRt(Theta)
cir_stat_PRt(Psi, Psi_in_Theta = TRUE)

# Projected Anderson-Darling
cir_stat_PAD(Theta)
cir_stat_PAD(Psi, Psi_in_Theta = TRUE)

## Other tests

# CCF09
dirs <- r_unif_sph(n = 3, p = 2, M = 1)[, , 1]
cir_stat_CCF09(Theta, dirs = dirs)

## Connection of Kuiper and Watson statistics with KS and CvM, respectively

# Rotate sample for KS and CvM
alpha <- seq(0, 2 * pi, length.out = 1e4)
KS_alpha <- sapply(alpha, function(a) {
  cir_stat_Kuiper((Theta[, 2, drop = FALSE] + a) %% (2 * pi), KS = TRUE)
})
CvM_alpha <- sapply(alpha, function(a) {
  cir_stat_Watson((Theta[, 2, drop = FALSE] + a) %% (2 * pi), CvM = TRUE)
})
AD_alpha <- sapply(alpha, function(a) {
  cir_stat_PAD((Theta[, 2, drop = FALSE] + a) %% (2 * pi), AD = TRUE)
})

# Kuiper is the maximum rotated KS
plot(alpha, KS_alpha, type = "l")
abline(h = cir_stat_Kuiper(Theta[, 2, drop = FALSE]), col = 2)
points(alpha[which.max(KS_alpha)], max(KS_alpha), col = 2, pch = 16)

# Watson is the minimum rotated CvM
plot(alpha, CvM_alpha, type = "l")
abline(h = cir_stat_Watson(Theta[, 2, drop = FALSE]), col = 2)
points(alpha[which.min(CvM_alpha)], min(CvM_alpha), col = 2, pch = 16)

# Is the Projected Anderson-Darling the average rotated AD?
plot(alpha, AD_alpha, type = "l")
abline(h = cir_stat_PAD(Theta[, 2, drop = FALSE]), col = 2)
abline(h = mean(AD_alpha), col = 3)

sphunif documentation built on May 29, 2024, 4:19 a.m.