`sphunif`: Uniformity Tests on the Circle, Sphere, and Hypersphere
In sphunif: Uniformity Tests on the Circle, Sphere, and Hypersphere

options(rmarkdown.html_vignette.check_title = FALSE)
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

Just give me a quick example!

Circular data

Suppose that you want to test if a sample of circular data is uniformly distributed. For example, the following circular uniform sample in radians:

set.seed(987202226)
cir_data <- runif(n = 30, min = 0, max = 2 * pi)

Then you can call the main function in the sphunif package, unif_test, specifying the type of test to be performed. For example, the Watson (omnibus) test:

library(sphunif)
unif_test(data = cir_data, type = "Watson") # An htest object

By default, the calibration of the test statistic is done by Monte Carlo. This can be changed with p_value = "asymp" to use the asymptotic distribution:

unif_test(data = cir_data, type = "Watson", p_value = "MC") # Monte Carlo
unif_test(data = cir_data, type = "Watson", p_value = "asymp") # Asymp. distr.

You can perform several tests within a single call to unif_test. Choose the available circular uniformity tests from

avail_cir_tests

For example, you can use the Projected Anderson--Darling (@Garcia-Portugues2023, also an omnibus test) test and the Watson test:

# A *list* of htest objects
unif_test(data = cir_data, type = c("PAD", "Watson"))

@Garcia-Portugues2018 gives a review of different tests of uniformity on the circle. Section 5.1 in @Pewsey2021 also gives an overview of recent advances.

Spherical data

Suppose now that you want to test if a sample of spherical data is uniformly distributed. Consider a non-uniformly-generated^[Uniformly-distributed polar coordinates do not translate into uniformly-distributed Cartesian coordinates!] sample of directions in Cartesian coordinates, such as:

# Sample data on S^2
set.seed(987202226)
theta <- runif(n = 30, min = 0, max = 2 * pi)
phi <- runif(n = 30, min = 0, max = pi)
sph_data <- cbind(cos(theta) * sin(phi), sin(theta) * sin(phi), cos(phi))

The available spherical uniformity tests:

avail_sph_tests

See again @Garcia-Portugues2018 for a review of tests of uniformity on the sphere and complementary Section 5.1 in @Pewsey2021.

The default type = "all" equals type = avail_sph_tests, which might be too much for standard analysis:

unif_test(data = sph_data, type = "all", p_value = "MC", M = 1e2)
unif_test(data = sph_data, type = "Rayleigh", p_value = "asymp")

Higher dimensions

The hyperspherical setting is treated analogously to the spherical setting, and the available tests are exactly the same (avail_sph_tests). Here is an example of testing uniformity with a sample of size 100 on the $9$-sphere:

# Sample data on S^9
hyp_data <- r_unif_sph(n = 30, p = 10)

# Test
unif_test(data = hyp_data, type = "Rayleigh", p_value = "asymp")

A data example: are Venusian craters uniformly distributed?

The following snippet partially reproduces the data application in @Garcia-Portugues2021 on testing the uniformity of known Venusian craters. Incidentally, it also illustrates how to monitor the progress of a Monte Carlo simulation when p_value = "MC" using progressr.

# Load spherical data
data(venus)
head(venus)
nrow(venus)

# Compute Cartesian coordinates from polar coordinates
venus$X <- cbind(cos(venus$theta) * cos(venus$phi),
                 sin(venus$theta) * cos(venus$phi),
                 sin(venus$phi))

# Test uniformity using the Projected Cramér-von Mises and Projected
# Anderson-Darling tests on S^2, both using asymptotic distributions
unif_test(data = venus$X, type = c("PCvM", "PAD"), p_value = "asymp")

# Define a handler for progressr
require(progress)
require(progressr)
handlers(handler_progress(
  format = paste("(:spin) [:bar] :percent Iter: :current/:total Rate:",
                 ":tick_rate iter/sec ETA: :eta Elapsed: :elapsedfull"),
  clear = FALSE))

# Test uniformity using Monte-Carlo approximated null distributions
with_progress(
  unif_test(data = venus$X, type = c("PCvM", "PAD"),
            p_value = "MC", chunks = 1e2, M = 5e2, cores = 2)
)

Simulation studies done simple

unif_stat allows to compute several statistics to different samples within a single call, thus facilitating Monte Carlo experiments:

# M samples of size n on S^2
samps_sph <- r_unif_sph(n = 30, p = 3, M = 10)

# Apply all statistics to the M samples
unif_stat(data = samps_sph, type = "all")

Additionally, unif_stat_MC is an utility for performing the previous simulation through a convenient parallel wrapper:

# Break the simulation in 10 chunks of tasks to be divided between 2 cores
sim <- unif_stat_MC(n = 30, type = "all", p = 3, M = 1e2, cores = 2,
                    chunks = 10)

# Critical values for test statistics
sim$crit_val_MC

# Test statistics
head(sim$stats_MC)

# Power computation using a pre-built sampler for the alternative
pow <- unif_stat_MC(n = 30, type = "all", p = 3, M = 1e2, cores = 2,
                    chunks = 10, r_H1 = r_alt, crit_val = sim$crit_val_MC,
                    alt = "vMF", kappa = 1)
pow$power_MC

# How to use a custom sampler according to ?unif_stat_MC
r_H1 <- function(n, p, M, l = 1) {

  samp <- array(dim = c(n, p, M))
  for (j in 1:M) {

    samp[, , j] <- mvtnorm::rmvnorm(n = n, mean = c(l, rep(0, p - 1)),
                                    sigma = diag(rep(1, p)))
    samp[, , j] <- samp[, , j] / sqrt(rowSums(samp[, , j]^2))

  }
  return(samp)

}
pow <- unif_stat_MC(n = 30, type = "all", p = 3, M = 1e2, cores = 2,
                    chunks = 5, r_H1 = r_H1, crit_val = sim$crit_val_MC)
pow$power_MC

unif_stat_MC can be used for constructing the Monte Carlo calibration necessary for unif_test, either for providing a rejection rule based on $crit_val_MC or for approximating the $p$-value by $stats_MC.

# Using precomputed critical values
ht1 <- unif_test(data = samps_sph[, , 1], type = "Rayleigh",
                 p_value = "crit_val", crit_val = sim$crit_val_MC)
ht1
ht1$reject

# Using precomputed Monte Carlo statistics
ht2 <- unif_test(data = samps_sph[, , 1], type = "Rayleigh",
                 p_value = "MC", stats_MC = sim$stats_MC)
ht2
ht2$reject

# Faster than
unif_test(data = samps_sph[, , 1], type = "Rayleigh", p_value = "MC")