unif_stat_MC: Monte Carlo simulation of circular and (hyper)spherical...

View source: R/unif_stat_MC.R

unif_stat_MCR Documentation

Monte Carlo simulation of circular and (hyper)spherical uniformity statistics

Description

Utility for performing Monte Carlo simulation of several statistics for assessing uniformity on the (hyper)sphere S^{p-1}:=\{{\bf x}\in R^p:||{\bf x}||=1\}, p\ge 2.

unif_stat_MC provides a convenient wrapper for parallel evaluation of unif_stat, the estimation of critical values under the null distribution, and the computation of empirical powers under the alternative.

Usage

unif_stat_MC(n, type = "all", p, M = 10000, r_H1 = NULL,
  crit_val = NULL, alpha = c(0.1, 0.05, 0.01), return_stats = TRUE,
  stats_sorted = FALSE, chunks = ceiling((n * M)/1e+05), cores = 1,
  seeds = NULL, CCF09_dirs = NULL, CJ12_reg = 3, cov_a = 2 * pi,
  Cressie_t = 1/3, K_CCF09 = 25, Poisson_rho = 0.5, Pycke_q = 0.5,
  Rayleigh_m = 1, Riesz_s = 1, Rothman_t = 1/3, Sobolev_vk2 = c(0, 0,
  1), Softmax_kappa = 1, Stereo_a = 0, ...)

Arguments

n

sample size.

type

type of test to be applied. A character vector containing any of the following types of tests, depending on the dimension p:

  • Circular data: any of the names available at object avail_cir_tests.

  • (Hyper)spherical data: any of the names available at object avail_sph_tests.

If type = "all" (default), then type is set as avail_cir_tests or avail_sph_tests, depending on the value of p.

p

integer giving the dimension of the ambient space R^p that contains S^{p-1}.

M

number of Monte Carlo replications. Defaults to 1e4.

r_H1

if provided, the computation of empirical powers is carried out for the alternative hypothesis sampled with r_H1. This must be a function with the same arguments and value as r_unif_sph (see examples). Defaults to NULL, indicating that the critical values are estimated from samples of r_unif_sph.

crit_val

if provided, must be the critical values as returned by $stats_MC in a call to unif_stat_MC. They are used for computing the empirical powers of the tests present in type. Defaults to NULL, which means that no power computation is done.

alpha

vector with significance levels. Defaults to c(0.10, 0.05, 0.01).

return_stats

return the Monte Carlo statistics? If only the critical values or powers are desired, FALSE saves memory in the returned object. Defaults to TRUE.

stats_sorted

sort the returned Monte Carlo statistics? If TRUE, this is useful for evaluating faster the empirical cumulative distribution function when approximating the distribution in unif_stat_distr. Defaults to FALSE.

chunks

number of chunks to split the M Monte Carlo replications. Useful for parallelizing the simulation study in chunks tasks containing ceiling(M / chunks) replications. Useful also for avoiding memory bottlenecks when M is large. Defaults to
ceiling((n * M) / 1e5).

cores

number of cores to perform the simulation. Defaults to 1.

seeds

if provided, a vector of size chunks for fixing the seeds on each of the simulation chunks (useful for reproducing parallel simulations). Specifically, for k in 1:chunks, seeds are set as set.seed(seeds[k], kind = "Mersenne-Twister") in each chunk. Defaults to NULL (no seed setting is done).

CCF09_dirs

a matrix of size c(n_proj, p) containing n_proj random directions (in Cartesian coordinates) on S^{p-1} to perform the CCF09 test. If NULL (default), a sample of size n_proj = 50 directions is computed internally.

CJ12_reg

type of asymptotic regime for CJ12 test, either 1 (sub-exponential regime), 2 (exponential), or 3 (super-exponential; default).

cov_a

a_n = a / n parameter used in the length of the arcs of the coverage-based tests. Must be positive. Defaults to 2 * pi.

Cressie_t

t parameter for the Cressie test, a real in (0, 1). Defaults to 1 / 3.

K_CCF09

integer giving the truncation of the series present in the asymptotic distribution of the Kolmogorov-Smirnov statistic. Defaults to 25.

Poisson_rho

\rho parameter for the Poisson test, a real in [0, 1). Defaults to 0.5.

Pycke_q

q parameter for the Pycke "q-test", a real in (0, 1). Defaults to 1 / 2.

Rayleigh_m

integer m for the m-modal Rayleigh test. Defaults to m = 1 (the standard Rayleigh test).

Riesz_s

s parameter for the s-Riesz test, a real in (0, 2). Defaults to 1.

Rothman_t

t parameter for the Rothman test, a real in (0, 1). Defaults to 1 / 3.

Sobolev_vk2

weights for the finite Sobolev test. A non-negative vector or matrix. Defaults to c(0, 0, 1).

Softmax_kappa

\kappa parameter for the Softmax test, a non-negative real. Defaults to 1.

Stereo_a

a parameter for the Stereo test, a real in [-1, 1]. Defaults to 0.

...

optional arguments to be passed to the r_H1 sampler or to foreach (for example, .export to export global variables or other functions to the foreach environment).

Details

It is possible to have a progress bar if unif_stat_MC is wrapped with progressr::with_progress or if progressr::handlers(global = TRUE) is invoked (once) by the user. See the examples below. The progress bar is updated with the number of finished chunks.

All the tests reject for large values of the test statistic (max_gap = TRUE is assumed for the Range test), so the critical values for the significance levels alpha correspond to the alpha-upper quantiles of the null distribution of the test statistic.

The Monte Carlo simulation for the CCF09 test is made conditionally on the choice of CCF09_dirs. That is, all the Monte Carlo statistics share the same random directions.

Except for CCF09_dirs, K_CCF09, and CJ12_reg, all the test-specific parameters are vectorized.

Value

A list with the following entries:

  • crit_val_MC: a data frame of size c(length(alpha), length(type)), with column names given by type and rows corresponding to the significance levels alpha, that contains the estimated critical values of the tests.

  • power_MC: a data frame of size c(nrow(crit_val), length(type)), with column names given by type and rows corresponding to the significance levels of crit_val, that contains the empirical powers of the tests. NA if crit_val = NULL.

  • stats_MC: a data frame of size c(M, length(type)), with column names given by type, that contains the Monte Carlo statistics.

Examples

## Critical values

# Single statistic, specific alpha
cir <- unif_stat_MC(n = 10, M = 1e2, type = "Ajne", p = 2, alpha = 0.15)
summary(cir$stats_MC)
cir$crit_val_MC

# All circular statistics
cir <- unif_stat_MC(n = 10, M = 1e2, p = 2)
head(cir$stats_MC)
cir$crit_val_MC

# All spherical statistics
sph <- unif_stat_MC(n = 10, M = 1e2, p = 3)
head(sph$stats_MC)
sph$crit_val_MC

## Using a progress bar

# Define a progress bar
require(progress)
require(progressr)
handlers(handler_progress(
  format = paste("(:spin) [:bar] :percent Iter: :current/:total Rate:",
                 ":tick_rate iter/sec ETA: :eta Elapsed: :elapsedfull"),
  clear = FALSE))

# Call unif_stat_MC() within with_progress()
with_progress(unif_stat_MC(n = 10, M = 1e2, p = 3, chunks = 10))

# With several cores
with_progress(unif_stat_MC(n = 10, M = 1e2, p = 3, chunks = 10, cores = 2))

# Instead of using with_progress() each time, it is more practical to run
# handlers(global = TRUE)
# once to activate progress bars in your R session

## Power computation

# Single statistic
cir_pow <- unif_stat_MC(n = 10, M = 1e2, type = "Ajne", p = 2,
                        crit_val = cir$crit_val_MC)
cir_pow$crit_val_MC
cir_pow$power_MC

# All circular statistics
cir_pow <- unif_stat_MC(n = 10, M = 1e2, p = 2, crit_val = cir$crit_val_MC)
cir_pow$crit_val_MC
cir_pow$power_MC

# All spherical statistics
sph_pow <- unif_stat_MC(n = 10, M = 1e2, p = 3, crit_val = sph$crit_val_MC)
sph_pow$crit_val_MC
sph_pow$power_MC

## Custom r_H1

# Circular
r_H1 <- function(n, p, M, l = 0.05) {

  stopifnot(p == 2)
  Theta_to_X(matrix(runif(n * M, 0, (2 - l) * pi), n, M))

}
dirs <- r_unif_sph(n = 5, p = 2, M = 1)[, , 1]
cir <- unif_stat_MC(n = 50, M = 1e2, p = 2, CCF09_dirs = dirs)
cir_pow <- unif_stat_MC(n = 50, M = 1e2, p = 2, r_H1 = r_H1, l = 0.10,
                        crit_val = cir$crit_val_MC, CCF09_dirs = dirs)
cir_pow$crit_val_MC
cir_pow$power_MC

# Spherical
r_H1 <- function(n, p, M, l = 0.5) {

  samp <- array(dim = c(n, p, M))
  for (j in 1:M) {

    samp[, , j] <- mvtnorm::rmvnorm(n = n, mean = c(l, rep(0, p - 1)),
                                    sigma = diag(rep(1, p)))
    samp[, , j] <- samp[, , j] / sqrt(rowSums(samp[, , j]^2))

  }
  return(samp)

}
dirs <- r_unif_sph(n = 5, p = 3, M = 1)[, , 1]
sph <- unif_stat_MC(n = 50, M = 1e2, p = 3, CCF09_dirs = dirs)
sph_pow <- unif_stat_MC(n = 50, M = 1e2, p = 3, r_H1 = r_H1, l = 0.5,
                       crit_val = sph$crit_val_MC, CCF09_dirs = dirs)
sph_pow$power_MC

## Pre-built r_H1

# Circular
dirs <- r_unif_sph(n = 5, p = 2, M = 1)[, , 1]
cir_pow <- unif_stat_MC(n = 50, M = 1e2, p = 2, r_H1 = r_alt, alt = "vMF",
                        kappa = 1, crit_val = cir$crit_val_MC,
                        CCF09_dirs = dirs)
cir_pow$power_MC

# Spherical
dirs <- r_unif_sph(n = 5, p = 3, M = 1)[, , 1]
sph_pow <- unif_stat_MC(n = 50, M = 1e2, p = 3, r_H1 = r_alt, alt = "vMF",
                        kappa = 1, crit_val = sph$crit_val_MC,
                        CCF09_dirs = dirs)
sph_pow$power_MC


sphunif documentation built on May 29, 2024, 4:19 a.m.