SOAs: Creation of Stratum Orthogonal Arrays

Documented in fastSP

### Obtained from Hongquan Xu
### Modified by Ulrike Groemping
### general unmentioned changes: assignments from = to <-,
###                              opening { directly after ) for functions
###                              tab --> two blanks

## SP_Enumerator.R : Supplemental R codes and algorithms for
## Tian, Y. and Xu, H. (2023). Stratification Pattern Enumerator and its Applications.
## Journal of the Royal Statistical Society, Series B.
##
## Fast algorithms to compute space-filling (or stratification) pattern (SPattern).
##
## key functions: EDy, EDz, fastSP, fastSP.K
##    EDy and EDz: Stratification Pattern Enumerator using definition and Lemma 1.
##    fastSP and fastSP.K: fast algorithms for computing SPattern based on Theorems 3 and 4.
## uses ff (instead of full.fd)
##
## can and should fastSP and fastSP.K be combined into one function?
## why does one use soa.kernel and the other Rd.kernel?
##
## Date original: Feb/10/2023
## Date modified: Oct/27/2023 ff

#' Functions for fast calculation of stratification pattern
#' according to Tian and Xu 2023
#' @name fastSP
#' @export
#'
#' @param D design with number of levels a power of \code{s}
#' @param s prime or prime power on which \code{D} is based
#' @param maxwt integer number; maximum weight for which the pattern is to be calculated
#' @param y0 small number that drives accuracy. The default (\code{NULL}) uses 1/\code{s}
#' for maximum \code{K} and \code{y0}=0.1 for smaller values of \code{K}.
#' See the Details section for a discussion of this parameter.
#' @param K integer number of summands. Can also be \code{"max"} for indicating that
#' all summands are requested (Theorem 3 of Tian and Xu 2023+). In Theorem 4 of
#' Tian and Xu (2023+), larger \code{K} provide better accuracy.
#' If \code{maxwt=NULL}, the default (\code{NULL}) is that all summands
#' are requested (i.e., Theorem 3 of Tian and Xu 2023+). Otherwise,
#' the default is the maximum of \code{maxwt} and a default based
#' on \code{y0} according to a formula
#' by Tian and Xu (2023+) (yields smaller \code{K} for smaller \code{y0}).
#' @param tol tolerance for checking whether the imaginary part is zero
#'
#' @details
#' The function was modified from the code provided with
#' Tian and Xu (2023+).
#'
#' Per default (\code{maxwt=NULL} and \code{K=NULL}), or when the user chooses
#' \code{K="max"} in spite of specifying a value for \code{maxwt},
#' \code{fastSP} calculates the entire stratification pattern and uses all summands
#' of Tian and Xu's (2023+) Equation (4) of their Theorem 3,\cr
#' with one important modification: y_j is not chosen as the j-th power of
#' the m*el-th complex rooth of the unity but as that complex number divided
#' by \code{s}, as this appears to yield numerically stabler results.
#' (It neutralizes the large powers of \code{s} that arise by multiplying
#' the weighted similarities of Lemma 1 for obtaining the summands in E(D,y)
#' in formula (3)).
#' Limited experimentation with different values of \code{s} showed
#' that this approach did indeed yield reasonably stable results, for example
#' for the SOA(64,20,8,3) for which the version with unmodified powers of
#' roots of the unity ran into problems.\cr
#' It is straightforward to verify that Theorem 3 remains valid for
#' this modified choice of y_j.
#'
#' It is possible and often advisable to calculate only a smaller number of
#' entries, for saving resources and also because the later entries are less
#' interesting and less accurate. If the argument \code{maxwt} is specified
#' and \code{K} is left unspecified (\code{K=NULL}), \code{K} is calculated as
#' the maximum of Tian and Xu's (2023+) proposed default for their
#' approximation formula (6), depending on \code{y0}, which is set to 0.1,
#' if also unspecified. Tian and Xu (2023+) recommended to use \code{y0} values
#' between 0.001 and 0.1, when using formula (6) of Theorem 4.
#'
#' For obtaining the original behavior of Tian and Xu's (2023+) implementation
#' of their Theorem 3 (not desirable for larger situations),
#' choose \code{y0=1}. Note that \code{y0=1} with a specified \code{maxwt} and
#' \code{K=NULL} yields an error, because the default formula for \code{K}
#' does not work for \code{y0=1}.
#'
#' Also consider the Note section regarding numerical considerations.
#'
#' @returns an object of class \code{fastSP}, with attributes \code{call},
#' \code{K}, \code{y0}, and possibly \code{message}.
#' The object itself is a stratification pattern or the first
#' \code{maxwt} elements of the stratification pattern (default: all elements).\cr
#' If \code{K} is less than the maximum length of the stratification pattern
#' (\code{Kmax} element of attribute \code{K})
#' the returned values are approximations (more accurate for larger \code{K}).
#' If the object has a \code{message} attribute, this attribute indicates
#' which positions of the pattern must be considered as problematic because
#' the imaginary part was non-zero.
#'
#' @note Even the exact pattern (obtained with maximum \code{K}) must be considered
#' with caution because of potential numerical problems. Often, the creation process
#' of a GSOA implies that the first few elements are zeroes. If this is the case, the
#' degree of inaccuracy may be assessed from these elements. Furthermore, warnings of
#' non-zero imaginary parts indicate similar problems. If unsure about the accuracy, it
#' may also be an option to use function \code{\link{Spattern}} with a small \code{maxwt}
#' argument (for resource reasons) in order to obtain exact values for the first very few
#' entries of the stratification pattern.aus
#'
#' @references
#' For full detail, see \code{\link{SOAs-package}}.
#'
#' Tian, Y. and Xu, H. (2023+)
#'
#' @examples
#' ## SOA(32,9,8,3) from Shi and Tang (2020)
#' soa32x9 <- t(matrix(c(7,3,6,2,7,3,6,2,4,0,5,1,4,0,5,1,5,1,4,0,5,1,4,0,6,2,7,3,6,2,7,3,
#'                   7,7,2,2,5,5,0,0,6,6,3,3,4,4,1,1,5,5,0,0,7,7,2,2,4,4,1,1,6,6,3,3,
#'                   7,5,6,4,3,1,2,0,4,6,5,7,0,2,1,3,7,5,6,4,3,1,2,0,4,6,5,7,0,2,1,3,
#'                   7,7,4,4,5,5,6,6,2,2,1,1,0,0,3,3,7,7,4,4,5,5,6,6,2,2,1,1,0,0,3,3,
#'                   7,5,6,4,5,7,4,6,6,4,7,5,4,6,5,7,3,1,2,0,1,3,0,2,2,0,3,1,0,2,1,3,
#'                   7,1,0,6,3,5,4,2,4,2,3,5,0,6,7,1,5,3,2,4,1,7,6,0,6,0,1,7,2,4,5,3,
#'                   7,1,2,4,7,1,2,4,2,4,7,1,2,4,7,1,5,3,0,6,5,3,0,6,0,6,5,3,0,6,5,3,
#'                   7,3,2,6,5,1,0,4,4,0,1,5,6,2,3,7,3,7,6,2,1,5,4,0,0,4,5,1,2,6,7,3,
#'                   7,1,4,2,3,5,0,6,2,4,1,7,6,0,5,3,3,5,0,6,7,1,4,2,6,0,5,3,2,4,1,7
#' ), nrow=9, byrow = TRUE))
#'
#' ## complete pattern according to theorem 3
#' ## (y0=1/2 or y0=1 does not make a difference for this small example)
#' a <- fastSP(soa32x9, 2); round(a,7)
#' a <- fastSP(soa32x9, 2, y0=1); round(a,7)
#'
#' ## only the first five positions (K=9 calculated and used based on y0=0.1)
#' ## not very accurate
#' a <- fastSP(soa32x9, 2, maxwt=5); round(a,7)
#' ## more accurate (K=5 used, based on y0=0.01)
#' a <- fastSP(soa32x9, 2, maxwt=5, y=0.01); round(a,7)
#' ## even more accurate (K=9 used with y0=0.01)
#' a <- fastSP(soa32x9, 2, maxwt=5, y=0.01, K=9); round(a,7)
#'

#' # example code
#'
fastSP <- function(D, s, maxwt=NULL, K=NULL, y0=NULL, tol=0.00001){
  # compute SPattern using EDy and complex numbers by definition
  # see Theorem 3 of Tian and Xu (2023).
  # D is a GSOA with s^el levels
  # s is the base and el is the length
  # maxwt is the number of components of SPattern to be computed
  # when el=1, it returns the usual GWLP as in Xu and Wu (2001)
   mycall <- sys.call()
   x <- as.matrix(D) # treat a vector or data.frame as a matrix
   m <- ncol(x)
   el <- round(log(max(x)+1, base=s))
   if (is.null(K) && is.null(maxwt)) K <- m*el # length of spattern
   if (!is.null(K)) if (K=="max") K <- m*el
   if (is.null(maxwt)) maxwt <- K
          ## if both K and maxwt were NULL, they are both m*el now
          ## maxwt is non-NULL now, K is numeric or NULL
   if (is.null(K) && is.null(y0)) y0 <- 0.1
   ## now the null y0 with known K
   if (is.null(y0)) y0 <- ifelse(K==m*el, 1/s, 0.1)
   stopifnot(maxwt>=1 && maxwt <= m*el)
   stopifnot(maxwt%%1==0)
   stopifnot(y0<=1 && y0>0)
   if (is.null(K)){
     if(y0==1) stop("y0=1 is not possible with non-NULL maxwt and K=NULL")
     K <- min(m*el,
          max(maxwt, ceiling((log(0.01) - log(s^(el*m)/nrow(x)-1))/log(y0))))
   }
   ## introduced default K ## UG 10 Nov 2023
   stopifnot(K >= maxwt)
   stopifnot(K%%1==0)
   stopifnot(K <= m*el)
#   omegaK <- as.complex(exp(1i*2*pi/K))  # omegaK is a Kth root of 1,
#   omegaK^K=1
   z <- as.complex(exp(1i*2*(1:K)*pi/K))
   bz <- Ez <- complex(K)
   ## introduced y0, in analogy to fastSP.K    # UG 10 Nov 2023
   ## in order to improve accuracy of calculations
   for(j in 1:K) Ez[j] <- EDy(x, s, z[j]*y0) - 1
   for(i in 1:maxwt){
       ij <- (i*(1:K)) %% K
       bz[i] <- sum(z[K-ij] * Ez)/K/exp(i*log(y0))
   }
   msg <- NULL
   if (max(abs(Im(bz))) > tol){
       ## obtain minimum position for which deviation is larger than tol
       poscrit <- min(which(abs(Im(bz)) > tol))
       if (poscrit <= maxwt){
         msg <- paste("Imaginary part exceeds tolerance for positions >=", poscrit)
       message(msg)
       }
       }
   aus <- Re(bz)[1:maxwt]  # Im(bz) should be zero
   names(aus) <- 1:maxwt
   attr(aus, "call") <- mycall
   attr(aus, "K") <- c(K=K, Kmax=m*el)
   attr(aus, "y0") <- y0
   attr(aus, "message") <- msg
   class(aus) <- "fastSP"
   aus
}