eigvaldisp: Statistics of Eigenvalue Dispersion Indices

Documented in GenCov

##### GenCov #####
#' Generate covariance matrix of known structure
#'
#' This function generates a (population) covariance/correlation matrix with
#' known eigenstructure, which is specified by the number of variables \code{p},
#' \code{shape}, and (when applicable) relative eigenvalue variance \code{VR}.
#' Eigenvector matrix can be the identity matrix, a random orthonormal matrix,
#' or constructed to yield a correlation matrix
#' (when \code{evectors = "MAP"} or \code{"Givens"}).
#' Alternatively, eigenvalues \code{evalues} and eigenvectors \code{evectors}
#' can directly be specified.
#'
#' Either \code{p} or \code{evalues} should be specified.
#' A straightforward use is to provide a numeric vector of eigenvalues
#' with the argument \code{evalues}. If this argument is missing,
#' then eigenvalues are generated according to \code{p},
#' \code{VR}, \code{shape}, and \code{q}.
#' When \code{shape = "q-large"}, the first \eqn{q} eigenvalues are set
#' equally large, with the rest \eqn{p - q} equally small.
#' Specifically, the larger ones are
#' \eqn{1 + \sqrt{V_{\mathrm{rel}}} \cdot \sqrt{(p - 1)(p - q) / q}}{1 + \sqrt{Vrel} * \sqrt((p - 1) * (p - q) / q)} and the smaller ones are
#' \eqn{1 - \sqrt{V_{\mathrm{rel}}} \cdot \sqrt{q (p - 1) / (p - q)}}{1 - \sqrt(VR) * \sqrt(q * (p - 1) / (p - q))}.
#' The combination of \code{p}, \code{q}, and \code{VR} should be such that
#' all resultant eigenvalues are nonnegative; an error is returned otherwise.
#' Alternatively, gradually decreasing eigenvalues can be generated with
#' \code{shape = "exponentially_decreasing"}, \code{"linearly_decreasing"},
#' or \code{"quadratically_decreasing"}.
#' For the exponential decrease (e.g., Kirkpatrick 2009), eigenvalues decrease
#' in magnitude at a constant rate \code{r}, which is either user-specified
#' (default \code{0.5}) or numerically optimized to yield desired \code{VR}
#' when provided.
#' For the other options, see Watanabe (2022) for details.
#'
#' Unless \code{evalues} is specified, the trace of the resultant matrix
#' equals to \eqn{p} by default. The argument \code{scale.} can be used to
#' **force** scale the overall matrix, so that the trace equals this value.
#' This scaling is applied just before the outcome is returned, so
#' will not preserve absolute magnitudes of \code{evalues} even when
#' this is specified (relative magnitudes are preserved).
#'
#' A correlation matrix with known eigenvalues can be generated by
#' choosing \code{evectors = "MAP"} or \code{"Givens"}.
#' These will construct an appropriate eigenvector matrix so that the outcome
#' is a correlation matrix (Davies and Higham, 2000; Waller, 2020).
#' User-specified eigenvalues are scaled to sum to \eqn{p} in these cases.
#' Note that, when \code{scale.} is specified as well, the outcome will be
#' a matrix proportional to a correlation matrix
#' (i.e., not a correlation matrix unless \code{scale. == p}).
#'
#' The original algorithm of Davies and Higham (2020) involves a random choice
#' between two possible rotation angles, but this randomness is omitted
#' (the angle with the larger tangent is always chosen) in
#' the present implementation in favor of speed and reproducibility.
#' As such, the present implementation of \code{"Givens"} slightly deviates
#' from the original algorithm.
#'
#' ### Copyright notice
#' The \code{"MAP"} algorithm originally derived
#' from \code{rMAP()} of the package
#' \code{fungible} version 1.99 (Waller, 2021), which is under GPL (>= 2).
#' (Strictly speaking, this was first adopted from the supplemental material
#' of Waller \[2020\], which is licensed under CC-BY-4.0.)
#' The adopted block is indicated in the source code.
#' The \code{"Givens"} algorithm used to be adopted from
#' \code{fungible::rGivens()} in an earlier version of this package,
#' but has essentially been rewritten by the present author
#' based on Davies and Higham (2000) with modifications.
#'
#' @param p
#'   Number of variables.
#' @param VR
#'   The relative eigenvalue variance \eqn{V_\mathrm{rel}}{Vrel} of the generated matrix;
#'   used only when \code{shape = "q-large"} or \code{"elongated"}.
#' @param scale.
#'   When provided, the outcome is proportionally scaled
#'   to have the trace equal to this value.
#' @param shape
#'   Option to specify the \dQuote{shape} of the covariance matrix;
#'   ignored when \code{evalues} is provided.  Options allowed are:
#'   \describe{
#'     \item{\code{"q-large"}}{The \eqn{q} leading eigenvalues equally large,
#'                      the rest \eqn{p - q} equally small}
#'     \item{\code{"elongate"}}{Special case of \code{"q-large"} with \code{q = 1}}
#'     \item{\code{"exponentially_decreasing"}}{Eigenvalues exponentially decreasing
#'                                       either at given rate \code{r} (default) or
#'                                       to yield desired \code{VR}}
#'     \item{\code{"linearly_decreasing"}}{Eigenvalues linearly decreasing
#'                                  in magnitude}
#'     \item{\code{"quadratically_decreasing"}}{Eigenvalues quadratically decreasing}
#'   }
#'   User-specified \code{VR} is ignored in the last two options.
#'   See Details.
#' @param q
#'   The number of large eigenvalues. To be used with \code{shape = "q-large"}.
#' @param r
#'   The rate of exponential decreasing of eigenvalues; (\eqn{i + 1})-th
#'   eigenvalue is \eqn{r} times \eqn{i}th eigenvalue.
#'   To be used with \code{shape = "exponentially_decreasing"}.
#'   Ignored when \code{VR} is provided.
#' @param evalues
#'   Numeric vector of eigenvalues. If missing, eigenvalues are automatically
#'   generated according to \code{p}, \code{VR}, \code{shape}, and \code{q}.
#' @param evectors
#'   Option to specify how eigenvectors are generated:
#'   \describe{
#'     \item{\code{"plain"}}{Eigenvectors are taken as the identity matrix of
#'       order \eqn{p}.}
#'     \item{\code{"random"}}{Random eigenvectors are generated by QR decomposition of
#'       a square matrix of standard normal variates; i.e., sampling from
#'       the Haar invariant distribution (ignoring the signs of diagonals).}
#'     \item{\code{"MAP"}}{Eigenvectors are constructed so that the resultant matrix
#'       is a correlation matrix with the specified eigenvalues
#'       (scaled so that the trace equals \eqn{p}); based on the iterative
#'       algorithm of Waller (2020). Modified from \code{fungible::rMAP()}.}
#'     \item{\code{"Givens"}}{Similar to \code{"MAP"}, but with an algorithm adopted
#'       from Davies and Higham (2000) with modifications;
#'       this is much faster and is guaranteed to converge.}
#'   }
#'   Alternatively, a matrix of eigenvectors can be provided to specify
#'   full or partial eigenvectors for the outcome matrix.
#'   These are converted to be orthonormal;
#'   randomly generated vectors are appended as necessary.
#'   Remember the ambiguity of eigenvectors' polarities (sign-swapping).
#' @param random_rotation
#'   Only relevant when \code{evectors = "Givens"};
#'   logical to specify whether the initial diagonal matrix of eigenvalues
#'   is randomly rotated or not. For reproducibility in simulation studies.
#' @param which_rotate
#'   When \code{evectors = "Givens"}, specify which (or in what order)
#'   columns/rows are subject to Givens rotation:
#'   \describe{
#'     \item{\code{"random"}}{Columns/rows are chosen in random order.
#'       This conforms with the original algorithm of Davies and Higham (2000).}
#'     \item{\code{"head"}/\code{"tail"}}{The first/last ones among candidates are rotated.
#'       For reproducibility in simulation studies.}
#'     \item{\code{"minmax"}}{Columns/rows corresponding to a minimum and maximum
#'       of the diagonal elements are rotated. This is slightly faster but
#'       presumably produces strong, isolated correlations.}
#'   }
#' @param tol
#'   Only relevant when \code{evectors = "MAP"}, or
#'   \code{shape = "exponentially_decreasing"} and \code{VR} is provided;
#'   numeric to specify the tolerance against which convergence of
#'   eigenvectors or \code{VR}, respectively, is evaluated.
#' @param maxiter
#'   Only relevant when \code{evectors = "MAP"}; maximum number of iterations.
#' @param seed
#'   Random seed passed to \code{set.seed(seed)}
#'   before random eigenvectors are generated.
#' @param check
#'   Logical, whether to check and ensure all eigenvalues of the
#'   resultant matrix are nonnegative. By default, automatically turned on
#'   when any of the specified/generated eigenvalues are equal to 0.
#'
#' @return A \code{p * p} numeric matrix with the specified eigenstructure
#'
#' @references
#' Davies, P. I. and Higham, N. J. (2000) Numerically stable generation of
#'  correlation matrices and their factors.
#'  *BIT Numerical Mathematics* **40**, 640--651.
#'  \doi{10.1023/A:1022384216930}.
#'
#' Kirkpatrick, M. (2009) Patterns of quantitative genetic variation in
#'  multiple dimensions. *Genetica*, **136**, 271--284.
#'  \doi{10.1007/s10709-008-9302-6}.
#'
#' Waller, N. G. (2020) Generating correlation matrices with specified
#'  eigenvalues using the method of alternating projections.
#'  *American Statistician* **74**, 21--28.
#'  \doi{10.1080/00031305.2017.1401960}.
#'
#' Waller, N. G. (2021) fungible: psychometric functions from the Waller Lab.
#'  R package version 1.99.
#'  [https://CRAN.R-project.org/package=fungible](https://CRAN.R-project.org/package=fungible).
#'
#' Watanabe, J. (2022) Statistics of eigenvalue dispersion indices:
#'  quantifying the magnitude of phenotypic integration. *Evolution*,
#'  **76**, 4--28. \doi{10.1111/evo.14382}.
#'
#' @importFrom stats rnorm runif optimize
#'
#' @export
#'
#' @examples
#' # Covariance matrix with no covariance (identity matrix)
#' GenCov(4, 0)
#'
#' # Relative eigenvalue variance = 0.25, with 1 and 2 large eigenvalues
#' GenCov(4, 0.5 ^ 2)
#' GenCov(4, 0.5 ^ 2, q = 2)
#'
#' # Relative eigenvalue variance = 0.25, with 1 and 2 large eigenvalues
#' GenCov(evalues = c(3, 2, 1, 1))
#'
#' # Exponentially decreasing eigenvalues with rate r (default 0.5)
#' GenCov(4, shape = "exponentially_decreasing", r = 0.5)
#'
#' # Exponentially decreasing eigenvalues with desired VR; r is ignored
#' GenCov(4, VR = 0.5 ^ 2, shape = "exponentially_decreasing")
#'
#' # Linearly decreasing eigenvalues; VR, q, and evalues are ignored
#' GenCov(4, shape = "linearly_decreasing")
#'
#' # With random orthonormal eigenvector matrix
#' GenCov(4, 0.5 ^ 2, evectors = "random")
#'
#' # With user-specified (partial) eigenvector matrix
#' GenCov(4, 0.5 ^ 2, evectors = rep.int(1 / sqrt(4), 4))
#'
#' # Correlation matrix with two different algorithms
#' GenCov(4, 0.5 ^ 2, q = 2, evectors = "MAP")
#' GenCov(4, 0.5 ^ 2, q = 2, evectors = "Givens")
#'
#' # Note that, by default, results of the above algorithms usually involve
#' # random fluctuation. To eliminate this, use either:
#' GenCov(4, 0.5 ^ 2, q = 2, evectors = "Givens", random_rotation = FALSE)
#'                               # Optionally with which_rotate = "head"/"tail"
#' GenCov(4, 0.5 ^ 2, q = 2, evectors = "Givens", seed = 1234)
#'                               # Arbitrary random seed
#' # The first way is not available for "MAP", but the second one is.
#'
GenCov <- function(p = length(evalues), VR = 0.5, scale. = NULL,
                   evalues = "auto", evectors = "plain",
                   shape = c("q-large", "elongate", "exponentially_decreasing",
                             "linearly_decreasing", "quadratically_decreasing"),
                   q = 1, r = 0.5,
                   random_rotation = TRUE,
                   which_rotate = c("random", "head", "tail", "minmax"),
                   tol = 1e-12, maxiter = 5000L, seed = NULL,
                   check = any(evalues == 0)) {
    if(missing(p) && missing(evalues)) stop("Provide either p or evalues")
    shape <- match.arg(shape)
    if(missing(which_rotate) && !random_rotation) {
        which_rotate <- "head"
    } else {
        which_rotate <- match.arg(which_rotate)
    }
    if(!is.numeric(evectors)) {
        evectors <- match.arg(evectors, c("plain", "random", "MAP", "Givens"))
    }
    ## Function to scale columns of matrix into unit length
    scale_unit <- function(X) {
       # apply(X, 2, function(x) x / drop(sqrt(crossprod(x))))
        sweep(X, 2, sqrt(diag(crossprod(X))), "/")
    }
    ## Function to cbind two matrices X and Y after ensuring columns of Y are
    ## orthogonal to those of X (which are assumed to be of unit length)
    cbind_orth <- function(X, Y) {
        Y <- crossprod(diag(p) - tcrossprod(X), Y)
        Y <- scale_unit(Y)
        cbind(X, Y)
    }
    ## Function to construct Cov matrix from a matrix of eigenvectors and
    ## a vector of eigenvalues
    S_fromUL <- function(evec = evec, evalues = evalues) {
        te <- t(evec)
        crossprod(te * evalues, te)
    }
    ## Function to get eigenvalues when shape = "exponentially_decreasing"
    get_evalues_exd <- function(r, p = p) {
        p * (1 - r) / (1 - r ^ p) * (r ^ seq.int(0, p - 1))
    }
    ## Objective function for optimizing r to yield desired VR
    ## when shape == "exponentially_decreasing" and VR is provided
    objfun_r_evalues <- function(r, VR = VR, p = p) {
        evalues <- get_evalues_exd(r, p)
        (VR - VE(L = evalues)$VR) ^ 2
    }
    ## How to choose rows and columns to rotate, when evectors = "Givens"
    sample_i <- switch(which_rotate,
                       minmax = which.min,
                       random = function(a) {
                           x <- which(a < 1)
                           x[ceiling(runif(1) * length(x))]
                           # x[sample.int(length(x), 1)]
                       },
                       head   = function(a) which(a < 1)[1],
                       tail   = function(a) {x <- which(a < 1); x[length(x)]})
    sample_j <- switch(which_rotate,
                       minmax = which.max,
                       random = function(a) {
                           x <- which(a > 1);
                           x[ceiling(runif(1) * length(x))]
                           # x[sample.int(length(x), 1)]
                       },
                       head   = function(a) which(a > 1)[1],
                       tail   = function(a) {x <- which(a > 1); x[length(x)]})
    # ## Modified sign() to return 1 for 0
    # signp <- function(x) {
    #     ans <- sign(x)
    #     ans[ans == 0] <- 1
    #     return(ans)
    # }
    ## Get appropriate Givens rotation matrix
    get_R <- function(aii, ajj, aij) {
        T <- (aij + sqrt(aij ^ 2 - (aii - 1) * (ajj - 1))) / (ajj - 1)
        C <- 1 / sqrt(1 + T ^ 2)
        S <- T * C
        # T <- atan((aij + sqrt(aij ^ 2 - (aii - 1) * (ajj - 1))) / (ajj - 1))
        # C <- cos(T)
        # S <- sin(T)
        # R <- matrix(c(C, -S, S, C), 2, 2)
        R <- matrix(c(C, S, -S, C), 2, 2)
    }
    ## Construct eigenvalues; by default, the eigenvalues sum to p
    ## [changed from Watanabe (2022), where they sum to p / sqrt(p - 1)]
    if(missing(evalues)) {
        evalues <- numeric(p)
        if(shape == "elongate") {
            if(!missing(q)) {
                warning("user-specified q is ignored for shape = \"elongate\"")
            }
            q <- 1
            shape <- "q-large"
        }
        if(shape == "q-large") {
            if(q > p) {
                stop("q should not exceed p")
            } else if(q == p) {
                evalues[1:p] <- 1
            } else {
                sqVR <- sqrt(VR)
                evalues[1:q] <- rep.int(1 + sqVR * sqrt((p - 1) * (p - q) / q),
                                        q) # / sqrt(p - 1)
                evalues[(q + 1):p] <- rep.int(1 - sqVR * sqrt(q * (p - 1) /
                                              (p - q)), p - q) # / sqrt(p - 1)
                if(any(evalues < 0)) {
                    stop("Negative eigenvalues. ",
                         "Ensure VR < (p - q) / ((p - 1) * q)")
                }
            }
        } else if (shape == "exponentially_decreasing") {
            if(!missing(VR)) {
                if(!missing(r)) {
                    warning("User-specified r was ignored as VR was provided")
                }
                r <- optimize(objfun_r_evalues, c(0, 1), VR = VR, p = p,
                              tol = tol)$minimum
            }
            evalues <- get_evalues_exd(r, p = p)
        } else {
            if(!missing(VR)) {
                warning("In the selected method, VR is fixed for a given p.\n",
                        "  User-defined VR was ignored.")
            }
            if(shape == "linearly_decreasing") {
                evalues <- 2 * (p + 1 - seq_len(p)) / (p + 1) # / sqrt(p - 1)
            } else if(shape == "quadratically_decreasing") {
                evalues <- (6 * (p + 1 - seq_len(p)) ^ 2) /
                           ((p + 1) * (2 * p + 1)) # / sqrt(p - 1)
            }
        }
    }
    if(any(evalues < 0)) stop("Provide a vector of nonnegative eigenvalues")
    ## For a correlation matrix, eigenvalues are scaled to sum to p
    if(evectors[1] == "MAP" || evectors[1] == "Givens") {
        sum.eval <- sum(evalues)
        if(!isTRUE(all.equal(sum.eval, p))) {
            evalues <- evalues * p / sum.eval
            if(missing(scale.)) warning("Eigenvalues were scaled to sum to p\n")
        }
    }
    ## Construct eigenvectors and correlation matrix for evectors = "Givens"
    if(evectors[1] == "Givens") {
        if(random_rotation) {
            if(!is.null(seed)) set.seed(seed)
            M <- matrix(rnorm(p ^ 2), p, p)
            evec <- qr.Q(qr(M))
        } else {
            evec <- diag(p)
        }
        A <- S_fromUL(evec, evalues)
        a <- diag(A)
        while(min(a) < 1 && max(a) > 1) {
            i <- sample_i(a)
            j <- sample_j(a)
            R <- get_R(a[i], a[j], A[i, j])
            # A[, c(i, j)] <- crossprod(A[c(i, j), ], R)
            # A[c(i, j), ] <- crossprod(R, A[c(i, j), ])
            A[c(i, j), ] <- tcrossprod(R, A[, c(i, j)])
            A[, c(i, j)] <- tcrossprod(A[, c(i, j)], R)
            a[i] <- A[i, i] <- 1
            a[j] <- A[j, j]
        }
        diag(A) <- 1
    } else {
        ## When evectors = "plain", eigenvector matrix is the identity matrix
        ## Otherwise (evectors = "random" or "MAP"), a random orthogonal matrix
        ## is used as (initial) eigenvector matrix
        if(evectors[1] == "plain") {
            evec <- diag(p)
        } else if(is.character(evectors)) {
            if(!is.null(seed)) set.seed(seed)
            M <- matrix(rnorm(p ^ 2), nrow = p, ncol = p)
            evec <- qr.Q(qr(M))
            ## When evectors = "MAP", do Waller's alternating projections
            if(evectors == "MAP") {
                ##########
                ## Start of adopted block: largely adopted from
                ## fungible::rMAP with modifications
                Sstar <- S_fromUL(evec, evalues)
                i <- 0L
                while(i <= maxiter) {
                    Su <- Sstar
                    diag(Su) <- 1
                    svd.Su <- svd(Su, nu = 0)
                    v <- svd.Su$v
                    d <- svd.Su$d
                    if(max(abs(evalues - d)) < tol) break
                    Sstar <- S_fromUL(v, evalues)
                    Sstar <- (Sstar + t(Sstar)) / 2
                    i <- i + 1L
                }
                if(i > maxiter) {
                    warning("\nEigenvectors for correlation matrix failed to ",
                            "failed to converge after ", maxiter, " iterations")
                }
                ## End of adopted block
                ##########
                evec <- v
            }
        } else {
            ## If evectors provided are numeric
            evectors <- as.matrix(evectors)
            if(nrow(evectors) != p) {
                stop("User-specified eigenvectors should ",
                     "be a matrix with p rows")
            }
            s <- ncol(evectors)
            cpev <- crossprod(evectors)
            ltcpev <- cpev[lower.tri(cpev)]
            ## evectors forced to be orthonormal
            ## Object evec is used as a correctly constructed eigenvectors
            if(!isTRUE(all.equal(ltcpev, rep.int(0, s * (s - 1) / 2)))) {
                warning("User-specified eigenvectors were ",
                        "converted to be orthonormal")
                evec <- evectors[, 1, drop = FALSE]
                evec <- scale_unit(evec)
                for(i in 2:s) {
                    y <- evectors[, i, drop = FALSE]
                    evec <- cbind_orth(evec, y)
                }
            } else if(!isTRUE(all.equal(diag(cpev), rep.int(1, s)))) {
                warning("User-specified eigenvectors were scaled ",
                        "to be of unit length")
                evec <- scale_unit(evectors)
            } else {
                evec <- evectors
            }
            if(s < p) {
                # ## Columns are added one by one, until ncol(evec) == p
                # while(ncol(evec) < p) {
                #     y <- matrix(rnorm(p), p, 1)
                #     evec <- cbind_orth(evec, y)
                # }
                ## Orthonormal columns generated by QR decomposition
                y <- matrix(rnorm(p * (p - s)), p)
                ev.Q <- qr.Q(qr(cbind(evec, y)))
                evec <- cbind(evec, ev.Q[, (s + 1):p])
            }
        }
        ## Construct A from evec and evalues
        A <- S_fromUL(evec, evalues)
    }
    ## Ensure the final output to be symmetric, and do check and scale.
    A <- (A + t(A)) / 2
    if(check) {
        svd.A <- svd(A, nu = 0)
        if(any(svd.A$d < 0)) {
            A <- with(svd.A, S_fromUL(v, pmax(d, 0)))
        }
    }
    if(!is.null(scale.)) A <- A / p * scale.
    return(A)
}
#   There are several equivalent options for S_fromUL:
# 1 "tcrossprod(evec, tcrossprod(evec, diag(evalues)))" is fast for p < 30
# 2 "te <- t(evec); crossprod(crossprod(diag(evalues), te), te)"
#   is always slower than the current one
# 3 "tcrossprod(evec, evec * unlist(lapply(1:p,
#                                  function(i) rep.int(evalues[i], p))))"
#   is clumsier but faster than
#   "tcrossprod(evec, evec * rep(evalues, each = p))" when p > 100;
#   "tcrossprod(evec, sweep(evec, 2, evalues, "*"))" is generally slower
#   than this. This is also the fastest when p > 1000
# 4 "te <- t(evec); crossprod(te * evalues, te)",
#   which is uniformly faster than, e.g.,
#   "tcrossprod(evec, t(t(evec) * evalues))",
#   is the fastest for 30 < p < 1000; slightly slower than the immediate above
#   beyond this, but only by ~15% up to p = 8192
# 5 "evec %*% diag(evalues) %*% t(evec)" is always slow
# These computational speeds were evaluated with Intel oneAPI MKL
# With the default reference BLAS, 3 is generally the fastest,
# followed by 5, 1, and 3.