GenCov: Generate covariance matrix of known structure

View source: R/GenCov.R

GenCovR Documentation

Generate covariance matrix of known structure

Description

This function generates a (population) covariance/correlation matrix with known eigenstructure, which is specified by the number of variables p, shape, and (when applicable) relative eigenvalue variance VR. Eigenvector matrix can be the identity matrix, a random orthonormal matrix, or constructed to yield a correlation matrix (when evectors = "MAP" or "Givens"). Alternatively, eigenvalues evalues and eigenvectors evectors can directly be specified.

Usage

GenCov(
  p = length(evalues),
  VR = 0.5,
  scale. = NULL,
  evalues = "auto",
  evectors = "plain",
  shape = c("q-large", "elongate", "exponentially_decreasing", "linearly_decreasing",
    "quadratically_decreasing"),
  q = 1,
  r = 0.5,
  random_rotation = TRUE,
  which_rotate = c("random", "head", "tail", "minmax"),
  tol = 1e-12,
  maxiter = 5000L,
  seed = NULL,
  check = any(evalues == 0)
)

Arguments

p

Number of variables.

VR

The relative eigenvalue variance V_\mathrm{rel} of the generated matrix; used only when shape = "q-large" or "elongated".

scale.

When provided, the outcome is proportionally scaled to have the trace equal to this value.

evalues

Numeric vector of eigenvalues. If missing, eigenvalues are automatically generated according to p, VR, shape, and q.

evectors

Option to specify how eigenvectors are generated:

"plain"

Eigenvectors are taken as the identity matrix of order p.

"random"

Random eigenvectors are generated by QR decomposition of a square matrix of standard normal variates; i.e., sampling from the Haar invariant distribution (ignoring the signs of diagonals).

"MAP"

Eigenvectors are constructed so that the resultant matrix is a correlation matrix with the specified eigenvalues (scaled so that the trace equals p); based on the iterative algorithm of Waller (2020). Modified from fungible::rMAP().

"Givens"

Similar to "MAP", but with an algorithm adopted from Davies and Higham (2000) with modifications; this is much faster and is guaranteed to converge.

Alternatively, a matrix of eigenvectors can be provided to specify full or partial eigenvectors for the outcome matrix. These are converted to be orthonormal; randomly generated vectors are appended as necessary. Remember the ambiguity of eigenvectors' polarities (sign-swapping).

shape

Option to specify the “shape” of the covariance matrix; ignored when evalues is provided. Options allowed are:

"q-large"

The q leading eigenvalues equally large, the rest p - q equally small

"elongate"

Special case of "q-large" with q = 1

"exponentially_decreasing"

Eigenvalues exponentially decreasing either at given rate r (default) or to yield desired VR

"linearly_decreasing"

Eigenvalues linearly decreasing in magnitude

"quadratically_decreasing"

Eigenvalues quadratically decreasing

User-specified VR is ignored in the last two options. See Details.

q

The number of large eigenvalues. To be used with shape = "q-large".

r

The rate of exponential decreasing of eigenvalues; (i + 1)-th eigenvalue is r times ith eigenvalue. To be used with shape = "exponentially_decreasing". Ignored when VR is provided.

random_rotation

Only relevant when evectors = "Givens"; logical to specify whether the initial diagonal matrix of eigenvalues is randomly rotated or not. For reproducibility in simulation studies.

which_rotate

When evectors = "Givens", specify which (or in what order) columns/rows are subject to Givens rotation:

"random"

Columns/rows are chosen in random order. This conforms with the original algorithm of Davies and Higham (2000).

"head"/"tail"

The first/last ones among candidates are rotated. For reproducibility in simulation studies.

"minmax"

Columns/rows corresponding to a minimum and maximum of the diagonal elements are rotated. This is slightly faster but presumably produces strong, isolated correlations.

tol

Only relevant when evectors = "MAP", or shape = "exponentially_decreasing" and VR is provided; numeric to specify the tolerance against which convergence of eigenvectors or VR, respectively, is evaluated.

maxiter

Only relevant when evectors = "MAP"; maximum number of iterations.

seed

Random seed passed to set.seed(seed) before random eigenvectors are generated.

check

Logical, whether to check and ensure all eigenvalues of the resultant matrix are nonnegative. By default, automatically turned on when any of the specified/generated eigenvalues are equal to 0.

Details

Either p or evalues should be specified. A straightforward use is to provide a numeric vector of eigenvalues with the argument evalues. If this argument is missing, then eigenvalues are generated according to p, VR, shape, and q. When shape = "q-large", the first q eigenvalues are set equally large, with the rest p - q equally small. Specifically, the larger ones are 1 + \sqrt{V_{\mathrm{rel}}} \cdot \sqrt{(p - 1)(p - q) / q} and the smaller ones are 1 - \sqrt{V_{\mathrm{rel}}} \cdot \sqrt{q (p - 1) / (p - q)}. The combination of p, q, and VR should be such that all resultant eigenvalues are nonnegative; an error is returned otherwise. Alternatively, gradually decreasing eigenvalues can be generated with shape = "exponentially_decreasing", "linearly_decreasing", or "quadratically_decreasing". For the exponential decrease (e.g., Kirkpatrick 2009), eigenvalues decrease in magnitude at a constant rate r, which is either user-specified (default 0.5) or numerically optimized to yield desired VR when provided. For the other options, see Watanabe (2022) for details.

Unless evalues is specified, the trace of the resultant matrix equals to p by default. The argument scale. can be used to force scale the overall matrix, so that the trace equals this value. This scaling is applied just before the outcome is returned, so will not preserve absolute magnitudes of evalues even when this is specified (relative magnitudes are preserved).

A correlation matrix with known eigenvalues can be generated by choosing evectors = "MAP" or "Givens". These will construct an appropriate eigenvector matrix so that the outcome is a correlation matrix (Davies and Higham, 2000; Waller, 2020). User-specified eigenvalues are scaled to sum to p in these cases. Note that, when scale. is specified as well, the outcome will be a matrix proportional to a correlation matrix (i.e., not a correlation matrix unless scale. == p).

The original algorithm of Davies and Higham (2020) involves a random choice between two possible rotation angles, but this randomness is omitted (the angle with the larger tangent is always chosen) in the present implementation in favor of speed and reproducibility. As such, the present implementation of "Givens" slightly deviates from the original algorithm.

Copyright notice

The "MAP" algorithm originally derived from rMAP() of the package fungible version 1.99 (Waller, 2021), which is under GPL (>= 2). (Strictly speaking, this was first adopted from the supplemental material of Waller [2020], which is licensed under CC-BY-4.0.) The adopted block is indicated in the source code. The "Givens" algorithm used to be adopted from fungible::rGivens() in an earlier version of this package, but has essentially been rewritten by the present author based on Davies and Higham (2000) with modifications.

Value

A p * p numeric matrix with the specified eigenstructure

References

Davies, P. I. and Higham, N. J. (2000) Numerically stable generation of correlation matrices and their factors. BIT Numerical Mathematics 40, 640–651. \Sexpr[results=rd]{tools:::Rd_expr_doi("10.1023/A:1022384216930")}.

Kirkpatrick, M. (2009) Patterns of quantitative genetic variation in multiple dimensions. Genetica, 136, 271–284. \Sexpr[results=rd]{tools:::Rd_expr_doi("10.1007/s10709-008-9302-6")}.

Waller, N. G. (2020) Generating correlation matrices with specified eigenvalues using the method of alternating projections. American Statistician 74, 21–28. \Sexpr[results=rd]{tools:::Rd_expr_doi("10.1080/00031305.2017.1401960")}.

Waller, N. G. (2021) fungible: psychometric functions from the Waller Lab. R package version 1.99. https://CRAN.R-project.org/package=fungible.

Watanabe, J. (2022) Statistics of eigenvalue dispersion indices: quantifying the magnitude of phenotypic integration. Evolution, 76, 4–28. \Sexpr[results=rd]{tools:::Rd_expr_doi("10.1111/evo.14382")}.

Examples

# Covariance matrix with no covariance (identity matrix)
GenCov(4, 0)

# Relative eigenvalue variance = 0.25, with 1 and 2 large eigenvalues
GenCov(4, 0.5 ^ 2)
GenCov(4, 0.5 ^ 2, q = 2)

# Relative eigenvalue variance = 0.25, with 1 and 2 large eigenvalues
GenCov(evalues = c(3, 2, 1, 1))

# Exponentially decreasing eigenvalues with rate r (default 0.5)
GenCov(4, shape = "exponentially_decreasing", r = 0.5)

# Exponentially decreasing eigenvalues with desired VR; r is ignored
GenCov(4, VR = 0.5 ^ 2, shape = "exponentially_decreasing")

# Linearly decreasing eigenvalues; VR, q, and evalues are ignored
GenCov(4, shape = "linearly_decreasing")

# With random orthonormal eigenvector matrix
GenCov(4, 0.5 ^ 2, evectors = "random")

# With user-specified (partial) eigenvector matrix
GenCov(4, 0.5 ^ 2, evectors = rep.int(1 / sqrt(4), 4))

# Correlation matrix with two different algorithms
GenCov(4, 0.5 ^ 2, q = 2, evectors = "MAP")
GenCov(4, 0.5 ^ 2, q = 2, evectors = "Givens")

# Note that, by default, results of the above algorithms usually involve
# random fluctuation. To eliminate this, use either:
GenCov(4, 0.5 ^ 2, q = 2, evectors = "Givens", random_rotation = FALSE)
                              # Optionally with which_rotate = "head"/"tail"
GenCov(4, 0.5 ^ 2, q = 2, evectors = "Givens", seed = 1234)
                              # Arbitrary random seed
# The first way is not available for "MAP", but the second one is.


watanabe-j/eigvaldisp documentation built on Dec. 8, 2023, 4:38 a.m.