VE: Calculate eigenvalue dispersion indices
In watanabe-j/eigvaldisp: Statistics of Eigenvalue Dispersion Indices

VE	R Documentation

Description

Calculates the eigenvalue variance V and the relative eigenvalue variance V_rel of a covariance/correlation matrix, from a data matrix X, a covariance/correlation matrix S, or a vector of eigenvalues L.

Usage

VE(
  X,
  S,
  L,
  center = TRUE,
  scale. = FALSE,
  divisor = c("UB", "ML"),
  m = switch(divisor, UB = N - 1, ML = N),
  nv = 0,
  sub = seq_len(length(L)),
  drop_0 = FALSE,
  tol = .Machine$double.eps * 100,
  check = TRUE
)

Arguments

`X`	Data matrix from which the covariance/correlation matrix is obtained
`S`	Covariance/correlation matrix
`L`	Vector of eigenvalues
`center`	Logical to specify whether sample-mean-centering should be done
`scale.`	Logical to specify whether SD-scaling should be done (that is, when `TRUE`, the analysis is on the correlation matrix). When `S` is provided (but `X` is not), `S` is converted to a correlation matrix by `stats::cov2cor()`.
`divisor`	Either `"UB"` (default) or `"ML"`, to decide the default value of `m`
`m`	Divisor for the sample covariance matrix (`n_*` in Watanabe (2022))
`nv`	Numeric to specify how many eigenvectors are to be retained; default `0`
`sub`	Numeric/integer vector to specify the range of eigenvalue indices to be involved; used to exclude some subspace
`drop_0`	Logical, when `TRUE`, eigenvalues smaller than `tol` are dropped
`tol`	Tolerance to be used with `drop_0`
`check`	Logical to specify whether structures of `X`, `S`, and `L` are checked (see “Details”)
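
The `scale.` entry above says that a supplied `S` is converted with `stats::cov2cor()`. The following base-R sketch only illustrates what that conversion does; the simulated data and object names are assumptions for illustration and are not part of the package.

```r
# Illustrative data (assumed): 20 observations on 3 variables
set.seed(42)
X <- matrix(rnorm(60), nrow = 20, ncol = 3)

S <- cov(X)              # sample covariance matrix
R <- stats::cov2cor(S)   # rescaled to unit diagonal

# The rescaled matrix agrees with the sample correlation matrix
isTRUE(all.equal(R, cor(X)))

# Eigenvalues of a correlation matrix sum to p, so their mean is 1
L_R <- eigen(R, symmetric = TRUE)$values
mean(L_R)
```

Because the trace of a correlation matrix equals the number of variables, the mean eigenvalue is fixed at 1 whenever the analysis is on the correlation matrix.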

Details

Provide one of a data matrix (X), a covariance/correlation matrix (S), or a vector of eigenvalues (L). These arguments take precedence in this order; if more than one is provided, those with lower precedence are ignored with a warning.

X must be a 2D numeric matrix, with rows representing observations and columns variables. When relevant, S must be a valid covariance/correlation matrix. With check = TRUE (default), basic structure checks are done on X, S, or L:

If X or S (when relevant) is a non-2D matrix/array, an error is returned.
If S (when relevant) is not symmetric, an error is returned.
If L (when relevant) is not a vector-like array, a warning is issued, although a result is returned (a column/row vector is tolerated without warning).
If X is a symmetric matrix (i.e., it looks like S), a warning is issued, although a result is returned.

For the sake of speed, the checks can be turned off with check = FALSE.
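
The kind of structure check described above can be sketched in base R. This is not the package's own code: `check_S()` is a hypothetical helper written only to illustrate the two error conditions listed for S (not 2D, not symmetric).

```r
# Hypothetical helper (not part of eigvaldisp): basic checks on S
check_S <- function(S) {
  # A valid S has exactly two dimensions
  if (length(dim(S)) != 2L) stop("S must be a 2D matrix")
  # isSymmetric() also returns FALSE for non-square matrices
  if (!isSymmetric(unname(S))) stop("S must be symmetric")
  invisible(TRUE)
}

check_S(diag(2))                       # passes silently
ok_asym <- tryCatch(check_S(matrix(1:4, 2)), error = function(e) "error")
ok_3d   <- tryCatch(check_S(array(0, c(2, 2, 2))), error = function(e) "error")
```

Checks like these cost a comparison over the whole matrix, which is why skipping them can matter when the function is called many times, for example in simulations.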

When X is given, the default divisor is N - 1 where N is the sample size.
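
The effect of the divisor can be checked in base R. The sketch below uses simulated data (an assumption for illustration) and builds the covariance matrix by hand with divisors N - 1 and N, which correspond to the `"UB"` and `"ML"` defaults for `m` in the usage above.

```r
# Illustrative data (assumed): N = 20 observations on 3 variables
set.seed(1)
N <- 20
X <- matrix(rnorm(N * 3), nrow = N)

# Center columns, then form cross-products with each divisor
Xc <- scale(X, center = TRUE, scale = FALSE)
S_ub <- crossprod(Xc) / (N - 1)  # unbiased estimator, same as cov(X)
S_ml <- crossprod(Xc) / N        # maximum likelihood estimator

L_ub <- eigen(S_ub, symmetric = TRUE)$values
L_ml <- eigen(S_ml, symmetric = TRUE)$values

# Eigenvalues differ only by the constant factor (N - 1) / N
isTRUE(all.equal(L_ml, L_ub * (N - 1) / N))
```

Since changing the divisor only rescales every eigenvalue by the same constant, V changes by the square of that constant while V_rel is unaffected.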

Sometimes it is desirable to evaluate eigenvalue dispersion in a selected subspace rather than in the full space. For this, provide the argument sub to restrict calculations to the subspace corresponding to the specified eigenvalues/vectors. Alternatively, set drop_0 = TRUE to drop zero eigenvalues from the calculation. The former is more useful when the subspace of interest is known a priori. The latter is ad hoc: it automatically drops eigenvalues smaller than the specified tolerance tol.
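
The two ways of restricting the calculation can be mimicked on a plain vector of eigenvalues. This is a minimal sketch in base R, assuming that dropping means discarding eigenvalues smaller than `tol` in absolute terms; `v_rel()` is a hypothetical helper that applies the V_rel formula given under “Value”, not a package function.

```r
# Hypothetical helper: relative eigenvalue variance of a vector L
v_rel <- function(L) {
  p <- length(L)
  sum((L - mean(L))^2) / p / ((p - 1) * mean(L)^2)
}

L <- c(4, 2, 1, 0)                 # one exactly zero eigenvalue
tol <- .Machine$double.eps * 100   # same default as in the usage above

L_sub  <- L[1:3]       # a priori choice of the first three eigenvalues
L_drop <- L[L >= tol]  # ad hoc: discard eigenvalues smaller than tol

vr_full <- v_rel(L)      # full space, p = 4
vr_sub  <- v_rel(L_sub)  # subspace, p = 3
identical(L_sub, L_drop) # both routes keep the same eigenvalues here
```

In this toy case the two routes coincide because the only small eigenvalue is the last one; with sample data they can differ when the tolerance does not match the intended subspace.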

Value

A list containing the following numeric objects:

$VE: Eigenvalue variance (V): sum( (L - meanL)^2 ) / length(L)
$VR: Relative eigenvalue variance (V_rel): VE / ( (length(L) - 1) * meanL^2)
$meanL: Mean (average) of the eigenvalues L
$L: Vector of eigenvalues
$U: Matrix of eigenvectors, appended only when nv > 0
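
The formulas for `$VE`, `$VR`, and `$meanL` above can be reproduced directly in base R. The sketch below is only an illustration of those formulas; `eig_disp()` is a hypothetical name and not the package's implementation.

```r
# Hypothetical helper: dispersion indices from a vector of eigenvalues
eig_disp <- function(L) {
  p <- length(L)
  meanL <- mean(L)
  VE <- sum((L - meanL)^2) / p       # eigenvalue variance V
  VR <- VE / ((p - 1) * meanL^2)     # relative eigenvalue variance V_rel
  list(VE = VE, VR = VR, meanL = meanL, L = L)
}

res <- eig_disp(c(4, 2, 1, 1))
res$meanL  # 2
res$VE     # (4 + 0 + 1 + 1) / 4 = 1.5
res$VR     # 1.5 / (3 * 2^2) = 0.125
```

With this scaling, V_rel is 0 when all eigenvalues are equal and 1 when only one eigenvalue is nonzero.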

References

Cheverud, J. M., Rutledge, J. J. and Atchley, W. R. (1983) Quantitative genetics of development: genetic correlations among age-specific trait values and the evolution of ontogeny. Evolution 37, 5–42. doi:10.1111/j.1558-5646.1983.tb05619.x.

Haber, A. (2011) A comparative analysis of integration indices. Evolutionary Biology 38, 476–488. doi:10.1007/s11692-011-9137-4.

Pavlicev, M., Cheverud, J. M. and Wagner, G. P. (2009) Measuring morphological integration using eigenvalue variance. Evolutionary Biology 36, 157–170. doi:10.1007/s11692-008-9042-7.

Van Valen, L. (1974) Multivariate structural statistics in natural history. Journal of Theoretical Biology 45, 235–247. doi:10.1016/0022-5193(74)90053-8.

Wagner, G. P. (1984) On the eigenvalue distribution of genetic and phenotypic dispersion matrices: evidence for a nonrandom organization of quantitative character variation. Journal of Mathematical Biology 7, 77–95. doi:10.1007/BF00275224.

Watanabe, J. (2022) Statistics of eigenvalue dispersion indices: quantifying the magnitude of phenotypic integration. Evolution 76, 4–28. doi:10.1111/evo.14382.

Examples

# For a population covariance matrix or population eigenvalues
set.seed(6835)
Lambda <- c(4, 2, 1, 1)
(Sigma <- GenCov(evalues = Lambda, evectors = "random"))
VE(L = Lambda)
VE(S = Sigma) # Same

# For a random sample, sample covariance matrix or its eigenvalues
N <- 20
X <- rmvn(N = N, Sigma = Sigma)
S <- cov(X)
L <- eigen(S)$values
VE(X = X)
VE(S = S) # Same
VE(L = L) # Same
# Thus, providing X is usually the most straightforward for a sample
# (and this is usually quicker for p > 30-50 or so).
# Also, observe bias in these quantities compared to population values.

# Same for maximum likelihood estimator (divisor m = N)
VE(X = X, divisor = "ML")
VE(X = X, m = N)        # Same, but any divisor can be specified
VE(S = S * (N - 1) / N) # Same
# L and meanL are (N - 1) / N times the above,
# VE is ((N - 1) / N) ^ 2 times the above, whereas VR remains the same.

# For a sample correlation matrix
R <- cor(X)
VE(S = R)
VE(X = X, scale. = TRUE) # Same, and usually quicker when p is large

# Interested in eigenvectors?
VE(X = X, nv = 2)

# Singular covariance matrix
Lambda2 <- c(4, 2, 1, 0)
(Sigma2 <- GenCov(evalues = Lambda2, evectors = "random"))
VE(S = Sigma2)                # Calculated in the full space
VE(S = Sigma2, sub = 1:3)     # In the subspace of the first 3 PCs
VE(S = Sigma2, drop_0 = TRUE) # Dropping zero eigenvalues (same in this case)

# Sample from singular covariance
X2 <- rmvn(N = N, Sigma = Sigma2, sqrt_method = "pivot")
VE(X = X2)                    # In the full space
VE(X = X2, sub = 1:3)         # In the subspace of the first 3 PCs
VE(X = X2, drop_0 = TRUE)     # Practically the same

# Just to note, the null space is identical between the population and sample
# in this case (where N - 1 > p)
eigen(Sigma2)$vectors[, 4]
eigen(cov(X2))$vectors[, 4]

# This is of course not the case when N - 1 < p, although
# a sample null space always encompasses the population null space.
Lambda3 <- 9:0
Sigma3 <- GenCov(evalues = Lambda3, evectors = "random")
X3 <- rmvn(N = 6, Sigma = Sigma3, sqrt_method = "pivot")
eigS3 <- eigen(cov(X3))
(Popul_null <- eigen(Sigma3)$vectors[, Lambda3 < 1e-12])
(Sample_null <- eigS3$vectors[, eigS3$values < 1e-12])
crossprod(Popul_null, Sample_null)
# None of these vectors are identical, but
tcrossprod(crossprod(Popul_null, Sample_null))
# sum of squared cosines equals 1, as expected

watanabe-j/eigvaldisp documentation built on Dec. 8, 2023, 4:38 a.m.