$$
\DeclareMathOperator{\E}{E}
\DeclareMathOperator{\Var}{Var}
\DeclareMathOperator{\Cov}{Cov}
\DeclareMathOperator{\tr}{tr}
\DeclareMathOperator*{\diag}{diag}
\DeclareMathOperator{\V}{\mathit{V}}
\DeclareMathOperator{\Vr}{\mathit{V}_\mathrm{rel}}
\DeclareMathOperator{\F}{{}_2\mathit{F}_1}
\newcommand{\m}[1]{\mathbf{#1}}
$$
This document describes the R package `eigvaldisp`, which is at present available on GitHub. The package facilitates statistical analyses of eigenvalue dispersion indices, which are common measures of phenotypic integration (covariation) in biometrics.
```r
library(stats)
library(eigvaldisp)
```
Quantification of the magnitude of phenotypic integration (covariation) plays a central role in the study of complex traits. One of the most common classes of statistics used for this purpose is the eigenvalue dispersion indices [@Cheverud1983; @Wagner1984; @Pavlicev2009evdisp; @Haber2011; @Watanabe2021]. They are calculated from the eigenvalues of a covariance or correlation matrix, and take one of two general forms:
$$
\V = \frac{1}{p}\sum_{i=1}^p{(\lambda_i - \bar{\lambda})^2} = \frac{1}{p}\sum_{i=1}^p{\lambda_i^2} - \frac{1}{p^2}\left( \sum_{i=1}^p{\lambda_i} \right)^2 , \\
\Vr = \frac{\sum_{i=1}^p{(\lambda_i - \bar{\lambda})^2}}{p(p - 1) \bar{\lambda}^2} = \frac{1}{p - 1} \left( p \frac{\sum_{i=1}^p{\lambda_i}^2}{\left(\sum_{i=1}^p{\lambda_i}\right)^2} - 1 \right) ,
$$
where $p$ is the number of variables (traits), $\lambda_i$ is the $i$th eigenvalue of the covariance/correlation matrix under analysis, and $\bar{\lambda}$ is the average of the eigenvalues ($\bar{\lambda} = \sum_{i=1}^p{\lambda_i}/p$). Here, $\V$ is the most naive definition and ranges from $0$ to $(p - 1)\bar{\lambda}^2$, whereas $\Vr$ is scaled by this maximum to range from $0$ to $1$.
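To make these definitions concrete, here is a minimal base-R sketch computing both indices directly from a vector of eigenvalues (the hypothetical helper `eigval_disp()` is for illustration only; the package's `VE()` function, introduced below, does this and more):

```r
## Minimal sketch: V and Vrel computed directly from eigenvalues
eigval_disp <- function(lambda) {
  p <- length(lambda)
  ssq <- sum((lambda - mean(lambda))^2)  ## sum of squared deviations
  c(V  = ssq / p,                        ## eigenvalue variance
    VR = ssq / (p * (p - 1) * mean(lambda)^2))  ## relative version, in [0, 1]
}
eigval_disp(c(4, 2, 1, 1))  ## V = 1.5, VR = 0.125
```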
Here, $\V$ and $\Vr$ are called the eigenvalue variance and the relative eigenvalue variance, respectively. Some authors prefer constant multiples or square roots of these indices, but these essentially convey identical information, so they are not discussed here. Synonyms in this broad sense include the tightness [@VanValen1974; @VanValen2005], the integration coefficient of variation [@ShiraiMarroig2010], and the phenotypic integration index [@ToricesMunozPajares2015].
As the eigenvalues of a covariance/correlation matrix represent the variance along the corresponding eigenvectors (principal components), the dispersion of eigenvalues has an intuitive interpretation as a measure of eccentricity of variation. As such, these indices have been used by biometricians as indicators of the magnitude of integration. (As seen below, however, there are more rigorous theoretical justifications that have long been known to statisticians.)
In what follows, we distinguish population and sample quantities. The population covariance and correlation matrices are denoted by $\m{\Sigma}$ and $\m{P}$ (the uppercase Greek rho), respectively, and the sample covariance and correlation matrices by $\m{S}$ and $\m{R}$. We also use $\m{\Lambda}$ and $\lambda$ for the eigenvalue matrix and individual eigenvalues in the population, and $\m{L}$ and $l$ for those in the sample. With this notation, we write, for example, $\V(\m{\Sigma})$ or $\Vr(\m{S})$ to designate eigenvalue dispersion indices of the relevant matrices.
To get some idea, let's generate a simple population covariance matrix with the function `GenCov()` included in this package, which constructs a covariance/correlation matrix with known eigenvalues/eigenvectors, and then calculate the eigenvalue dispersion indices of this matrix with the function `VE()`:
```r
set.seed(30)
## Generate a population covariance matrix with known eigenvalues
Lambda <- c(4, 2, 1, 1)
(Sigma <- GenCov(evalues = Lambda, evectors = "random"))
eigen(Sigma)$values
## Calculate eigenvalue dispersion indices of this matrix
EDI_pop <- VE(S = Sigma)
## Eigenvalue variance ("V(Sigma)")
EDI_pop$VE
## Relative eigenvalue variance ("Vrel(Sigma)"):
EDI_pop$VR
```
It is trivial to calculate these eigenvalue dispersion indices from the eigenvalues with the definitions above. Then, what is the point of having a (or another) package for this?
The problem becomes evident when one comes to estimating population eigenvalue dispersion indices from a sample. It is well recognized that sample eigenvalues are always estimated with sampling bias and error [see, e.g., @Jolliffe2002]. Therefore, even if all population eigenvalues are equal ($\V(\m{\Sigma}) = \Vr(\m{\Sigma}) = 0$), the sample estimates of these quantities are always (or almost surely) positive.
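For instance, a quick base-R sketch under sphericity illustrates this:

```r
## Sphericity: Sigma = I_4, so V(Sigma) = Vrel(Sigma) = 0
set.seed(1)
X0 <- matrix(rnorm(20 * 4), nrow = 20)  ## N = 20, p = 4
l0 <- eigen(cov(X0))$values
sum((l0 - mean(l0))^2) / (4 * 3 * mean(l0)^2)  ## Vrel(S) > 0 nonetheless
```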
To see this point, let us simulate a small multivariate normal sample from the population covariance matrix generated above (via the function `rmvn()`):
```r
## Simulate a multivariate normal sample
N <- 20
X <- rmvn(N = N, Sigma = Sigma)
cov(X)  ## Reasonable estimate of Sigma
## Calculating eigenvalue dispersion indices from the sample
EDI_sam <- VE(X = X)  ## Same as VE(S = cov(X)) but usually faster
## Sample eigenvalue variance ("V(S)")
EDI_sam$VE
## Sample relative eigenvalue variance ("Vrel(S)")
EDI_sam$VR
```
Observe how these sample quantities overestimate the population values. The presence of this estimation bias hinders interpretation of sample quantities.
The main functionality of this package is to calculate the amount of estimation bias, as well as error, based on the theoretical results of @Watanabe2021, who derived the first two moments (or approximations thereof) of $\V(\m{S})$, $\Vr(\m{S})$, and $\Vr(\m{R})$ under multivariate normality. (Note that, for any correlation matrix, $\bar{\lambda} = \bar{l} = 1$, so that $\V(\m{R}) = (p-1)\Vr(\m{R})$; hence $\V(\m{R})$ does not require separate treatment.)
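As a quick numerical check of this identity with the sample above:

```r
## V(R) = (p - 1) * Vrel(R), since the mean eigenvalue of R is 1
EDI_R <- VE(S = cor(X))
all.equal(EDI_R$VE, (ncol(X) - 1) * EDI_R$VR)
```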
The expectation and variance of the eigenvalue dispersion indices are implemented in the `Exv.*()` and `Var.*()` families of functions, which take suffixes like `*.VES` (for $\V(\m{S})$), `*.VRS` (for $\Vr(\m{S})$), and `*.VRR` (for $\Vr(\m{R})$) (more details below). These can be used as follows:
```r
## Expectation of eigenvalue variance ("E[V(S)]")
## The argument n is for the degrees of freedom, hence N - 1 in this case
(E_V_Sigma <- Exv.VES(Sigma = Sigma, n = N - 1))
## Expected bias
E_V_Sigma - EDI_pop$VE
## Error (sampling variance) of eigenvalue variance ("Var[V(S)]")
Var.VES(Sigma, N - 1)
## Same for relative eigenvalue variance ("E[Vrel(S)]", "Var[Vrel(S)]")
(E_Vrel_Sigma <- Exv.VRS(Sigma, N - 1))
E_Vrel_Sigma - EDI_pop$VR
Var.VRS(Sigma, N - 1)
```
Note how close the above sample estimates are to these theoretical expectations, given the sampling variance (perhaps too fortuitously close). These functions aim to facilitate correct appreciation of the amount of sampling bias/error and interpretation of empirical results.
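One informal way to see this is to standardize the deviation of the estimate from its expectation by the theoretical standard deviation (a rough sketch, not a formal test):

```r
## Deviation of the sample estimate from its expectation,
## in units of the theoretical standard deviation
(EDI_sam$VE - E_V_Sigma) / sqrt(Var.VES(Sigma, N - 1))
```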
Also, "bias-corrected" estimators devised by @Watanabe2021 are also implemented, although this is not globally unbiased except for $\V(\m{S})$:
```r
## Bias-corrected eigenvalue variance
VESa(X = X)$VESa
## Its expectation (equals the population value)
Exv.VESa(Sigma, N - 1)
## Its variance (smaller than that of the ordinary one)
Var.VESa(Sigma, N - 1)
## Adjusted relative eigenvalue variance
VRSa(X = X)$VRSa
## Its expectation (underestimates the population value)
Exv.VRSa(Sigma, N - 1)
## Its variance
Var.VRSa(Sigma, N - 1)
```
The same functionalities are also available for correlation matrices, but via different functions (`Exv.VRR()`, `Var.VRR()`, `VRRa()`, etc.), since their distributions are different.
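For example, with the population correlation matrix corresponding to `Sigma` above:

```r
## Moments of Vrel(R) and the adjusted estimator from the sample
Rho <- cov2cor(Sigma)
Exv.VRR(Rho, n = N - 1)
Var.VRR(Rho, n = N - 1)
VRRa(X = X)$VRRa
```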
## `VE()`: calculate eigenvalue dispersion indices

This function calculates eigenvalue dispersion indices from either a data matrix `X`, a covariance/correlation matrix `S`, or a vector of eigenvalues `L`. The output is a list with the following elements:

- `VE`: eigenvalue variance $\V$
- `VR`: relative eigenvalue variance $\Vr$
- `meanL`: average eigenvalue $\bar{\lambda}$ or $\bar{l}$
- `L`: vector of eigenvalues $(\lambda_1, \dotsc, \lambda_p)$ or $(l_1, \dotsc, l_p)$

This function does not distinguish population and sample (although providing `X` implies it is about the sample). When `X` or `S` is provided, eigenvalues of the covariance matrix are extracted by default, with `svd()`. Those of the correlation matrix can be obtained by setting the argument `scale.` to `TRUE`.
With the above example,
```r
## Eigenvalue dispersion indices of the population covariance matrix
str(VE(S = Sigma))
## Of the population correlation matrix
## Same as VE(S = cov2cor(Sigma))
str(VE(S = Sigma, scale. = TRUE))
## Of the sample covariance matrix
str(VE(X = X))
## Of the sample correlation matrix
str(VE(X = X, scale. = TRUE))
## The following pairs are equivalent to each other
all.equal(VE(X = X), VE(S = cov(X)))
all.equal(VE(X = X), VE(L = eigen(cov(X))$values))
all.equal(VE(X = X, scale. = TRUE), VE(S = cor(X)))
```
This function can also return a user-specified number of leading eigenvectors when the argument `nv` is set to the desired number:
```r
str(VE(X = X, nv = 2))
```
Optionally, the calculation of eigenvalue dispersion can be restricted to the subspace corresponding to the nonzero eigenvalues (or indeed any arbitrary subspace can be specified). See the documentation of this function for details.
## `VESa()`, `VRSa()`, `VRRa()`: "bias-corrected" dispersion indices

These functions are similar to `VE()` (and actually call that function internally), but also return bias-corrected or adjusted versions of the eigenvalue dispersion indices: the bias-corrected eigenvalue variance of a covariance matrix $\tilde V(\m{S})$ (`VESa()`), the adjusted relative eigenvalue variance of a covariance matrix $\bar V_\mathrm{rel}(\m{S})$ (`VRSa()`), and the adjusted relative eigenvalue variance of a correlation matrix $\bar V_\mathrm{rel}(\m{R})$ (`VRRa()`). The output is a similar list, but with the corresponding index appended: e.g., the output of `VRSa()` has the element `$VRSa`:
```r
str(VESa(X = X))
str(VRSa(X = X))
str(VRRa(X = X))  ## Automatically sets scale. = TRUE
```
## `Exv.*()` and `Var.*()` families: expectation and variance

These families of functions calculate the expectation (mean) and variance of eigenvalue dispersion indices, given a population covariance/correlation matrix and the degrees of freedom.
Here is a list of the functions:
Statistic                     | Expectation  | Variance
------------------------------|--------------|-------------
$\V(\m{S})$                   | `Exv.VES()`  | `Var.VES()`
$\Vr(\m{S})$                  | `Exv.VRS()`  | `Var.VRS()`
$\V(\m{R})$                   | `Exv.VER()`  | `Var.VER()`
$\Vr(\m{R})$                  | `Exv.VRR()`  | `Var.VRR()`
$\tilde{V}(\m{S})$            | `Exv.VESa()` | `Var.VESa()`
$\bar{V}_\mathrm{rel}(\m{S})$ | `Exv.VRSa()` | `Var.VRSa()`
$\bar{V}_\mathrm{rel}(\m{R})$ | `Exv.VRRa()` | `Var.VRRa()`
Note: The functions for $\V(\m{R})$ are retained for completeness, but this statistic is almost redundant with $\Vr(\m{R})$ (above). No functions for a "bias-corrected" version of this statistic have been implemented.
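For instance, the identity $\V(\m{R}) = (p - 1)\Vr(\m{R})$ carries over to the expectations, which can be checked numerically (a small sketch with the matrices from above):

```r
## E[V(R)] = (p - 1) * E[Vrel(R)]
Exv.VER(Rho = Rho, n = N - 1)
(nrow(Rho) - 1) * Exv.VRR(Rho = Rho, n = N - 1)
```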
These functions typically take two arguments: the population covariance or correlation matrix (`Sigma` or `Rho`, as appropriate) and the degrees of freedom `n`. They are (pseudo-)vectorized with respect to `n`, so that the output is a vector of the same length as `n`.
They can take a vector of population eigenvalues as the argument `Lambda`, instead of `Sigma` or `Rho`. For functions pertaining to $\V(\m{S})$ and $\Vr(\m{S})$, the results are identical (and indeed this leads to a slight improvement in speed when `Lambda` is already available). On the other hand, this is usually not recommended for functions pertaining to $\Vr(\m{R})$, because the distribution of this index depends on population eigenvectors as well as population eigenvalues, except under certain restrictive conditions [see @Watanabe2021]. If the latter functions are provided with `Lambda`, a population correlation matrix is generated with random eigenvectors, which in general leads to random fluctuation in the results (see examples below).
All these functions return exact moments under the null condition of sphericity or no correlation ($\m{\Sigma} = \sigma^2\m{I}_p$ and $\m{P} = \m{I}_p$, respectively, with $\m{I}_p$ being the identity matrix). Otherwise, they return either exact (`Exv.VES()`, `Var.VES()`, `Exv.VRR()`) or asymptotic moments (`Exv.VRS()`, `Var.VRS()`); the accuracy of the latter can be suboptimal when `n` is small (though they are the best implementations to my knowledge). `Var.VRR()` is trickier: it returns the exact variance when $p = 2$ or under the null condition, but otherwise returns an asymptotic variance with one of the functions described below.
Note that, throughout the package, the lowercase `n` is used for the degrees of freedom, whereas the capital `N` is for the sample size, following the convention in some statistical literature [and @Watanabe2021]. The use of `n` enables flexibility, as well as making the user aware of the distinction between these quantities. Most typically, `n` will be `N - 1`, but other choices are not uncommon in applications---e.g., when analyzing a pooled covariance matrix across different groups, or when analyzing regression residuals or partial correlations.
Examples for covariance matrices (also see the examples above):
```r
Ns <- c(10, 20, 50, 100)
## Moments for different sample sizes for the covariance matrix
Exv.VES(Sigma, Ns - 1)
Var.VES(Sigma, Ns - 1)
EDI_pop$VE  ## Population value for comparison
Exv.VRS(Sigma, Ns - 1)
Var.VRS(Sigma, Ns - 1)
EDI_pop$VR  ## Population value
## For the correlation matrix
Rho <- cov2cor(Sigma)
Exv.VRR(Rho, Ns - 1)
Var.VRR(Rho, Ns - 1)
VE(S = Rho)$VR  ## Population value
## Comparing results from the covariance matrix and its eigenvalues
Exv.VRS(Sigma = Sigma, n = Ns - 1)
Exv.VRS(Lambda = Lambda, n = Ns - 1)
## These are equal (up to potential rounding error)
## From the correlation matrix and its eigenvalues
Exv.VRR(Rho = Rho, n = Ns - 1)
Exv.VRR(Lambda = eigen(Rho)$values, n = Ns - 1)
## These are NOT equal. Note the warning
## Indeed, the latter's result varies from run to run
## Profile of E[Vrel(S)] in simple conditions: p = 4 and n = 10,
## with a single large (spiked) population eigenvalue
xseq <- seq(0, 1, 0.01)
plot(xseq,
     sapply(xseq, function(x) Exv.VRS(GenCov(p = 4, VR = x), n = 10)),
     xlim = c(0, 1), ylim = c(0, 1),
     xlab = "Vrel(Sigma)", ylab = "E[Vrel(S)]",
     type = "l", asp = 1, col = "tomato")
abline(a = 0, b = 1, col = "gray", lty = 2)
## Same, adjusted index
plot(xseq,
     sapply(xseq, function(x) Exv.VRSa(GenCov(p = 4, VR = x), n = 10)),
     xlim = c(0, 1), ylim = c(0, 1),
     xlab = "Vrel(Sigma)", ylab = "E[Vrel_adj(S)]",
     type = "l", asp = 1, col = "tomato")
abline(a = 0, b = 1, col = "gray", lty = 2)
```
## `AVar.VRR_*()` families: asymptotic variance of $\Vr(\m{R})$

There are several internal functions to calculate the asymptotic variance of $\Vr(\m{R})$, which are primarily to be called by the front-end `Var.VRR()` (above).
These accommodate two different approaches to obtaining the asymptotic variance of $\Vr(\m{R})$, as described by @Watanabe2021: one based on the approach of @PanFrank2004, and the other based on the theory of @Konishi1979. Empirically, the former tends to be more accurate in various conditions but requires nontrivial computational time when there are many variables ($p > 200$ or so), whereas the latter tends to be slightly inaccurate but can be evaluated instantly. The two methods are usually used via `Var.VRR(..., method = "Pan-Frank")` (default) and `Var.VRR(..., method = "Konishi")`.
At present, the approach of @PanFrank2004 is implemented in the functions `AVar.VRR_pf*()`. By default, `Var.VRR(..., method = "Pan-Frank")` calls `AVar.VRR_pfd()`, which should be fast enough for $p$ up to ~100 or so. `AVar.VRR_pf()` and `AVar.VRR_pfv()` are equivalent, but are substantially slower or require too much RAM as $p$ grows. They are retained for demonstrative purposes, as they show more clearly how the algorithm is implemented. `AVar.VRR_pfc()` is an interface to a much faster implementation with C++ functions, which are provided in a separate extension package, `eigvaldispRcpp`. This enables evaluation for $p < 500$ (say) on the order of seconds or minutes (although it still takes some time for larger matrices).
The asymptotic variance from the theory of @Konishi1979 is also implemented in several functions, of which only `AVar.VRR_klv()`, called by `Var.VRR(..., method = "Konishi")` by default, would be of practical use.
For completeness, non-default functions can be called by using the argument `fun` in `Var.VRR()` to specify the suffix of the function, e.g., `Var.VRR(..., fun = "pfv")` to call `AVar.VRR_pfv()`.
```r
## Two different approaches with slightly different outcomes (in this case)
Var.VRR(Rho, Ns - 1, method = "Pan-Frank")  ## Calls AVar.VRR_pfd()
Var.VRR(Rho, Ns - 1, method = "Konishi")    ## Calls AVar.VRR_klv()
```
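A specific non-default implementation can be requested via the argument `fun` described above:

```r
## Explicitly calling the vectorized Pan-Frank implementation
Var.VRR(Rho, Ns - 1, fun = "pfv")  ## Calls AVar.VRR_pfv()
```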
## `Exv.r*()` and the like: moments of the correlation coefficient

Moments of $\Vr(\m{R})$ are usually calculated from those of correlation coefficients, for which this package has >15 internal functions.
These functions typically take as arguments the degrees of freedom `n` and the population correlation coefficient `x`. Asymptotic versions also take the order of asymptotic approximation, `order.` (which corresponds to the exponent $m$ in $O(n^m)$). For further details, see `help(Exv.rx)`.
## `simulateVE()`: simulating eigenvalue dispersion indices

This function conducts Monte Carlo simulations of eigenvalue dispersion indices with multivariate normal populations (and was used in the simulations of @Watanabe2021).
This function usually takes three arguments: `b`, the number of simulations; `Sigma`, the population covariance matrix; and `N`, the sample size. Alternatively, parametric bootstrapping can be performed by providing a data matrix `X` (in which case `Sigma` and `N` are not required).
It returns a list of the following elements:

- `VES`: eigenvalue variance of the covariance matrix, $\V(\m{S})$
- `VRS`: relative eigenvalue variance of the covariance matrix, $\Vr(\m{S})$
- `LS`: eigenvalues $(l_1, \dotsc, l_p)$ of the covariance matrix $\m{S}$
- `VER`: eigenvalue variance of the correlation matrix, $\V(\m{R})$
- `VRR`: relative eigenvalue variance of the correlation matrix, $\Vr(\m{R})$
- `LR`: eigenvalues $(l_1, \dotsc, l_p)$ of the correlation matrix $\m{R}$

Optionally, eigenvectors can be calculated and returned, at the cost of computational time and RAM.
Example with a rather small `b`:
```r
simres <- simulateVE(10L, Sigma = Sigma, N = N)
str(simres)
```
This function generates random normal variates with the function `rmvn()` described below, and calculates eigenvalue dispersion indices via `svd()`. The first step could be skipped by using `stats::rWishart()` for this relatively simple task, but the extra step is retained because there is a future plan to extend the algorithm to matrix-variate normal conditions (where both columns and rows are correlated).
## `rmvn()`: generating a multivariate normal sample

This function generates a multivariate normal sample with known parameters.
`rmvn()` is rather similar to `mvtnorm::rmvnorm()` (by convergence, in biologists' words), but has been designed to work optimally in simple simulation studies. By default, `rmvn()` uses the Cholesky factorization `chol()` to obtain a matrix square root of the covariance matrix (and this is slightly faster than `mvtnorm::rmvnorm(..., method = "chol")`). Options for different algorithms are available, as the default method fails when the covariance matrix is singular (the user is assumed to know when this happens). On the other hand, `mvtnorm::rmvnorm()` is robust in this sense, as its default algorithm can appropriately handle singular covariance matrices.
In addition, `rmvn()` can directly take the matrix square root as the argument `cSigma`, which can be useful when numerous simulations are run with the same population covariance matrix.
Quick examples:
```r
## Full-rank covariance matrix
Sigma1 <- GenCov(evalues = c(4, 2, 1, 1), evectors = "random")
X1 <- rmvn(10, Sigma = Sigma1)
## Singular covariance matrix (with only 1 eigenvalue equal to zero)
Sigma2 <- GenCov(evalues = c(4, 2, 1, 0), evectors = "random")
## Not run: rmvn(10, Sigma = Sigma2)  ## Fails
X2 <- rmvn(10, Sigma = Sigma2, sqrt_method = "chol_piv")
## Singular covariance matrix (with >1 eigenvalue equal to zero)
Sigma3 <- GenCov(evalues = c(4, 2, 0, 0), evectors = "random")
X3 <- rmvn(10, Sigma = Sigma3, sqrt_method = "chol_piv")  ## May fail
X3 <- rmvn(10, Sigma = Sigma3, sqrt_method = "matsqrt")   ## Better
```
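As noted above, the matrix square root can be pre-computed and passed via `cSigma` when many samples are drawn from the same population (a sketch, assuming the default Cholesky-based method):

```r
## Reuse a pre-computed matrix square root across simulations
cS <- chol(Sigma1)
X4 <- rmvn(10, cSigma = cS)
```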
A brief summary of the results used in this package is given here. For full theoretical details, see @Watanabe2021.
It is seen from the definition above that $\Vr(\m{S})$ is a linear function of the quantity $\sum_{i=1}^p{l_i}^2/\left(\sum_{i=1}^p{l_i}\right)^2 = \tr(\m{S}^2)/\left(\tr\m{S}\right)^2$. This quantity is known in the statistical literature [@John1971; @John1972; @Sugiura1972; @Nagao1973; @LedoitWolf2002] as the locally most powerful test statistic for sphericity (the null hypothesis $\m{\Sigma} = \sigma^2\m{I}_p$ with arbitrary $\sigma^2$) under multivariate normality, among those classes of statistics that are invariant under translation, rotation, and scaling. Hence, it is a good measure of the deviation from a spherical covariance matrix.
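This correspondence is easy to verify numerically (a base-R sketch using the sample covariance matrix from above):

```r
## Vrel(S) as a linear function of tr(S^2) / (tr S)^2
S <- cov(X)
p <- ncol(S)
(p * sum(S * S) / sum(diag(S))^2 - 1) / (p - 1)  ## sum(S * S) = tr(S^2)
VE(S = S)$VR  ## Same value
```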
On the other hand, it is possible to show that $\Vr(\m{R}) = \frac{2}{p(p-1)} \sum_{i<j}^p{r_{ij}^2}$, where $r_{ij}$ is the $(i,j)$th element of $\m{R}$ [@GleasonStaelin1975; @DurandLeRoux2017]. That is, this is the average squared correlation, which has a straightforward interpretation as an index of integration.
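Again, a one-line check:

```r
## Vrel(R) equals the average squared off-diagonal correlation
R <- cor(X)
mean(R[lower.tri(R)]^2)
VE(S = R)$VR  ## Same value
```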
These symbols are used in what follows:

- $n$: degrees of freedom
- $n_*$: divisor of the sample covariance matrix, so that $\m{S} = \m{A}/n_*$ (most typically $n_* = n$)
- $\m{A}$: cross-product (sums-of-squares-and-products) matrix, $\m{A} = n_*\m{S}$
- $\m{\Lambda} = \diag(\lambda_1, \dotsc, \lambda_p)$: diagonal matrix of the population eigenvalues
- $\upsilon_{i\alpha}$: the $i$th element of the $\alpha$th eigenvector of the population matrix
- $\delta_{\alpha\beta}$: the Kronecker delta
The first two moments of $\V(\m{S})$ are:
$$
\begin{align}
\E[\V(\m{S})] =& \frac{n}{p^2n_*^2}\left[ (p-n) (\tr\m{\Lambda})^2 + (pn + p - 2) \tr(\m{\Lambda}^2) \right]; \\
\Var[\V(\m{S})] =& \frac{4n}{p^4n_*^4}\left[ 2(p-n)^2 \tr(\m{\Lambda}^2)(\tr\m{\Lambda})^2 + (p^2n + p^2 - 4p + 2n)\big[\tr(\m{\Lambda}^2)\big]^2 \right. \\
& \quad \quad + 4(p-n)(pn+p-2) \tr(\m{\Lambda}^3)\tr\m{\Lambda} \\
& \quad \quad \left. + (2p^2n^2 + 5p^2n + 5p^2 - 12pn - 12p + 12)\tr(\m{\Lambda}^4) \right].
\end{align}
$$
Under the null hypothesis of sphericity ($\m{\Sigma} = \sigma^2\m{I}_p$), these reduce to:
$$
\begin{align}
\E_\mathrm{null}[\V(\m{S})] =& \frac{n}{pn_*^2}(p - 1)(p + 2)\sigma^4; \\
\Var_\mathrm{null}[\V(\m{S})] =& \frac{4n}{p^3n_*^4}(p - 1)(p + 2)(2p^2 + pn + 3p - 6) \sigma^8.
\end{align}
$$
These moments are exact (non-asymptotic).
From the definitions above, the first two moments of $\Vr(\m{S})$ are:
$$
\begin{align}
\E[\Vr(\m{S})] =& \frac{1}{p-1}\left(p \E\left[\frac{\sum_{i=1}^p{l_i}^2}{\left(\sum_{i=1}^p{l_i}\right)^2}\right] - 1 \right); \\
\Var[\Vr(\m{S})] =& \left(\frac{p}{p-1}\right)^2 \Var\left[\frac{\sum_{i=1}^p{l_i}^2}{\left(\sum_{i=1}^p{l_i}\right)^2}\right].
\end{align}
$$
It is not straightforward to obtain moments of the ratio $\sum_{i=1}^p{l_i}^2/\left(\sum_{i=1}^p{l_i}\right)^2$, but the following approximation can be reached from the delta method:
$$
\begin{align}
\E\left[\frac{ \sum_{i=1}^p{l_i}^2 }{ \left(\sum_{i=1}^p{l_i}\right)^2 }\right] \approx{}& \frac{ (\tr\m{\Lambda})^2 + (n+1)\tr(\m{\Lambda}^2) }{ n(\tr\m{\Lambda})^2 + 2\tr(\m{\Lambda}^2) } - \frac{8(n-1)(n+2) }{ n\left[ n(\tr\m{\Lambda})^2 + 2\tr(\m{\Lambda}^2)\right]^3} \\
& \times \Big\{ n(\tr\m{\Lambda})^3\tr(\m{\Lambda}^3) - n(\tr\m{\Lambda})^2\big[\tr(\m{\Lambda}^2)\big]^2 - \big[\tr(\m{\Lambda}^2)\big]^3 \\
& \quad {} - 2\tr\m{\Lambda}\tr(\m{\Lambda}^2)\tr(\m{\Lambda}^3) + 3(\tr\m{\Lambda})^2\tr(\m{\Lambda}^4) \Big\}; \\
\Var\left[\frac{\sum_{i=1}^p{l_i}^2}{\left(\sum_{i=1}^p{l_i}\right)^2}\right] \approx{}& \frac{4(n-1)(n+2)}{n\left[ n(\tr\m{\Lambda})^2 + 2\tr(\m{\Lambda}^2)\right]^4} \\
& \times \left\{ n(\tr\m{\Lambda})^4\big[\tr(\m{\Lambda}^2)\big]^2 + 2n(n + 1)(\tr\m{\Lambda})^2\big[\tr(\m{\Lambda}^2)\big]^3 + 2(n + 1)\big[\tr(\m{\Lambda}^2)\big]^4 \right. \\
& \quad - 4(n - 1)(n + 2)(\tr\m{\Lambda})^3\tr(\m{\Lambda}^2)\tr(\m{\Lambda}^3) + (2n^2 + 3n - 6)(\tr\m{\Lambda})^4\tr(\m{\Lambda}^4) \\
& \quad \left. {} - 4n(\tr\m{\Lambda})^2\tr(\m{\Lambda}^2)\tr(\m{\Lambda}^4) - 4(\tr\m{\Lambda})^4\big[\tr(\m{\Lambda}^2)\big]^2 \right\}.
\end{align}
$$
Inserting these moments into the above equation yields approximate moments of $\Vr(\m{S})$.
Nevertheless, under the null hypothesis of sphericity, it is possible to obtain the exact moments of the ratio $\sum_{i=1}^p{l_i}^2/\left(\sum_{i=1}^p{l_i}\right)^2$, from which the following null moments can be obtained:
$$
\begin{align}
\E_\mathrm{null}[\Vr(\m{S})] =& \frac{p + 2}{pn + 2}; \\
\Var_\mathrm{null}[\Vr(\m{S})] =& \frac{4p^2(p + 2)(n - 1)(n + 2)}{(p - 1)(pn + 2)^2(pn + 4)(pn + 6)}.
\end{align}
$$
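These null expressions can be checked against the package functions (a quick sketch with $p = 4$ and $n = 19$; recall that exact moments are returned under the null):

```r
## Null moments of Vrel(S) under sphericity
p <- 4
n <- 19
Exv.VRS(diag(p), n = n)
(p + 2) / (p * n + 2)  ## Same
Var.VRS(diag(p), n = n)
4 * p^2 * (p + 2) * (n - 1) * (n + 2) /
  ((p - 1) * (p * n + 2)^2 * (p * n + 4) * (p * n + 6))  ## Same
```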
The relevant moments can conveniently be obtained from the form of the average squared correlation. The basic forms are:
$$
\begin{align}
\E[\Vr(\m{R})] =& \frac{2}{p(p-1)}\sum_{i<j}^{p}{\E(r_{ij}^2)}; \\
\Var[\Vr(\m{R})] =& \frac{4}{p^2(p-1)^2} \left[ \sum_{i<j}^{p}{\Var(r_{ij}^2)} + \sum_{\substack{i<j,\ k<l \\ (i,j)<(k,l)}}^{p}{2\Cov(r_{ij}^2, r_{kl}^2)} \right],
\end{align}
$$
where the last summation runs over all non-redundant pairs of squared correlation coefficients.
The expectation is straightforward to obtain from the following:
$$
\E(r_{ij}^2) = 1 - \frac{(n-1)(1-\rho_{ij}^2)}{n} \F\left(1, 1; \frac{n+2}{2}; \rho_{ij}^2 \right), \quad i \neq j,
$$
where $\F$ is the hypergeometric function:
$$
\F(a, b; c; z) = \sum_{k=0}^{\infty}{\frac{(a)_k(b)_k}{(c)_k}\frac{z^k}{k!}}, \\
(x)_k = x (x + 1) \dotsm (x + k - 1) = \frac{\Gamma(x + k)}{\Gamma(x)}.
$$
We also have
$$
\begin{align}
\Var(r_{ij}^2) =& \frac{(n - 1)(n + 1)(1 - \rho_{ij}^2)}{n} \Bigg[\F\left(1, 1; \frac{n+2}{2}; \rho_{ij}^2 \right) - \frac{n}{n+2} \F\left(1, 2; \frac{n+4}{2}; \rho_{ij}^2 \right) \\
& \qquad {} - \frac{2(n - 1)(1 - \rho_{ij}^2)}{n(n+1)} \F\left(1, 1; \frac{n+2}{2}; \rho_{ij}^2 \right)^2 \Bigg].
\end{align}
$$
It is seen that $\E(r_{ij}^2) = 1/n$ when $\rho_{ij} = 0$, and hence $$ \E_\mathrm{null}[\Vr(\m{R})] = \frac{1}{n}. $$
The variance is more difficult to obtain. Specifically, under the null hypothesis of no correlation ($\m{P} = \m{I}_p$), it can be shown that $\Cov(r_{ij}^2, r_{kl}^2) = 0$, leading to the following exact variance:
$$
\Var_\mathrm{null}[\Vr(\m{R})] = \frac{4(n - 1)}{p(p - 1)n^2(n + 2)}.
$$
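Both null moments are again reproduced by the package functions:

```r
## Null moments of Vrel(R) with no correlation (P = I_4, n = 19)
p <- 4
n <- 19
Exv.VRR(diag(p), n = n)
1 / n  ## Same
Var.VRR(diag(p), n = n)
4 * (n - 1) / (p * (p - 1) * n^2 * (n + 2))  ## Same
```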
In general, however, $\Cov(r_{ij}^2, r_{kl}^2)$ can be nonzero, so this term cannot be ignored unless $p = 2$. To my knowledge, there is no exact expression for this quantity in the literature. @Watanabe2021 used the following approximation based on the approach of @PanFrank2004:
$$
\Cov(r_{ij}^2, r_{kl}^2) \approx 4\E(r_{ij})\E(r_{kl})\Cov(r_{ij}, r_{kl}) + 2\left[\Cov(r_{ij}, r_{kl})\right]^2,
$$
where the following exact and asymptotic moments are inserted:
$$
\begin{align}
\E(r_{ij}) =& \frac{2}{n} \left[ \Gamma\left(\frac{n+1}{2}\right) \Big/ \Gamma\left(\frac{n}{2}\right) \right]^2 \rho_{ij} \F\left(\frac{1}{2}, \frac{1}{2}; \frac{n+2}{2}; \rho_{ij}^2\right); \\
\Cov(r_{ij}, r_{kl}) \approx& \frac{1}{n} \left[ \frac{1}{2} \rho_{ij}\rho_{kl}(\rho_{ik}^2 + \rho_{il}^2 + \rho_{jk}^2 + \rho_{jl}^2) + \rho_{ik}\rho_{jl} + \rho_{il}\rho_{jk} \right. \\
& \quad \left. \phantom{\frac{1}{2}} - \rho_{ij}\rho_{ik}\rho_{il} - \rho_{ij}\rho_{jk}\rho_{jl} - \rho_{ik}\rho_{jk}\rho_{kl} - \rho_{il}\rho_{jl}\rho_{kl} \right].
\end{align}
$$
(Eq. 38 in @Watanabe2021 erroneously dropped $\rho_{ij}$ in the first equation.)
However, this approach requires evaluation of $\sim p^4/4$ pairs of correlation coefficients, and can take nontrivial computational time. For this reason, the following asymptotic variance from the theory of @Konishi1979 might be useful when $p$ is large:
$$
\Var[\Vr(\m{R})] \approx \frac{8}{p^2(p-1)^2n}\sum_{\alpha, \beta = 1}^p{ \lambda_\alpha^2 \lambda_\beta^2 \left[ \delta_{\alpha\beta} - (\lambda_\alpha + \lambda_\beta) \sum_{i=1}^p{\upsilon_{i\alpha}^2 \upsilon_{i\beta}^2} + \sum_{i,j=1}^p{\rho_{ij}^2 \upsilon_{i\alpha}^2 \upsilon_{i\beta}^2} \right]}.
$$
From the above results, it is possible to obtain an unbiased estimator of $\V(\m{\Sigma})$ as follows:
$$
\tilde{V}(\m{S}) = \frac{1}{p^2n(n - 1)(n + 2)}\left[ (pn + 2) \tr(\m{A}^2) - (p + n + 1)(\tr\m{A})^2 \right].
$$
Its variance is:
$$
\begin{align}
\Var[\tilde{V}(\m{S})] =& \frac{4}{p^4n(n - 1)(n + 2)} \left\{ 2(n - 1)(n + 2) \tr(\m{\Lambda}^2)(\tr\m{\Lambda})^2 \right. \\
& \qquad {} + (p^2n + 4p + 2n + 2) \big[\tr(\m{\Lambda}^2)\big]^2 - 4p(n - 1)(n + 2) \tr(\m{\Lambda}^3)\tr\m{\Lambda} \\
& \qquad \left. {} + (2p^2n^2 + 3p^2n - 6p^2 - 4pn - 4) \tr(\m{\Lambda}^4) \right\},
\end{align}
$$
which reduces to $4(p - 1)(p + 2)(pn + 2)\sigma^8 / \left[ p^3n(n - 1)(n + 2) \right]$ under the null hypothesis.
The adjusted versions of $\Vr(\m{S})$ and $\Vr(\m{R})$ discussed here are:
$$
\begin{align}
\bar{V}_\mathrm{rel}(\m{S}) =& \frac{pn + 2}{p(n - 1)} \Vr(\m{S}) - \frac{p + 2}{p(n - 1)}; \\
\bar{V}_\mathrm{rel}(\m{R}) =& \frac{n}{n - 1} \Vr(\m{R}) - \frac{1}{n - 1}.
\end{align}
$$
Moments of these adjusted estimators can be obtained from those of the ordinary ones described above.
$\bar{V}_\mathrm{rel}(\m{S})$ and $\bar{V}_\mathrm{rel}(\m{R})$ are designed to be unbiased under the null hypothesis and rescaled to take a maximum of $1$. Otherwise, these estimators tend to underestimate the population value.
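Accordingly, their expectations vanish under the respective null hypotheses, which can be confirmed with the moment functions described above:

```r
## Null expectations of the adjusted indices are (numerically) zero
Exv.VRSa(diag(4), n = 19)
Exv.VRRa(diag(4), n = 19)
```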