AVar.VRR_xx | R Documentation |
Functions to obtain approximate variance of relative eigenvalue variance
of correlation matrix
\mathrm{Var}[V_{\mathrm{rel}}(\mathbf{R})]
. There are
several versions for each of two different expressions:
pf*
and k*
families.
AVar.VRR_pf()
: asymptotic and approximate variance of
V_{\mathrm{rel}}(\mathbf{R})
based on Pan and Frank's (2004) approach. Prototype version.
AVar.VRR_pfv()
: vectorized version of AVar.VRR_pf()
; much
faster, but requires a large RAM space as p
grows
AVar.VRR_pfd()
: further improvement over AVar.VRR_pfv()
AVar.VRR_pfc()
: fast version using Rcpp; requires
the extension package eigvaldispRcpp
AVar.VRR_kl()
: asymptotic variance from Konishi's theory:
V_{\mathrm{rel}}(\mathbf{R})
as function of eigenvalues
AVar.VRR_klv()
: vectorized version of AVar.VRR_kl()
AVar.VRR_kr()
: asymptotic variance from Konishi's theory:
V_{\mathrm{rel}}(\mathbf{R})
as function of correlation
coefficients
AVar.VRR_krv()
: vectorized version of AVar.VRR_kr()
AVar.VRR_pf(
Rho,
n = 100,
Lambda,
exv1.mode = c("exact", "asymptotic"),
var1.mode = "asymptotic",
var2.mode = c("exact", "asymptotic"),
order.exv1 = 2,
order.var2 = 2,
mode = c("for.ind", "nested.for", "lapply"),
...
)
AVar.VRR_pfv(
Rho,
n = 100,
Lambda,
exv1.mode = c("exact", "asymptotic"),
var1.mode = "asymptotic",
var2.mode = c("exact", "asymptotic"),
order.exv1 = 2,
order.var2 = 2,
...
)
AVar.VRR_pfd(
Rho,
n = 100,
Lambda,
exv1.mode = c("exact", "asymptotic"),
var1.mode = "asymptotic",
var2.mode = c("exact", "asymptotic"),
order.exv1 = 2,
order.var2 = 2,
mode = c("for.ind", "lapply", "mclapply", "parLapply"),
mc.cores = "auto",
max.cores = parallel::detectCores(),
do.mcaffinity = TRUE,
affinity_mc = seq_len(max.cores),
cl = NULL,
max.size = 2e+06,
verbose = c("no", "yes", "inline"),
...
)
AVar.VRR_pfc(
Rho,
n = 100,
Lambda,
cppfun = "Cov_r2C",
nthreads = 0L,
exv1.mode = c("exact", "asymptotic"),
var2.mode = c("exact", "asymptotic"),
order.exv1 = 2,
order.var2 = 2,
...
)
AVar.VRR_kl(Rho, n = 100, Lambda, ...)
AVar.VRR_klv(Rho, n = 100, Lambda, ...)
AVar.VRR_kr(Rho, n = 100, Lambda, ...)
AVar.VRR_krv(Rho, n = 100, Lambda, ...)
Rho |
Population correlation matrix; assumed to be validly constructed (although simple checks are done). |
n |
Degrees of freedom (not sample sizes); numeric of length 1 or more. |
Lambda |
Numeric vector of population eigenvalues. |
exv1.mode |
Whether |
var1.mode |
Whether |
var2.mode |
Whether |
order.exv1, order.var2 |
Used to specify the order of evaluation for asymptotic expressions of
|
mode |
In |
... |
In the |
mc.cores |
Number of cores to be used (numeric/integer). When |
max.cores |
Maximum number of cores to be used. (Used only when
|
do.mcaffinity |
Whether to run |
affinity_mc |
Argument of |
cl |
A cluster object (made by |
max.size |
Maximum size of vectors created internally (see Details). |
verbose |
When |
cppfun |
Option to specify the C++ function to be used (see Details). |
nthreads |
Integer to specify the number of threads used in OpenMP
parallelization in |
Watanabe (2022) presented two approaches to evaluate approximate variance
of the relative eigenvalue variance of a correlation matrix
V_{\mathrm{rel}}(\mathbf{R})
. One is Pan and Frank's (2004)
heuristic approximation (eqs. 28 and 36–38 in
Watanabe 2022). The other is based on Konishi's (1979) asymptotic
theory (eq. 39 in Watanabe 2022). Simulations showed that the former tends
to be more accurate, but the latter is much faster. This is mainly because
the Pan–Frank approach involves evaluation of covariances in ~p^4 / 4
pairs of (squared) correlation coefficients. (That said, the speed
will not be a practical concern unless p
exceeds a few hundreds.)
The Pan–Frank approach is at present implemented in several functions which yield (almost) identical results:
AVar.VRR_pf()
Prototype version. Simplest implementation.
AVar.VRR_pfv()
Vectorized version. Much faster, but
requires a large RAM space as p
grows.
AVar.VRR_pfd()
Improvement over
AVar.VRR_pfv()
. Faster and more RAM-efficient. This is
the default to be called in Var.VRR(..., method = "Pan-Frank")
,
unless the extension package eigvaldispRcpp is installed.
AVar.VRR_pfc()
Fast version using Rcpp. Requires the extension package eigvaldispRcpp; this is the default when this package is installed (and detected).
AVar.VRR_pfc()
implements the same algorithm as the others,
but makes use of C++ API via the package Rcpp for
evaluation of the sum of covariance across pairs of squared correlation
coefficients. This version works much faster than vectorized R
implementations. Note that the output can slightly differ from those of
pure R implementations (by the order of ~1e-9
).
The Konishi approach is implemented in several functions:
AVar.VRR_kl()
From Konishi (1979: corollary 2.2):
V_{\mathrm{rel}}(\mathbf{R})
as function of
eigenvalues. Prototype version.
AVar.VRR_klv()
Vectorized version of
AVar.VRR_kl()
. This is the default when
Var.VRR(..., method = "Konishi")
.
AVar.VRR_kr()
From Konishi (1979: theorem 6.2):
V_{\mathrm{rel}}(\mathbf{R})
as function of
correlation coefficients.
AVar.VRR_krv()
Vectorized version of AVar.VRR_kr()
;
slightly faster for moderate p
, but not particularly fast
for large p
as the number of elements to be summed becomes large.
Empirically, these all yield the same result, but
AVar.VRR_klv()
is by far the fastest.
The AVar.VRR_pfx()
family functions by default return exact variance
when p = 2
,
If asymptotic result is desired, use mode.var2 = "asymptotic"
.
Options for mode
in AVar.VRR_pf()
and AVar.VRR_pfd()
:
"nested.for"
Only for AVar.VRR_pf()
. Uses nested
for loops, which is straifhgforward and RAM efficient but slow.
"for.ind"
(default)/"lapply"
Run the iteration along
an index vector to shorten computational time,
with for
loop and lapply()
, respectively.
"mclapply"
/"parLapply"
Only for
AVar.VRR_pfd()
. Parallelize the same iteration by
forking and socketing, respectively, with the named functions in the
package parallel. Note that the former doesn't work in the
Windows environment. See vignette("parallel")
for details.
AVar.VRR_pfv()
internally generates vectors and matrices
whose lengths are about p^4 / 8
and p^4 / 4
. These take about
2p^4
bytes of RAM; this can be prohibitively large for large p
.
AVar.VRR_pfd()
divides index vectors used internally in
AVar.VRR_pfv()
into lists, along whose elements calculations are done
to save RAM space. The argument max.size
controls the maximum size
of resulting vectors; at least max.size * (2 * length(n) + 6) * 8
bytes of RAM is required for storing temporary results (and more
during computation); e.g., ~2e7
seems good for 16 GB RAM,
~4e8
for 256 GB. However, performance does not seem to improve
past 1e6
–1e7
presumably because memory allocation takes
substantial time for large objects. The iteration can be parallelized
with mode = "mclapply"
or "parLapply"
,
but be careful about RAM limitations.
AVar.VRR_pfc()
provides a faster implementation with one of the
C++ functions defined in the extension package
eigvaldispRcpp (which is required to run this function). The
C++ function is specified by the argument cppfun
:
"Cov_r2C"
(default)Serial evaluation with base Rcpp functionalities.
"Cov_r2A"
or "Armadillo"
Using RcppArmadillo. Parallelized with OpenMP when the environment allows.
"Cov_r2E"
or "Eigen"
Using RcppEigen. Parallelized with OpenMP when the environment allows.
"Cov_r2P"
or "Parallel"
Using RcppParallel. Parallelized with IntelTBB when the environment allows.
The default option would be sufficiently fast for up to p = 100
or
so. The latter three options aim at speeding-up the calculation via
parallelization with other Rcpp-related packages. Although these
would have similar performance in most environments, "Cov_r2E"
seems
the fastest in the development environment, closely followed by
"Cov_r2A"
.
A numeric vector of \mathrm{Var}[V_{\mathrm{rel}}(\mathbf{R})]
, corresponding to n
.
Konishi, S. (1979) Asymptotic expansions for the distributions of statistics based on the sample correlation matrix in principal componenet analysis. Hiroshima Mathematical Journal 9, 647–700. \Sexpr[results=rd]{tools:::Rd_expr_doi("10.32917/hmj/1206134750")}.
Pan, W. and Frank, K. A. (2004) An approximation to the distribution of the product of two dependent correlation coefficients. Journal of Statistical Computation and Simulation 74, 419–443. \Sexpr[results=rd]{tools:::Rd_expr_doi("10.1080/00949650310001596822")}.
Watanabe, J. (2022) Statistics of eigenvalue dispersion indices: quantifying the magnitude of phenotypic integration. Evolution, 76, 4–28. \Sexpr[results=rd]{tools:::Rd_expr_doi("10.1111/evo.14382")}.
Exv.VXX
for main moment functions
# See also examples of Exv.VXX
# Correlation matrix
N <- 20
Lambda <- c(4, 2, 1, 1)
(Rho <- GenCov(evalues = Lambda / sum(Lambda) * 4, evectors = "Givens"))
# Different choices for asymptotic variance of Vrel(R)
# Variance from Pan-Frank method
Var.VRR(Rho, n = N - 1) # By default, method = "Pan-Frank" and AVar.VRR_pfd() is called
eigvaldisp:::AVar.VRR_pfd(Rho, n = N - 1) # Same as above
Var.VRR(Rho, n = N - 1, fun = "pf") # Calls AVar.VRR_pf(), which is slow for large p
Var.VRR(Rho, n = N - 1, fun = "pfv") # Calls AVar.VRR_pfv(), which requires much RAM for large p
# Various implementations with Rcpp (require eigvaldispRcpp):
## Not run: Var.VRR(Rho, n = N - 1, fun = "pfc") # By default, cppfun = "Cov_r2C" is used
## Not run: Var.VRR(Rho, n = N - 1, cppfun = "Cov_r2A")
## Not run: Var.VRR(Rho, n = N - 1, cppfun = "Cov_r2E")
## Not run: Var.VRR(Rho, n = N - 1, cppfun = "Cov_r2P")
# When the argument cppfun is provided, fun need not be specified
# The above results are identical
# Variance from Konishi's theory
Var.VRR(Rho, n = N - 1, method = "Konishi") # By default for this method, AVar.VRR_klv() is called
eigvaldisp:::AVar.VRR_klv(Rho, n = N - 1) # Same as above
Var.VRR(Rho, n = N - 1, fun = "kl")
Var.VRR(Rho, n = N - 1, fun = "kr")
Var.VRR(Rho, n = N - 1, fun = "krv")
# The results are identical, but the last three are slower
# On the other hand, these differ from that obtained with the Pan-Frank method
# Example with p = 2
Rho2 <- GenCov(evalues = c(1.5, 0.5), evectors = "Givens")
Var.VRR(Rho2, n = N - 1) # When p = 2, this does not call AVar.VRR_pfd()
eigvaldisp:::AVar.VRR_pfd(Rho2, n = N - 1)
# By default, the above returns the same, exact result
eigvaldisp:::AVar.VRR_pfd(Rho2, n = N - 1, var2.mode = "asymptotic")
eigvaldisp:::AVar.VRR_klv(Rho2, n = N - 1)
# These return different asymptotic expressions
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.