meff: Estimate the Effective Number of Tests

View source: R/meff.r

meff R Documentation

Description

Estimate the effective number of tests.

Usage

meff(R, eigen, method, ...)

Arguments

R

a $k \times k$ symmetric matrix that reflects the correlation structure among the tests.

eigen

optional vector to directly supply the eigenvalues to the function (instead of computing them from the matrix given via R).

method

character string to specify the method to be used to estimate the effective number of tests (either "nyholt", "liji", "gao", or "galwey"). See ‘Details’.

...

other arguments.

Details

The function estimates the effective number of tests based on one of four different methods. All methods work by extracting the eigenvalues from the $R$ matrix supplied via the R argument (or by using the eigenvalues passed directly via the eigen argument). Letting $\lambda_i$ denote the $i$th eigenvalue of this matrix (with $i = 1, \ldots, k$) in decreasing order, the effective number of tests ($m$) is estimated as follows.
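The eigenvalue step described above can be sketched in base R as follows. This is an illustration only, not the package's internal code; the 4 x 4 matrix with all off-diagonal correlations equal to 0.5 is a hypothetical example.

```r
# Hypothetical 4 x 4 correlation matrix with all off-diagonal entries 0.5
k <- 4
R <- matrix(0.5, nrow = k, ncol = k)
diag(R) <- 1

# eigen() returns the eigenvalues in decreasing order; symmetric = TRUE
# tells it to treat R as symmetric (only the lower triangle is used)
lambda <- eigen(R, symmetric = TRUE, only.values = TRUE)$values
lambda

# for a correlation matrix the eigenvalues sum to k (the trace of R)
sum(lambda)
```

For this matrix the eigenvalues are 2.5, 0.5, 0.5, and 0.5, which are reused as a small running example in the sketches for the individual methods.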

Method by Nyholt (2004)

$$m = 1 + (k - 1) \left(1 - \frac{\mbox{Var}(\lambda)}{k}\right)$$

where $\mbox{Var}(\lambda)$ is the observed sample variance of the $k$ eigenvalues.
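A minimal base R sketch of the Nyholt formula, applied to a hypothetical vector of eigenvalues. The helper name nyholt_sketch is made up for this illustration and is not part of the package.

```r
# Sketch of the Nyholt (2004) formula; nyholt_sketch is a hypothetical helper
nyholt_sketch <- function(lambda) {
  k <- length(lambda)
  # var() is the sample variance (denominator k - 1)
  1 + (k - 1) * (1 - var(lambda) / k)
}

# eigenvalues of a 4 x 4 correlation matrix with all correlations equal to 0.5
lambda <- c(2.5, 0.5, 0.5, 0.5)

m_raw <- nyholt_sketch(lambda)
m_raw         # 3.25
floor(m_raw)  # 3, after rounding down

# with uncorrelated tests all eigenvalues equal 1, so the estimate is k
nyholt_sketch(rep(1, 4))  # 4
```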

Method by Li & Ji (2005)

$$m = \sum_{i = 1}^k f(|\lambda_i|)$$

where $f(x) = I(x \ge 1) + (x - \lfloor x \rfloor)$ and $\lfloor \cdot \rfloor$ is the floor function.
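The Li and Ji formula can be sketched in base R as shown below. The helper name liji_sketch and the eigenvalues are hypothetical; the code follows the formula as stated here rather than the package's internal code.

```r
# Sketch of the Li & Ji (2005) formula; liji_sketch is a hypothetical helper
liji_sketch <- function(lambda) {
  x <- abs(lambda)
  # f(x) = I(x >= 1) + (x - floor(x)): each eigenvalue contributes an
  # indicator for being at least 1 plus its fractional part
  sum(as.numeric(x >= 1) + (x - floor(x)))
}

# eigenvalues of a 4 x 4 correlation matrix with all correlations equal to 0.5
lambda <- c(2.5, 0.5, 0.5, 0.5)

m_raw <- liji_sketch(lambda)
m_raw  # (1 + 0.5) + 0.5 + 0.5 + 0.5 = 3

# with uncorrelated tests all eigenvalues equal 1, so the estimate is k
liji_sketch(rep(1, 4))  # 4
```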

Method by Gao et al. (2008)

$$m = \min(x) \; \mbox{such that} \; \frac{\sum_{i = 1}^x \lambda_i}{\sum_{i = 1}^k \lambda_i} > C$$

where $C$ is a pre-defined parameter which is set to 0.995 by default.
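The following base R sketch illustrates the Gao et al. rule on hypothetical eigenvalues. The helper name gao_sketch is made up for this illustration, and the code follows the formula as written above, which may differ in detail from the package's internals.

```r
# Sketch of the Gao et al. (2008) rule; gao_sketch is a hypothetical helper
gao_sketch <- function(lambda, C = 0.995) {
  lambda <- sort(lambda, decreasing = TRUE)
  # proportion of the eigenvalue sum accounted for by the x largest eigenvalues
  prop <- cumsum(lambda) / sum(lambda)
  # smallest x whose cumulative proportion exceeds C
  which(prop > C)[1]
}

# eigenvalues of a 4 x 4 correlation matrix with all correlations equal to 0.5
lambda <- c(2.5, 0.5, 0.5, 0.5)

# cumulative proportions are 0.625, 0.75, 0.875, 1
gao_sketch(lambda)           # 4 with the default C = 0.995
gao_sketch(lambda, C = 0.8)  # 3 with a less strict threshold
```

Lowering C makes the rule less strict and so yields a smaller estimate, as the second call shows.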

Method by Galwey (2009)

$$m = \frac{\left(\sum_{i = 1}^k \sqrt{\lambda_i'}\right)^2}{\sum_{i = 1}^k \lambda_i'}$$

where $\lambda_i' = \max[0, \lambda_i]$.
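A base R sketch of the Galwey formula is given below. The helper name galwey_sketch and the eigenvalue vectors are hypothetical and serve only to illustrate the arithmetic.

```r
# Sketch of the Galwey (2009) formula; galwey_sketch is a hypothetical helper
galwey_sketch <- function(lambda) {
  # negative eigenvalues are set to zero before applying the formula
  lambda_pos <- pmax(0, lambda)
  sum(sqrt(lambda_pos))^2 / sum(lambda_pos)
}

# eigenvalues of a 4 x 4 correlation matrix with all correlations equal to 0.5
lambda <- c(2.5, 0.5, 0.5, 0.5)

m_raw <- galwey_sketch(lambda)
m_raw         # approximately 3.427
floor(m_raw)  # 3, after rounding down

# a negative eigenvalue contributes nothing to either sum
m_neg <- galwey_sketch(c(2, 2, -0.5))
m_neg
```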

Note: For all methods that can yield a non-integer estimate (all but the method by Gao et al., 2008), the resulting estimate $m$ is rounded down to the nearest integer.

Specifying the R Matrix

The $R$ matrix should reflect the dependence structure among the tests. There is no general rule for how such a matrix should be constructed, as this depends on the type of test and on the sidedness of these tests. One option is to use the correlations among the related but changing elements across the analyses/tests, or a function thereof, as a proxy for the dependence structure. For example, when conducting $k$ analyses with the same dependent variable and $k$ different independent variables, the correlations among the independent variables could serve as such a proxy. Analogously, if analyses are conducted for $k$ dependent variables with the same set of independent variables, the correlations among the dependent variables could be used instead.
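As an illustration of the first case (one dependent variable, $k$ different independent variables), the sketch below simulates correlated predictors and uses their correlation matrix as the proxy. The sample size, number of predictors, and data-generating model are all hypothetical.

```r
set.seed(1)
n <- 200  # hypothetical number of observations
k <- 5    # hypothetical number of independent variables

# simulate k predictors that share a common component, so they are correlated
common <- rnorm(n)
X <- sapply(seq_len(k), function(j) 0.6 * common + 0.8 * rnorm(n))

# k x k correlation matrix among the predictors, used as a proxy for the
# dependence structure among the k tests
R <- cor(X)
dim(R)
isSymmetric(R)
```

A matrix built this way is symmetric with a unit diagonal, so it has the form expected by the R argument.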

If the tests of interest have test statistics that can be assumed to follow a multivariate normal distribution and a matrix is available that reflects the correlations among the test statistics (which might be approximated by the correlations among the interchanging independent or dependent variables), then the mvnconv function can be used to convert this correlation matrix into the correlations among the (one- or two-sided) $p$-values, which in turn can then be passed to the R argument. See ‘Examples’.

Not Positive Semi-Definite R

Depending on the way $R$ was constructed, it may happen that this matrix is not positive semi-definite, leading to negative eigenvalues. The methods given above can all still be carried out in this case. However, another possibility is to handle such a case by using an algorithm that finds the nearest positive (semi-)definite matrix (e.g., Higham 2002) before passing this matrix to the function (see nearPD from the Matrix package for a corresponding implementation).
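The sketch below shows one way this could look, assuming the Matrix package is installed. The 3 x 3 matrix is a hypothetical example chosen so that it has a unit diagonal but one negative eigenvalue.

```r
# Hypothetical matrix with unit diagonal that is not positive semi-definite
R <- matrix(c(1.0,  0.9,  0.9,
              0.9,  1.0, -0.9,
              0.9, -0.9,  1.0), nrow = 3, byrow = TRUE)

ev_before <- eigen(R, symmetric = TRUE, only.values = TRUE)$values
min(ev_before)  # -0.8, so R is not positive semi-definite

# nearPD() from the Matrix package; corr = TRUE requests a correlation
# matrix (unit diagonal), and the result is stored in the 'mat' component
R_fixed <- as.matrix(Matrix::nearPD(R, corr = TRUE)$mat)

ev_after <- eigen(R_fixed, symmetric = TRUE, only.values = TRUE)$values
min(ev_after)   # no longer negative, up to numerical tolerance
```

The adjusted matrix R_fixed could then be passed to the R argument in place of the original matrix.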

Value

A scalar giving the estimate of the effective number of tests.

Note

For method = "gao", C = 0.995 by default, but a different value of C can be passed to the function via ... (e.g., meff(R, method = "gao", C = 0.95)).

Author(s)

Ozan Cinar ozancinar86@gmail.com
Wolfgang Viechtbauer wvb@wvbauer.com

References

Cinar, O., & Viechtbauer, W. (2022). The poolr package for combining independent and dependent p values. Journal of Statistical Software, 101(1), 1–42. https://doi.org/10.18637/jss.v101.i01

Gao, X., Starmer, J., & Martin, E. R. (2008). A multiple testing correction method for genetic association studies using correlated single nucleotide polymorphisms. Genetic Epidemiology, 32(4), 361–369.

Galwey, N. W. (2009). A new measure of the effective number of tests, a practical tool for comparing families of non-independent significance tests. Genetic Epidemiology, 33(7), 559–568.

Higham, N. J. (2002). Computing the nearest correlation matrix: A problem from finance. IMA Journal of Numerical Analysis, 22(3), 329–343.

Li, J., & Ji, L. (2005). Adjusting multiple testing in multilocus analyses using the eigenvalues of a correlation matrix. Heredity, 95(3), 221–227.

Nyholt, D. R. (2004). A simple correction for multiple testing for single-nucleotide polymorphisms in linkage disequilibrium with each other. American Journal of Human Genetics, 74(4), 765–769.

Examples

# copy LD correlation matrix into r (see help(grid2ip) for details on these data)
r <- grid2ip.ld

# estimate the effective number of tests based on the LD correlation matrix
meff(r, method = "nyholt")
meff(r, method = "liji")
meff(r, method = "gao")
meff(r, method = "galwey")

# use mvnconv() to convert the LD correlation matrix into a matrix with the
# correlations among the (two-sided) p-values assuming that the test
# statistics follow a multivariate normal distribution with correlation
# matrix r (note: 'side = 2' by default in mvnconv())
mvnconv(r, target = "p", cov2cor = TRUE)[1:5,1:5] # show only rows/columns 1-5

# use this matrix instead for estimating the effective number of tests
meff(mvnconv(r, target = "p", cov2cor = TRUE), method = "nyholt")
meff(mvnconv(r, target = "p", cov2cor = TRUE), method = "liji")
meff(mvnconv(r, target = "p", cov2cor = TRUE), method = "gao")
meff(mvnconv(r, target = "p", cov2cor = TRUE), method = "galwey")

ozancinar/poolR documentation built on Oct. 1, 2024, 12:28 a.m.