# sLED: The sparse leading eigenvalue driven (sLED) test

From the package lingxuez/sLED: A Sparse Leading Eigenvalue Driven (sLED) Test for High-dimensional Matrices

## Description

The sLED test for two-sample high-dimensional covariance and relationship matrices. Suppose X and Y are p-dimensional random vectors drawn independently from two populations. Let D be the differential matrix given by

D = A(Y) - A(X)

sLED tests the following hypothesis:

H_0: D=0 versus H_1: D != 0

where A() represents some p-by-p relationship matrix among features, including covariance matrices, correlation matrices, or the weighted adjacency matrices defined as

A_{ij} = |corr(i, j)|^b

for some constant b > 0 and 1 <= i, j <= p. By convention, A is the ordinary correlation matrix when b=0, and the covariance matrix when b<0.
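The three choices of A() can be sketched in base R as follows. This is only an illustration of the convention described above: `relationship_matrix` is a hypothetical helper written for this example, not a function exported by sLED, and the data are simulated.

```r
# Hypothetical helper (not part of sLED): build the relationship matrix A()
# following the convention b < 0 (covariance), b = 0 (correlation),
# b > 0 (weighted adjacency).
relationship_matrix <- function(X, adj.beta = -1) {
  if (adj.beta < 0) {
    cov(X)                      # covariance matrix
  } else if (adj.beta == 0) {
    cor(X)                      # Pearson correlation matrix
  } else {
    abs(cor(X))^adj.beta        # weighted adjacency matrix |corr(i, j)|^b
  }
}

# Simulated data: rows are samples, columns are features
set.seed(1)
n1 <- 50; n2 <- 60; p <- 5
X <- matrix(rnorm(n1 * p), nrow = n1, ncol = p)
Y <- matrix(rnorm(n2 * p), nrow = n2, ncol = p)

# Differential matrix D = A(Y) - A(X) for each choice of A()
D_cov <- relationship_matrix(Y, -1) - relationship_matrix(X, -1)
D_cor <- relationship_matrix(Y, 0)  - relationship_matrix(X, 0)
D_adj <- relationship_matrix(Y, 2)  - relationship_matrix(X, 2)
```

Note that for the correlation and adjacency choices the diagonal of D is zero, because both A(X) and A(Y) have ones on the diagonal.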

## Usage

    sLED(X, Y, adj.beta = -1, rho = 1000, sumabs.seq = 0.2, npermute = 100,
         useMC = FALSE, mc.cores = 1, seeds = NULL, verbose = TRUE,
         niter = 20, trace = FALSE)

## Arguments

- `X`: n1-by-p matrix of samples from the first population. Rows are samples/observations, and columns are features.
- `Y`: n2-by-p matrix of samples from the second population. Rows are samples/observations, and columns are features.
- `adj.beta`: the power used to transform correlation matrices into weighted adjacency matrices by A_{ij} = |r_ij|^adj.beta, where r_ij is the Pearson correlation. A positive value gives a weighted adjacency matrix. When adj.beta=0, the correlation matrix is used. When adj.beta<0, the covariance matrix is used. The default value is adj.beta=-1.
- `rho`: a large positive constant such that A(X)-A(Y)+diag(rep(rho, p)) is positive definite.
- `sumabs.seq`: a numeric vector specifying the sequence of sparsity parameters to use, each between 1/sqrt(p) and 1.
- `npermute`: the number of permutations to use. The default is 100.
- `useMC`: logical, whether to use the multi-core version.
- `mc.cores`: the number of cores to use in parallelization.
- `seeds`: a numeric vector of length npermute, where seeds[i] specifies the seed for the i-th permutation. Set to NULL to leave the seeds unspecified.
- `verbose`: whether to print progress during the permutation tests.
- `niter`: the number of iterations to use in the PMD algorithm (see symmPMD()).
- `trace`: logical, whether to trace the progress of the PMD algorithm (see symmPMD()).
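To make the roles of `npermute` and `seeds` concrete, here is a minimal base R sketch of a two-sample permutation test. It is not the sLED implementation: the statistic is a non-sparse stand-in (the larger of the leading eigenvalues of D and -D), the p-value convention (fraction of permuted statistics at least as large as the observed one) is an assumption, and `leading_eig_stat` and `permutation_pvalue` are hypothetical names.

```r
# Stand-in statistic (NOT the sparse statistic used by sLED): the larger of the
# leading eigenvalues of D and -D, where D = cov(Y) - cov(X).
leading_eig_stat <- function(X, Y) {
  D <- cov(Y) - cov(X)
  ev <- eigen(D, symmetric = TRUE, only.values = TRUE)$values  # decreasing order
  max(ev[1], -ev[length(ev)])
}

# Hypothetical permutation test: shuffle group labels npermute times.
permutation_pvalue <- function(X, Y, npermute = 100, seeds = NULL) {
  Z <- rbind(X, Y)
  n1 <- nrow(X)
  Tn <- leading_eig_stat(X, Y)
  Tn.perm <- numeric(npermute)
  for (i in seq_len(npermute)) {
    if (!is.null(seeds)) set.seed(seeds[i])   # seeds[i] fixes the i-th permutation
    idx <- sample(nrow(Z), n1)                # rows assigned to the first group
    Tn.perm[i] <- leading_eig_stat(Z[idx, , drop = FALSE],
                                   Z[-idx, , drop = FALSE])
  }
  # Assumed p-value convention: share of permuted statistics >= observed one
  list(Tn = Tn, Tn.perm = Tn.perm, pVal = mean(Tn.perm >= Tn))
}

set.seed(2)
X <- matrix(rnorm(40 * 6), nrow = 40)
Y <- matrix(rnorm(40 * 6), nrow = 40)

# Supplying the same seeds reproduces the same permutations
res1 <- permutation_pvalue(X, Y, npermute = 50, seeds = 1:50)
res2 <- permutation_pvalue(X, Y, npermute = 50, seeds = 1:50)
```

Fixing one seed per permutation, rather than a single global seed, keeps each permutation reproducible even when the permutations are distributed across cores.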

## Details

For large data sets, the multi-core version is recommended: useMC=TRUE and mc.cores=n, where n is the number of cores to use.
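A call on simulated data might look like the sketch below. It uses only the arguments listed under Usage. The simulated data, the choice of p = 50 (so that the default sumabs.seq = 0.2 lies above 1/sqrt(p)), and the `requireNamespace` guard are assumptions made for this example; the call is skipped if the sLED package is not installed.

```r
# Simulated data: rows are samples, columns are features
set.seed(3)
n1 <- 30; n2 <- 30; p <- 50
X <- matrix(rnorm(n1 * p), nrow = n1)
Y <- matrix(rnorm(n2 * p), nrow = n2)
Y[, 1:3] <- Y[, 1:3] * 2   # inflate the variance of the first three features

if (requireNamespace("sLED", quietly = TRUE)) {
  # Single-core run with the default covariance setting (adj.beta = -1)
  res <- sLED::sLED(X = X, Y = Y, npermute = 50,
                    useMC = FALSE, verbose = FALSE)
  # For large data sets, pass useMC = TRUE and mc.cores = <number of cores>
  print(res$pVal)
}
```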

## Value

A list containing the following components:

- `Tn`: the test statistic.
- `Tn.perm`: the test statistics for the permuted samples.
- `Tn.perm.sign`: the signs for the permuted samples: "pos" if the permuted test statistic is given by sEig(D), and "neg" if it is given by sEig(-D), where sEig denotes the sparse leading eigenvalue.
- `pVal`: the p-value of the sLED test.
- `sumabs.seq`: a numeric vector for the sequence of sparsity parameters. The default is 0.2. The numbers must be between 1/sqrt(p) and 1.
- `rho`: a positive constant used to augment the diagonal of the differential matrix D such that D + rho*I becomes positive definite.
- `stats`: a numeric vector of test statistics obtained with the different sparsity parameters (corresponding to sumabs.seq).
- `sign`: a vector of signs obtained with the different sparsity parameters (corresponding to sumabs.seq). The sign is "pos" if the test statistic is given by sEig(D), and "neg" if it is given by sEig(-D), where sEig denotes the sparse leading eigenvalue.
- `v`: the sequence of sparse leading eigenvectors. Each row corresponds to one sparsity parameter given by sumabs.seq.
- `leverage`: the leverage of genes (defined as v^2 element-wise) under the different sparsity parameters. Each row corresponds to one sparsity parameter given by sumabs.seq.
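The relationship between `v` and `leverage` can be illustrated with a small mock example. The matrix `v` below is made up for illustration (it is not output from sLED), and the feature names are hypothetical; the sketch only shows how the element-wise definition leverage = v^2 can be used to pick out features with non-zero leverage.

```r
# Hypothetical sparse eigenvectors: one row per sparsity parameter,
# one column per feature (values made up for illustration)
v <- rbind(c(0.8, 0.6, 0.0, 0.0, 0.0),
           c(0.7, 0.5, 0.5, 0.1, 0.0))
colnames(v) <- paste0("gene", 1:5)

# Leverage is the element-wise square of v
leverage <- v^2

# Features with non-zero leverage under the first sparsity parameter
selected <- names(which(leverage[1, ] > 0))
```

In this mock example each row of `v` has unit Euclidean norm, so each row of `leverage` sums to one and can be read as the share each feature contributes.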

## References

Zhu, Lei, Devlin and Roeder (2016), "Testing High Dimensional Covariance Matrices, with Application to Detecting Schizophrenia Risk Genes", arXiv:1606.00252.