# apval_Chen2010: Asymptotics-Based p-value of the Test Proposed by Chen and... In highmean: Two-Sample Tests for High-Dimensional Mean Vectors

## Description

Calculates the p-value of the two-sample test for equality of high-dimensional mean vectors proposed by Chen and Qin (2010), based on the asymptotic distribution of the test statistic.

## Usage

    apval_Chen2010(sam1, sam2, eq.cov = TRUE)

## Arguments

| Argument | Description |
| --- | --- |
| sam1 | an n1 by p matrix from sample population 1. Each row represents a p-dimensional sample. |
| sam2 | an n2 by p matrix from sample population 2. Each row represents a p-dimensional sample. |
| eq.cov | a logical value. The default is TRUE, indicating that the two sample populations have the same covariance; otherwise, the covariances are assumed to be different. |

## Details

Suppose that two groups of p-dimensional independent and identically distributed samples $\{X_{1i}\}_{i=1}^{n_1}$ and $\{X_{2j}\}_{j=1}^{n_2}$ are observed; we consider high-dimensional data with $p \gg n := n_1 + n_2 - 2$. The primary objective is to test $H_{0}: \mu_1 = \mu_2$ versus $H_{A}: \mu_1 \neq \mu_2$. Let $\bar{X}_{k}$ be the sample mean for group $k = 1, 2$.

Chen and Qin (2010) proposed the following test statistic:

$$
T_{CQ} = \frac{\sum_{i \neq j}^{n_1} X_{1i}^T X_{1j}}{n_1 (n_1 - 1)} + \frac{\sum_{i \neq j}^{n_2} X_{2i}^T X_{2j}}{n_2 (n_2 - 1)} - 2 \frac{\sum_{i = 1}^{n_1} \sum_{j = 1}^{n_2} X_{1i}^T X_{2j}}{n_1 n_2},
$$

and its asymptotic distribution is normal under the null hypothesis.
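To make the definition concrete, the following is a minimal sketch that evaluates the statistic directly from the formula above. The helper name `tcq_stat` and the simulated inputs are assumptions made for illustration and are not part of highmean. The sketch covers only the statistic; it does not reproduce the variance estimate or the p-value returned by apval_Chen2010.

```r
# Hypothetical helper (not part of highmean): evaluate T_CQ from its definition.
tcq_stat <- function(sam1, sam2) {
  n1 <- nrow(sam1)
  n2 <- nrow(sam2)

  # Sum over i != j of X_i^T X_j equals ||sum_i X_i||^2 - sum_i ||X_i||^2,
  # which avoids forming the full n by n Gram matrix.
  cross_sum <- function(x) {
    s <- colSums(x)
    sum(s^2) - sum(x^2)
  }

  term1 <- cross_sum(sam1) / (n1 * (n1 - 1))
  term2 <- cross_sum(sam2) / (n2 * (n2 - 1))

  # Sum over all pairs (i, j) of X_1i^T X_2j equals (sum_i X_1i)^T (sum_j X_2j).
  term3 <- sum(colSums(sam1) * colSums(sam2)) / (n1 * n2)

  term1 + term2 - 2 * term3
}

# Small simulated example: rows are samples, columns are dimensions.
set.seed(1)
sam1 <- matrix(rnorm(5 * 20), nrow = 5)
sam2 <- matrix(rnorm(6 * 20, mean = 0.5), nrow = 6)
tcq_stat(sam1, sam2)
```

Because the diagonal terms $X_{ki}^T X_{ki}$ are excluded from the first two sums, the statistic has expectation $\|\mu_1 - \mu_2\|^2$, so large positive values are evidence against the null hypothesis.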

## Value

A list including the following elements:

| Element | Description |
| --- | --- |
| sam.info | the basic information about the two groups of samples, including the sample sizes and dimension. |
| cov.assumption | the equality assumption on the covariances of the two sample populations; this was specified by the argument eq.cov. |
| method | a reminder that the p-value is obtained using the asymptotic distribution of the test statistic. |
| pval | the p-value of the test proposed by Chen and Qin (2010). |
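As a brief illustration of how the returned list might be inspected, the sketch below calls the function on simulated data and looks at the elements listed above. The simulated inputs (independent normal entries, a mean shift of 0.2) are assumptions chosen for illustration; the code runs only if highmean is installed.

```r
# Sketch: inspect the documented elements of the returned list.
# The call is guarded so the snippet is a no-op when highmean is not installed.
if (requireNamespace("highmean", quietly = TRUE)) {
  set.seed(1234)
  n1 <- n2 <- 50
  p <- 200
  sam1 <- matrix(rnorm(n1 * p), nrow = n1)              # population 1, mean 0
  sam2 <- matrix(rnorm(n2 * p, mean = 0.2), nrow = n2)  # population 2, mean 0.2

  res <- highmean::apval_Chen2010(sam1, sam2)

  print(names(res))  # element names of the returned list
  print(res$pval)    # the asymptotics-based p-value
}
```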

## References

Chen SX and Qin YL (2010). "A two-sample test for high-dimensional data with applications to gene-set testing." The Annals of Statistics, 38(2), 808–835.

## See Also

epval_Chen2010
## Examples

    library(MASS)      # provides mvrnorm()
    library(highmean)  # provides apval_Chen2010()
    set.seed(1234)
    n1 <- n2 <- 50
    p <- 200
    mu1 <- rep(0, p)
    mu2 <- mu1
    mu2[1:10] <- 0.2
    true.cov <- 0.4^(abs(outer(1:p, 1:p, "-"))) # AR1 covariance
    sam1 <- mvrnorm(n = n1, mu = mu1, Sigma = true.cov)
    sam2 <- mvrnorm(n = n2, mu = mu2, Sigma = true.cov)
    apval_Chen2010(sam1, sam2)

    # the two sample populations have different covariances
    true.cov1 <- 0.2^(abs(outer(1:p, 1:p, "-")))
    true.cov2 <- 0.6^(abs(outer(1:p, 1:p, "-")))
    sam1 <- mvrnorm(n = n1, mu = mu1, Sigma = true.cov1)
    sam2 <- mvrnorm(n = n2, mu = mu2, Sigma = true.cov2)
    apval_Chen2010(sam1, sam2, eq.cov = FALSE)