
Calculates the p-value of the test for equality of two-sample high-dimensional mean vectors proposed by Chen et al (2014), based on the asymptotic distribution of the test statistic.

```
apval_Chen2014(sam1, sam2, eq.cov = TRUE)
```

`sam1` |
an n1 by p matrix from sample population 1. Each row represents a *p*-dimensional sample. |

`sam2` |
an n2 by p matrix from sample population 2. Each row represents a *p*-dimensional sample. |

`eq.cov` |
a logical value indicating whether the two sample populations are assumed to have equal covariances. The default is `TRUE`. |

Suppose that two groups of *p*-dimensional independent and identically distributed samples *\{X_{1i}\}_{i=1}^{n_1}* and *\{X_{2j}\}_{j=1}^{n_2}* are observed; we consider high-dimensional data with *p \gg n := n_1 + n_2 - 2*. Assume that the covariances of the two sample populations are *Σ_1 = (σ_{1, ij})* and *Σ_2 = (σ_{2, ij})*. The primary objective is to test *H_{0}: μ_1 = μ_2* versus *H_{A}: μ_1 \neq μ_2*. Let *\bar{X}_{k}* be the sample mean for group *k = 1, 2*. For a vector *v*, we denote its *i*th element by *v^{(i)}*.

Chen et al (2014) proposed removing estimated zero components in the mean difference through thresholding; they considered

*T_{CLZ}(s) = ∑_{i = 1}^{p} \left\{ \frac{(\bar{X}_1^{(i)} - \bar{X}_2^{(i)})^2}{σ_{1,ii}/n_1 + σ_{2,ii}/n_2} - 1 \right\} I \left\{ \frac{(\bar{X}_1^{(i)} - \bar{X}_2^{(i)})^2}{σ_{1,ii}/n_1 + σ_{2,ii}/n_2} > λ_{p} (s) \right\},*

where the threshold level is *λ_p(s) := 2 s \log p* and *I(\cdot)* is the indicator function. Since an optimal choice of the threshold is unknown, they proposed trying all possible threshold values, then choosing the most significant one as their final test statistic:
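The thresholded statistic for a single value of *s* can be sketched in a few lines of R. This is only an illustration of the formula above, not the code used by `apval_Chen2014`: the helper name `tclz_s` is hypothetical, and it assumes that the unknown diagonal entries *σ_{1,ii}* and *σ_{2,ii}* are replaced by unpooled sample variances.

```r
# Hypothetical helper (not part of the package): thresholded statistic
# T_CLZ(s) for one value of s.
tclz_s <- function(sam1, sam2, s) {
  n1 <- nrow(sam1)
  n2 <- nrow(sam2)
  p <- ncol(sam1)
  # squared difference of the coordinate-wise sample means
  diff2 <- (colMeans(sam1) - colMeans(sam2))^2
  # assumption: sample variances are plugged in for sigma_{1,ii}, sigma_{2,ii}
  denom <- apply(sam1, 2, var) / n1 + apply(sam2, 2, var) / n2
  y <- diff2 / denom            # standardized squared mean differences
  lambda <- 2 * s * log(p)      # threshold level lambda_p(s)
  # keep only the coordinates whose standardized difference exceeds lambda
  sum((y - 1) * (y > lambda))
}

set.seed(1)
x1 <- matrix(rnorm(50 * 200), nrow = 50)
x2 <- matrix(rnorm(50 * 200), nrow = 50)
x2[, 1:10] <- x2[, 1:10] + 1    # shift the first 10 coordinates
tclz_s(x1, x2, s = 0.5)
```

Once *λ_p(s) ≥ 1*, every retained term is positive, so the statistic can only decrease as *s* grows and fewer coordinates survive the threshold.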

*T_{CLZ} = \max_{s \in (0, 1 - η)} \{ T_{CLZ}(s) - \hat{μ}_{T_{CLZ}(s), 0}\}/\hat{σ}_{T_{CLZ}(s), 0},*

where *\hat{μ}_{T_{CLZ}(s), 0}* and *\hat{σ}_{T_{CLZ}(s), 0}* are estimates of the mean and standard deviation of *T_{CLZ}(s)* under the null hypothesis. They derived its asymptotic null distribution as an extreme value distribution.
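The maximization over *s* can be illustrated with a rough R sketch. It is a simplification under stated assumptions, not the estimator of Chen et al (2014) nor the package's implementation: the function name `max_std_tclz`, the grid over *s*, and the default `eta = 0.05` are hypothetical, and the null mean and variance are computed as if the coordinates were independent and each standardized difference were exactly standard normal. Under that assumption, with *a = \sqrt{λ_p(s)}*, each coordinate contributes null mean *2 a φ(a)* and second moment *2 (a^3 + a) φ(a) + 4 \bar{Φ}(a)*.

```r
# Hypothetical sketch: maximize the standardized thresholded statistic
# over a grid of s in (0, 1 - eta).
# y: standardized squared mean differences, one per coordinate.
max_std_tclz <- function(y, eta = 0.05, grid_size = 200) {
  p <- length(y)
  # interior grid points only, since the interval (0, 1 - eta) is open
  s_grid <- seq(0, 1 - eta, length.out = grid_size + 2)[-c(1, grid_size + 2)]
  std <- sapply(s_grid, function(s) {
    lambda <- 2 * s * log(p)
    a <- sqrt(lambda)
    t_s <- sum((y - 1) * (y > lambda))
    # assumption: independent coordinates, standard normal approximation
    m1 <- 2 * a * dnorm(a)
    m2 <- 2 * (a^3 + a) * dnorm(a) + 4 * pnorm(a, lower.tail = FALSE)
    (t_s - p * m1) / sqrt(p * (m2 - m1^2))
  })
  list(s = s_grid[which.max(std)], stat = max(std))
}

set.seed(1)
y_null <- rnorm(500)^2          # no signal
y_alt <- y_null
y_alt[1:5] <- y_alt[1:5] + 25   # five coordinates carry a strong signal
max_std_tclz(y_null)$stat
max_std_tclz(y_alt)$stat
```

The sketch stops at the maximized statistic; turning it into a p-value requires the extreme value limit derived in the paper, which is not reproduced here.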

A list including the following elements:

`sam.info` |
the basic information about the two groups of samples, including the sample sizes and dimension. |

`cov.assumption` |
the equality assumption on the covariances of the two sample populations; this was specified by the argument `eq.cov`. |

`method` |
this output reminds users that the p-values are obtained using the asymptotic distributions of test statistics. |

`pval` |
the p-value of the test proposed by Chen et al (2014). |

This function does not transform the data with their precision matrix (see Chen et al, 2014). To calculate the p-value of the test statistic with transformation, users can pass transformed samples as `sam1` and `sam2`.
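As a rough illustration of what passing transformed samples could look like, the sketch below multiplies each sample by a precision matrix before the test is applied. It rests on assumptions not stated on this page: the names `omega`, `sam1.tr` and `sam2.tr` are hypothetical, and the precision matrix is obtained by inverting the true covariance of a simulation, which is only possible because the data are simulated. With real data an estimate of the precision matrix would be needed, and how to obtain one is outside the scope of this sketch.

```r
set.seed(1234)
n1 <- n2 <- 50
p <- 50
true.cov <- 0.4^(abs(outer(1:p, 1:p, "-")))  # AR1 covariance
root <- chol(true.cov)  # upper triangular, t(root) %*% root equals true.cov

# rows are samples with covariance true.cov
sam1 <- matrix(rnorm(n1 * p), nrow = n1) %*% root
sam2 <- matrix(rnorm(n2 * p), nrow = n2) %*% root

# assumption: the precision matrix is known (inverse of the true covariance)
omega <- solve(true.cov)

# omega is symmetric, so transforming each row x to omega %*% x
# is the same as right-multiplying the sample matrix by omega
sam1.tr <- sam1 %*% omega
sam2.tr <- sam2 %*% omega

# the transformed samples would then replace sam1 and sam2:
# apval_Chen2014(sam1.tr, sam2.tr)
```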

Chen SX, Li J, and Zhong PS (2014). "Two-Sample Tests for High Dimensional Means with Thresholding and Data Transformation." arXiv preprint arXiv:1410.2848.

```
library(MASS)      # provides mvrnorm()
library(highmean)  # provides apval_Chen2014()
set.seed(1234)
n1 <- n2 <- 50
p <- 200
mu1 <- rep(0, p)
mu2 <- mu1
mu2[1:10] <- 0.2   # the means differ in the first 10 coordinates
true.cov <- 0.4^(abs(outer(1:p, 1:p, "-"))) # AR1 covariance
sam1 <- mvrnorm(n = n1, mu = mu1, Sigma = true.cov)
sam2 <- mvrnorm(n = n2, mu = mu2, Sigma = true.cov)
# the two sample populations share the same covariance (default eq.cov = TRUE)
apval_Chen2014(sam1, sam2)
# the two sample populations have different covariances
true.cov1 <- 0.2^(abs(outer(1:p, 1:p, "-")))
true.cov2 <- 0.6^(abs(outer(1:p, 1:p, "-")))
sam1 <- mvrnorm(n = n1, mu = mu1, Sigma = true.cov1)
sam2 <- mvrnorm(n = n2, mu = mu2, Sigma = true.cov2)
apval_Chen2014(sam1, sam2, eq.cov = FALSE)
```
