highD2pop-package: Two-sample tests for equality of means in high dimension


This package provides functions for the tests from Gregory et al. (2015), Chen and Qin (2010), Srivastava and Kubokawa (2013), and Cai, Liu, and Xia (2014) for the equal means hypothesis in the high-dimensional, two-population setting. These are used to test

H_0: \boldsymbol{\mu}_1 = \boldsymbol{\mu}_2

H_1: \boldsymbol{\mu}_1 \neq \boldsymbol{\mu}_2

when the number of components in the mean vectors exceeds the sample size, that is, in the large-p, small-n setting.

Package:	highD2pop
Type:	Package
Version:	1.0
Date:	2012-11-02
License:	GPL (>=2)

The functions GCT.test, ChenQin.test, SK.test, CLX.test.equalcov, and CLX.test.unequalcov accept n by p and m by p data matrices holding sample data from the first and second populations, and return test statistics and p-values for the null hypothesis of equal means. The build2popData function simulates high-dimensional two-population data with specified sample sizes, number of components, covariance structure, etc., and the functions GCT.sim, ChenQin.sim, SK.sim, and CLX.sim.Covtest return test statistic values and p-values for lists of simulated data sets generated by build2popData. The CLX.Covtest function tests for equality of the covariance matrices of the two populations using the test proposed in Cai, Liu, and Xia (2013). GCT.test.missing is a version of the generalized component test that accommodates missing values and returns overall and componentwise missingness summaries. The functions rdblepareto and rgammashift generate realizations from heavy-tailed and skewed distributions, under which the relative performance of the four tests is of interest.
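As a rough illustration of the input layout described above, the following sketch builds two toy matrices in base R, with observations in rows and components in columns. The data, the dimensions, and the mean shift are made up for illustration and are not part of the package. The test calls run only if highD2pop is installed, and they use the same two-argument form as the package examples; their output format is not shown here.

```r
# Hypothetical toy data (not from the package): n = 10 and m = 12 observations
# on p = 50 components, so p exceeds both sample sizes.
set.seed(1)
n <- 10
m <- 12
p <- 50
X <- matrix(rnorm(n * p), nrow = n, ncol = p)              # first population, n by p
Y <- matrix(rnorm(m * p, mean = 0.5), nrow = m, ncol = p)  # second population, m by p

# Both matrices must have the same number of columns (components).
stopifnot(ncol(X) == ncol(Y))
dim(X)
dim(Y)

# Call the tests only when the package is available.
if (requireNamespace("highD2pop", quietly = TRUE)) {
  print(highD2pop::ChenQin.test(X, Y))
  print(highD2pop::SK.test(X, Y))
}
```

The sample sizes n and m need not be equal; only the number of components p is shared.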

Karl Gregory

Maintainer: Karl Gregory <kgregory@mail.uni-mannheim.de>

Cai, T., Liu, W. and Luo, X. (2011). A constrained l-1 minimization approach to sparse precision matrix estimation. Journal of the American Statistical Association 106, 594–607.

Cai, T., Liu, W. and Xia, Y. (2013). Two-sample covariance matrix testing and support recovery. Journal of the American Statistical Association 108, 265–277.

Cai, T. T., Liu, W. and Xia, Y. (2014). Two-sample test of high dimensional means under dependence. J. R. Statist. Soc. B.

Chen, S. X. and Qin, Y.-L. (2010). A two sample test for high dimensional data with applications to gene-set testing. The Annals of Statistics 38(2), 808–835.

Gregory, K., Carroll, R. J., Baladandayuthapani, V. and Lahiri, S. (2015). A two-sample test for equality of means in high dimension. Journal of the American Statistical Association, to appear.

Hall, P., Jing, B. Y. and Lahiri, S. N. (1998). On the sampling window method for long-range dependent data. Statistica Sinica 8, 1189–1204.

Srivastava, M. S. and Kubokawa, T. (2013). Tests for multivariate analysis of variance in high dimension under non-normality. Journal of Multivariate Analysis 115, 204–216.

	
	
data(chr1qseg)

## replace missing entries of a vector with the mean of its observed entries:

impute <- function(x) {
  x[is.na(x)] <- mean(x, na.rm = TRUE)
  x
}

X <- apply(chr1qseg$X,2,impute)
Y <- apply(chr1qseg$Y,2,impute)

## on imputed data with no missing values:

ChenQin.test(X,Y)
GCT.test(X,Y,r=20,smoother="parzen")
SK.test(X,Y)

## on raw data with missing values:

GCT.test.missing(chr1qseg$X,chr1qseg$Y,r=20,smoother="parzen")
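To make the imputation step in the examples above concrete, here is a small base R sketch on a made-up 4 by 3 matrix (hypothetical data, not chr1qseg). It also computes simple overall and per-component missingness proportions. These are plain illustrations of the idea and are not claimed to match how GCT.test.missing defines or reports its own summaries.

```r
# Hypothetical 4 by 3 matrix with missing entries; matrix() fills by column.
Z <- matrix(c(1, NA, 3, 4,
              2, 2, NA, NA,
              5, 6, 7, 8), nrow = 4, ncol = 3)

# Column-mean imputation, as in the examples above.
impute <- function(x) {
  x[is.na(x)] <- mean(x, na.rm = TRUE)
  x
}

overall_missing <- mean(is.na(Z))     # proportion of all entries that are missing
column_missing <- colMeans(is.na(Z))  # proportion missing within each component

# Apply the imputation to each column (component) separately.
Z_imputed <- apply(Z, 2, impute)
```

Mean imputation is simple but shrinks the sample variance of each imputed component, which is one reason a test that handles missing values directly can be preferable.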