wMWUTest: An extended Mann-Whitney U test that incorporates...

Description

This is an extension of the two-sample Mann-Whitney U test (a.k.a. rank sum test) that incorporates pre-calculated weights for the correlated observations in the test group. Note that the pre-calculated correlation applies only to the test group (the gene set of interest); the correlation among the background genes is assumed to be zero. The pre-calculated weights are typically computed by the function FUNNEL.GSEA().

Usage

wMWUTest(test.index, statistics, weight=NULL, correlation=0, df=Inf)

Arguments

test.index

A vector of indices (or names) of genes that belong to the test group (the gene set to be tested).

statistics

A (named) numeric vector containing all the elements (summary statistics or observed values) from *both* groups, such that statistics[test.index] is the vector of statistics of the test set.

weight

A numeric vector, of the same length as test.index, giving weights for the elements in the test group. If NULL, all elements are weighted equally with unit weight.

correlation

An estimate of the correlation in the test group. Genes in the second group are assumed to be independent of each other and of the genes in the test group.

df

Degrees of freedom on which the correlation estimate is based. For FUNNEL.GSEA, df is defined as the number of time points minus 1.
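
The argument conventions above can be illustrated with base R alone. The sketch below uses made-up gene names and values (all hypothetical) to show that test.index may be given either as names or as integer positions, and that weight is expected to align element by element with test.index. It does not call wMWUTest itself.

```r
## Hypothetical toy data: six "genes" with made-up statistics
statistics <- c(g1 = 2.1, g2 = -0.3, g3 = 1.7, g4 = 0.2, g5 = -1.1, g6 = 0.4)

## The test group can be specified by name or by integer position
idx.by.name <- c("g1", "g3")
idx.by.pos  <- match(idx.by.name, names(statistics))

## Both forms select the same statistics for the test set
test.stats <- statistics[idx.by.name]
stopifnot(identical(unname(test.stats), unname(statistics[idx.by.pos])))

## One weight per element of test.index, in the same order
weight <- c(1, 0.5)
stopifnot(length(weight) == length(idx.by.name))
```

Checking the lengths up front, as in the last line, is a cheap way to catch a weight vector that was computed for a different version of the gene set.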

Details

Technical details of this test are documented in Yun Zhang, Juilee Thakar, Xing Qiu (2016) FUNNEL-GSEA: FUNctioNal ELastic-net Regression in time-course Gene Set Enrichment Analysis, submitted to Bioinformatics.
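
As a rough illustration of the correlation adjustment, the sketch below implements the unweighted rank sum test with the variance inflation described by Wu and Smyth (2012), which is the case that rankSumTestWithCorrelation in limma covers. The function name rankSumSketch is hypothetical, ties are not adjusted for, and the weights used by wMWUTest are omitted entirely, so this should not be read as the package's implementation.

```r
## Hypothetical helper: unweighted rank sum test with inter-gene correlation
## in the test group (variance formula from Wu and Smyth, 2012).
## Assumes no ties in `statistics`; weights are NOT handled here.
rankSumSketch <- function(test.index, statistics, correlation = 0, df = Inf) {
  n  <- length(statistics)
  r1 <- rank(statistics)[test.index]
  n1 <- length(r1)
  n2 <- n - n1

  ## Mann-Whitney U and its null mean
  U  <- n1 * n2 + n1 * (n1 + 1) / 2 - sum(r1)
  mu <- n1 * n2 / 2

  ## Null variance; reduces to n1*n2*(n+1)/12 when correlation is zero
  sigma2 <- (asin(1) * n1 * n2 +
             asin(0.5) * n1 * n2 * (n2 - 1) +
             asin(correlation / 2) * n1 * (n1 - 1) * n2 * (n2 - 1) +
             asin((correlation + 1) / 2) * n1 * (n1 - 1) * n2) / (2 * pi)

  ## Continuity-corrected z-scores, referred to a t distribution with df
  z.lower <- (U + 0.5 - mu) / sqrt(sigma2)
  z.upper <- (U - 0.5 - mu) / sqrt(sigma2)

  ## Small U means large ranks in the test group, hence the tail swap
  c(less    = pt(z.upper, df = df, lower.tail = FALSE),
    greater = pt(z.lower, df = df))
}

set.seed(1)
stat <- rnorm(100)
stat[1:8] <- stat[1:8] + 1
p0 <- rankSumSketch(1:20, stat)                     # independence
p1 <- rankSumSketch(1:20, stat, correlation = 0.1)  # mild positive correlation
```

With correlation = 0 the sketch reduces to the continuity-corrected normal approximation used by wilcox.test. A positive correlation inflates the null variance, which moves both one-sided p-values toward 0.5; a finite df makes the reference distribution heavier-tailed than the normal.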

Value

A named numeric vector of two p-values, less and greater, for the one-sided left-tail and right-tail tests, respectively.

Author(s)

Yun Zhang, Juilee Thakar, Xing Qiu

References

Barry, W.T., Nobel, A.B., and Wright, F.A. (2008). A statistical framework for testing functional categories in microarray data. Annals of Applied Statistics, 286-315.

Wu, D, and Smyth, GK (2012). Camera: a competitive gene set test accounting for inter-gene correlation. Nucleic Acids Research, 40(17), e133-e133.

Yun Zhang, Juilee Thakar, Xing Qiu (2016) FUNNEL-GSEA: FUNctioNal ELastic-net Regression in time-course Gene Set Enrichment Analysis. Submitted to Bioinformatics.

See Also

wilcox.test performs the standard Wilcoxon rank sum test.

rankSumTestWithCorrelation from the limma package performs the correlation extension of the rank sum test.

Examples

set.seed(1)
stat <- rnorm(100)

## We define the first 20 "genes" to be a gene set
test.index <- 1:20

## Add some true signal (>) to the first 8 test genes
stat[1:8] <- stat[1:8]+1 

pL <- wilcox.test(stat[test.index], stat[-test.index], alternative = "less")$p.value
pU <- wilcox.test(stat[test.index], stat[-test.index], alternative = "greater")$p.value

wMWUTest(test.index, stat)
## compare it with the following. pU is what we are looking for
c("less"=pL, "greater"=pU)

## With just 0.1 correlation, p-values are not significant anymore
wMWUTest(test.index, stat, correlation=0.1)

## Our results are equivalent to the implementation provided by limma
library(limma)
rankSumTestWithCorrelation(test.index, stat, correlation=0.1)

## First set of weights: attenuates the signal. With weights < 1 for all
## signal-carrying genes, the test is less significant
ww1 <- runif(length(test.index), 0, 1)
wMWUTest(test.index, stat, weight=ww1, correlation=0.1)

## Second set of weights: all the signal-carrying genes have weight == 1;
## the remaining 12 genes have smaller weights. Now the p-value is
## significant again!
ww2 <- c(rep(1, 8), runif(12, 0, 1))
wMWUTest(test.index, stat, weight=ww2, correlation=0.1)

## In the context of FUNNEL.GSEA
## Load the sample data
data("H3N2-Subj1")

## This takes about 10 minutes to run on a laptop; YMMV.
## Not run: t1 <- system.time(results1 <- FUNNEL.GSEA(X, tt, genesets=genesets))

    genesets2 <- lapply(genesets, function(z) { intersect(z, rownames(X)) })
    gg1 <- genesets2[["GLYCOLYSIS_GLUCONEOGENESIS"]]
    ww1 <- results1$weight.list[["GLYCOLYSIS_GLUCONEOGENESIS"]]
    rho <- results1$correlation

    ## The test
    test1 <- wMWUTest(gg1, results1$Fstats, ww1, rho, df=15)
    ## p-value for the gene set test
    test1["greater"]

    ## Should be the same as the p-value below
    results1$pvals["GLYCOLYSIS_GLUCONEOGENESIS"]


## End(Not run)

Thakar-Lab/FUNNEL-GSEA-R-Package documentation built on May 8, 2019, 9:57 p.m.