wilcoxTestPerRow: Run Wilcoxon rank-sum test (Mann-Whitney) on each row of a...
In jvanheld/stats4bioinfo: Utilities for the book "Statistics for bioinformatics"


Apply Wilcoxon rank-sum test (also called Mann-Whitney test) to each row of a data frame containing multivariate data for two samples.
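The idea of a per-row test can be illustrated with base R alone. The sketch below is not the package implementation: it simply calls `wilcox.test()` on each row of a small simulated matrix, using the first class label as test group. The object names (`x`, `cl`, `is.test`, `result`) and the simulated data are hypothetical.

```r
## Minimal sketch (not the package implementation): per-row Wilcoxon
## rank-sum test using base R only. Data and object names are hypothetical.
set.seed(1)
x <- matrix(rnorm(60), nrow = 6,
            dimnames = list(paste0("feature", 1:6), paste0("s", 1:10)))
cl <- rep(c("A", "B"), each = 5)          ## One class label per column

## Shift the first feature in group A so that the two groups are fully separated
x[1, cl == "A"] <- x[1, cl == "A"] + 10

## The first label of the class vector defines the test group
is.test <- cl == cl[1]

## Run one two-sided rank-sum test per row and collect the p-values
p.value <- apply(x, 1, function(row) {
  wilcox.test(row[is.test], row[!is.test], alternative = "two.sided")$p.value
})
result <- data.frame(p.value = p.value)
print(result)
```

With five samples per group and complete separation, the exact two-sided p-value of the first row is 2/choose(10, 5), the smallest value achievable with these sample sizes.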

wilcoxTestPerRow(x, cl, P.threshold = NA, E.threshold = NA,
  FDR.threshold = NA, robust.est = F, verbosity = 1,
  volcanoPlot = FALSE, alternative = "two.sided", group.col.names = FALSE,
  test.group = cl[1], alpha = 0.05, native.test = TRUE, ...)

`x`	A matrix or data frame
`cl`	A vector describing class assignment (length should equal the number of columns of the data table)
`...`	Additional parameters are passed to the function tTestPerRow.plotVolcano()
`P.threshold=NA`	p-value threshold. If specified, the result table only contains rows passing this threshold.
`E.threshold=NA`	e-value threshold. If specified, the result table only contains rows passing this threshold.
`FDR.threshold=NA`	Threshold on the False Discovery Rate (FDR). If specified, the result table only contains rows passing this threshold.
`robust.est=F`	Use robust estimators for central tendency and dispersion
`verbosity=1`	Level of verbosity
`volcanoPlot=FALSE`	Draw a volcano plot.
`alternative="two.sided"`	Alternative hypothesis for the wilcox.test. Supported: "two.sided" (default), "less", "greater".
`test.group=cl[1]`	Specify which group should be considered as first term for the difference (d=m_{test}-m_{others}). By default the first label of the class vector (cl) is considered as test group.
`group.col.names=FALSE`	Include group labels in the column names of the output table
`alpha=0.05`	threshold to declare a feature positive. The threshold can be applied on any of the following statistics:

First version: 2003-09. Last modification: 2015-02.

A data.frame with one row per test result, and one column per statistic.
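The relation between the p-value, e-value and FDR statistics on which thresholds can be set may be sketched as follows. This is an illustration under stated assumptions, not code taken from the package: the e-value is assumed to be the p-value multiplied by the number of tests, the FDR is assumed to be the Benjamini-Hochberg adjustment returned by `p.adjust()`, and the p-values and the `multitest` table are hypothetical.

```r
## Sketch under stated assumptions; not taken from the package source.
p.value <- c(0.0001, 0.004, 0.04, 0.2, 0.9)   ## Hypothetical per-row p-values
n.tests <- length(p.value)

multitest <- data.frame(
  p.value = p.value,
  e.value = p.value * n.tests,                 ## Assumed: expected number of false positives
  fdr     = p.adjust(p.value, method = "fdr")  ## Assumed: Benjamini-Hochberg adjustment
)

## Declare a feature positive when its FDR is below alpha
alpha <- 0.05
multitest$positive <- multitest$fdr < alpha
print(multitest)
```

In this toy table the third row passes alpha on the nominal p-value (0.04) but on neither the e-value (0.2) nor the FDR, which shows why the choice of the thresholded statistic matters.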

Jacques van Helden (Jacques.van-Helden@univ-amu.fr)

## Load example data set from Den Boer, 2009
library(denboer2009)
data(denboer2009.expr)     ## Load expression table
data(denboer2009.pheno)    ## Load phenotypic data
data(denboer2009.group.labels)    ## Load group labels

## Print cancer types and associated group labels
print(data.frame(denboer2009.group.labels))

## Compute the number of samples per subtype of cancer (ALL)
sort(table(denboer2009.pheno$sample.labels), decreasing=TRUE)

## Create a vector with group labels per sample.
## For the tests below we compare one group of interest (e.g. Bh)
## to all the other ones (labeled as "other").
goi <- "Bh" ## Group of interest
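## Convert to character so that a new label can be assigned even if the labels are stored as a factor
sample.groups <- as.character(denboer2009.pheno$sample.labels)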
sample.groups[sample.groups != goi] <- "other"

## Check number of samples per group
sort(table(sample.groups))

## Run Wilcoxon rank-sum test on each row of the DenBoer dataset
system.time(wilcox.result <- wilcoxTestPerRow(x=denboer2009.expr, cl=sample.groups, test.group="Bh"))

## Draw a volcano plot with the Wilcoxon result table
VolcanoPlot(multitest.table=wilcox.result$table, control.type="e.value", alpha=0.05, 
     effect.size.col="U.diff", xlab="U1 - U2",
     main=paste(sep="", "Wilcoxon rank-sum test: Den Boer (2009), ", goi, " vs others"),
     legend.corner = "topleft")

## Run Welch test on each row of the DenBoer dataset, for comparison
welch.result <- tTestPerRow(x=denboer2009.expr, cl=sample.groups, test.group="Bh", var.equal=FALSE)

## Compare e-values from Wilcoxon and Welch tests
plotPvalCompa(data.frame(
   "Wilcoxon"=wilcox.result$table$e.value,
   "Welch"=welch.result$table$e.value), score="e-value", alpha=0.05)
   
## Compare FDR from Wilcoxon and Welch tests
plotPvalCompa(data.frame(
   "Wilcoxon"=wilcox.result$table$fdr,
   "Welch"=welch.result$table$fdr), score="FDR", alpha=0.05,
   main="Wilcoxon versus Welch (FDR)")
   
## Confusion table between Wilcoxon and Welch tests
table(wilcox.result$table$fdr < 0.05, welch.result$table$fdr < 0.05) ## Lenient threshold on FDR
table(wilcox.result$table$e.value < 0.05, welch.result$table$e.value < 0.05) ## Stringent threshold on E-value
table(wilcox.result$table$e.value < 1, welch.result$table$e.value < 1) ## Intermediate threshold on E-value