SAM Analysis Using Wilcoxon Rank Statistics
Description
Generates the required statistics for a Significance Analysis of Microarrays
analysis using standardized Wilcoxon rank statistics.
Should not be called directly, but via sam(..., method = wilc.stat).
Usage
wilc.stat(data, cl, gene.names = NULL, R.fold = 1, use.dm = FALSE,
    R.unlog = TRUE, na.replace = TRUE, na.method = "mean",
    approx50 = TRUE, ties.method = c("min", "random", "max"),
    use.row = FALSE, rand = NA)

Arguments
data 
a matrix or a data frame. Each row of data must correspond to a variable (e.g., a gene),
and each column to a sample (i.e. an observation).

cl 
a numeric vector of length ncol(data) containing the class
labels of the samples. In the two-class paired case, cl can also
be a matrix with ncol(data) rows and 2 columns. For details
on how cl should be specified, see ?sam.

gene.names 
a character vector of length nrow(data) containing the
names of the genes.

R.fold 
a numeric value. A gene is excluded from the SAM analysis if its fold change
lies strictly between 1/R.fold and R.fold, i.e. it is retained only if its fold
change is larger than or equal to R.fold or smaller than or equal to 1/R.fold.
The expression score d of excluded genes is set to NA. By default, R.fold
is set to 1 such that all genes are included in the SAM analysis. Setting
R.fold to 0 or a negative value skips the computation of the fold
change. The fold change is only computed in the two-class unpaired case.

use.dm 
if TRUE, the fold change is computed as 2 to the power of the difference between
the mean log2 intensities of the two groups, i.e. 2 to the power of the numerator of the test statistic.
If FALSE, the fold change is determined
by computing 2 to the power of data (if R.unlog = TRUE) and then calculating the ratio of the
mean intensity in the group coded by 1 to the mean intensity in the group coded
by 0. The latter is the default, as this definition of the fold change is used in
Tusher et al. (2001).

R.unlog 
if TRUE, the antilog of data will be used in the computation of the
fold change. Otherwise, data is used as is. This transformation should be done
if data is log2-transformed. (In a SAM analysis, it is highly recommended
to use log2-transformed expression data.) Ignored if use.dm = TRUE.
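
The two fold change definitions described for use.dm and R.unlog can give
different values for the same gene. The following base R sketch illustrates this
on a hypothetical toy matrix; the object names (x, cl, fold_dm, fold_ratio) are
made up for this example, and the code is not the package's internal
implementation.

```r
# Hypothetical toy data on the log2 scale: 3 genes (rows) x 6 samples (columns).
x <- rbind(g1 = c(1, 1, 1, 3, 3, 3),
           g2 = c(2, 2, 2, 2, 2, 2),
           g3 = c(0, 2, 4, 1, 1, 7))
cl <- c(0, 0, 0, 1, 1, 1)  # class labels: group 0 and group 1

# As described for use.dm = TRUE:
# 2 to the power of the difference between the mean log2 intensities.
fold_dm <- 2^(rowMeans(x[, cl == 1]) - rowMeans(x[, cl == 0]))

# As described for use.dm = FALSE with R.unlog = TRUE:
# take the antilog first, then the ratio of the group means (group 1 / group 0).
raw <- 2^x
fold_ratio <- rowMeans(raw[, cl == 1]) / rowMeans(raw[, cl == 0])

print(cbind(fold_dm, fold_ratio))
```

For g1 and g2 both definitions agree, whereas for g3 the single large value in
group 1 dominates the mean on the raw scale, so the ratio of means is larger
than the value based on the difference of the log2 means.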

na.replace 
if TRUE, missing values will be replaced by the gene-wise (row-wise)
statistic specified by na.method. If a gene has fewer than 2 non-missing
values, this gene will be excluded from further analysis. If na.replace = FALSE,
all genes with one or more missing values will be excluded from further analysis.
The expression score d of excluded genes is set to NA.

na.method 
a character string naming the statistic with which missing values
will be replaced if na.replace = TRUE. Must be either "mean" (default)
or "median".
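
The row-wise replacement described for na.replace and na.method can be sketched
in base R as follows. The helper replace_na_rowwise and the toy matrix are
hypothetical and only illustrate the idea; in particular, this sketch drops
genes with fewer than 2 non-missing values, whereas the description above
states that such genes are kept with their expression score set to NA.

```r
# Hypothetical toy data: 3 genes (rows) x 4 samples (columns) with missing values.
x <- rbind(g1 = c(1, NA, 3, 8),
           g2 = c(NA, NA, NA, 5),
           g3 = c(2, 4, 6, 8))

# Hypothetical helper: replace each NA by a row-wise statistic (mean or median).
replace_na_rowwise <- function(mat, stat = mean) {
  # Keep only rows with at least 2 non-missing values.
  keep <- rowSums(!is.na(mat)) >= 2
  mat <- mat[keep, , drop = FALSE]
  # Fill the missing entries of each row with the statistic of its observed values.
  for (i in seq_len(nrow(mat))) {
    miss <- is.na(mat[i, ])
    mat[i, miss] <- stat(mat[i, !miss])
  }
  mat
}

filled_mean <- replace_na_rowwise(x, mean)      # analogous to na.method = "mean"
filled_median <- replace_na_rowwise(x, median)  # analogous to na.method = "median"
```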

approx50 
if TRUE, the null distribution will be approximated by
the standard normal distribution. Otherwise, the exact null distribution is
computed. This argument will automatically be set to FALSE if there
are fewer than 50 samples in each of the groups.

ties.method 
either "min" (default), "random", or "max". If
"random", the ranks of ties are randomly assigned. If "min" or "max",
the ranks of ties are set to the minimum or maximum rank, respectively. For details,
see the help of rank. If use.row = TRUE, ties.method = "max"
will be used. For the handling of zeros, see Details.

use.row 
if TRUE, rowWilcoxon is used to compute the Wilcoxon
rank statistics.

rand 
a numeric value. If specified, i.e. not NA, the random number generator
will be put into a reproducible state.

Details
Standardized versions of the Wilcoxon rank statistics are computed. This means that
W* = (W - mean(W)) / sd(W) is used as expression
score d, where W is the usual Wilcoxon rank sum statistic or Wilcoxon
signed rank statistic, respectively.
In the computation of these statistics, the ranks of ties are by default set to the
minimum rank. In the computation of the Wilcoxon signed rank statistic, zeros are randomly
set either to a very small positive or negative value.
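
As an illustration of the standardization and of the effect of the tie handling,
the following base R sketch computes W* for the rank sum case. It assumes that W
is the rank sum of the group coded by 1 and that mean(W) and sd(W) are the usual
null moments without tie correction; both are assumptions made for this example,
and standardized_ranksum is a hypothetical helper, not a function of the package.

```r
# Hypothetical helper: standardized Wilcoxon rank sum statistic for one gene.
standardized_ranksum <- function(x, cl, ties.method = "min") {
  r <- rank(x, ties.method = ties.method)  # ranks over all samples
  n1 <- sum(cl == 1)
  n0 <- sum(cl == 0)
  n <- n1 + n0
  W <- sum(r[cl == 1])                     # rank sum of the group coded by 1 (assumption)
  W_mean <- n1 * (n + 1) / 2               # null mean of W
  W_sd <- sqrt(n1 * n0 * (n + 1) / 12)     # null sd of W, no tie correction (assumption)
  (W - W_mean) / W_sd
}

cl <- c(0, 0, 0, 1, 1, 1)

# No ties: group 1 holds the three largest values.
d_sep <- standardized_ranksum(c(1, 2, 3, 4, 5, 6), cl)

# Three tied values: "min" and "max" assign different ranks to the ties.
x_tie <- c(1, 2, 2, 2, 5, 6)
d_tie_min <- standardized_ranksum(x_tie, cl, ties.method = "min")
d_tie_max <- standardized_ranksum(x_tie, cl, ties.method = "max")
```

With ties.method = "min" the tied observation in group 1 receives rank 2, with
"max" it receives rank 4, so the two settings yield different scores for the
same data.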
If there are fewer than 50 observations in each of the groups, the exact null distribution
will be used. If there are more than 50 observations in at least one group, the null
distribution will by default be approximated by the standard normal distribution. It is,
however, still possible to compute the exact null distribution by setting approx50
to FALSE.
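
To see why the exact null distribution matters for small groups, the following
base R sketch compares an exact tail probability of the rank sum statistic with
its standard normal approximation. It uses dwilcox from the stats package, which
covers the case without ties, and hypothetical group sizes of 5; it is only meant
as an illustration and does not reproduce how the package computes the exact null
distribution.

```r
n1 <- 5  # hypothetical size of the group coded by 1
n0 <- 5  # hypothetical size of the group coded by 0

u <- 0:(n1 * n0)                  # possible values of the Mann-Whitney statistic U
w <- u + n1 * (n1 + 1) / 2        # corresponding rank sums W
exact_prob <- dwilcox(u, n1, n0)  # exact null probabilities (no ties)

# Standardized values W* = (W - mean(W)) / sd(W) under the null hypothesis.
w_star <- (w - n1 * (n1 + n0 + 1) / 2) / sqrt(n1 * n0 * (n1 + n0 + 1) / 12)

# Probability of observing the largest attainable statistic.
p_exact <- exact_prob[length(exact_prob)]            # equals 1 / choose(10, 5)
p_normal <- pnorm(max(w_star), lower.tail = FALSE)   # normal approximation

print(c(exact = p_exact, normal = p_normal))
```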
Value
A list containing statistics required by sam.
Author(s)
Holger Schwender, holger.schw@gmx.de
References
Schwender, H., Krause, A. and Ickstadt, K. (2003). Comparison of
the Empirical Bayes and the Significance Analysis of Microarrays.
Technical Report, SFB 475, University of Dortmund, Germany.
Tusher, V.G., Tibshirani, R., and Chu, G. (2001). Significance analysis of microarrays
applied to the ionizing radiation response. PNAS, 98, 5116-5121.
See Also
SAM-class, sam, wilc.ebam