cc02-0-MultiWilcoxonTest-class: Class "MultiWilcoxonTest"

MultiWilcoxonTest-class    R Documentation

Class "MultiWilcoxonTest"

Description

The MultiWilcoxonTest class is used to perform row-by-row Wilcoxon rank-sum tests on a data matrix. Significance cutoffs are determined by the empirical Bayes method of Efron and Tibshirani.

Usage

MultiWilcoxonTest(data, classes, histsize=NULL)
## S4 method for signature 'MultiWilcoxonTest'
summary(object, prior=1, significance=0.9, ...)
## S4 method for signature 'MultiWilcoxonTest'
hist(x, xlab='Rank Sum',
 ylab='Prob(Different | Y)', main='', ...)
## S4 method for signature 'MultiWilcoxonTest,missing'
plot(x, prior=1, significance=0.9,
 ylim=c(-0.5, 1), xlab='Rank Sum', ylab='Prob(Different | Y)', ...)
## S4 method for signature 'MultiWilcoxonTest'
cutoffSignificant(object, prior, significance, ...)
## S4 method for signature 'MultiWilcoxonTest'
selectSignificant(object, prior, significance, ...)
## S4 method for signature 'MultiWilcoxonTest'
countSignificant(object, prior, significance, ...)
## S4 method for signature 'MultiWilcoxonTest'
probDiff(object, p0, ...)

Arguments

data

Either a data frame or matrix with numeric values, or an ExpressionSet as defined in the Bioconductor tools for analyzing microarray data.

classes

If data is a data frame or matrix, then classes must be either a logical vector or a factor. If data is an ExpressionSet, then classes can be a character string that names one of the factor columns in the associated phenoData subobject.

histsize

An integer; the number of bins used for the histogram summarizing the Wilcoxon statistics. When NULL, each discrete rank-sum value gets its own bin.

object

an object of the MultiWilcoxonTest class.

x

an object of the MultiWilcoxonTest class.

xlab

character string specifying label for the x axis

ylab

character string specifying label for the y axis

ylim

Plotting limits on the y-axis

main

character string specifying graph title

p0

see prior.

prior

Prior probability that an arbitrary gene is not differentially expressed, or that an arbitrary row does not yield a significant Wilcoxon rank-sum statistic.

significance

Desired level of posterior probability

...

extra arguments for generic or plotting routines

Details

See the paper by Efron and Tibshirani.
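As a rough illustration of the empirical Bayes idea in that paper: if the observed density of the statistic is modeled as a mixture f = p0*f0 + (1 - p0)*f1, where f0 is the null density and p0 is the prior probability of "not different", then the posterior probability of "different" at a value y is 1 - p0*f0(y)/f(y). The sketch below works through this in base R on simulated data. It uses dwilcox for the null density and a raw (unsmoothed) empirical density; both are assumptions made for illustration and need not match the package's internal computation.

```r
# Toy setup: two groups of 5 samples each; 2000 null rows plus 200 shifted rows.
set.seed(42)
m <- 5; n <- 5
null_rows <- matrix(rnorm(2000 * (m + n)), nrow = 2000)
alt_rows <- matrix(rnorm(200 * (m + n)), nrow = 200)
alt_rows[, 1:m] <- alt_rows[, 1:m] + 2
x <- rbind(null_rows, alt_rows)

# Rank sum of the first group in each row.
rank_sum <- apply(x, 1, function(r) sum(rank(r)[1:m]))

# Possible rank-sum values, and the null density f0 obtained from the
# Mann-Whitney U distribution (rank sum minus m*(m+1)/2).
offset <- m * (m + 1) / 2
xvals <- seq(offset, offset + m * n)
f0 <- dwilcox(xvals - offset, m, n)

# Empirical density f: observed proportion at each possible value.
f <- as.numeric(table(factor(rank_sum, levels = xvals))) / length(rank_sum)

# Posterior probability of "different" under prior p0.
# It can be negative wherever f < p0 * f0.
p0 <- 0.9
posterior <- 1 - p0 * f0 / f
round(data.frame(xvals, f0, f, posterior), 3)
```

Because the estimate can dip below zero when the assumed prior is too large for the data, a plotting range that extends below zero on the y-axis is natural for this quantity.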

Value

The standard methods summary, hist, and plot return what you would expect.

The cutoffSignificant method returns a list of two integers. Rank-sum values smaller than the first value or larger than the second value are statistically significant in the sense that their posterior probability exceeds the specified significance level given the assumptions about the prior probability of not being significant.

The selectSignificant method returns a vector of logical values identifying the significant test results, and countSignificant returns an integer counting the number of significant test results.
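The relationship between the three return values can be sketched in base R. The rank sums and cutoffs below are made-up numbers, and the element names low and high are hypothetical; they are not output from ClassComparison.

```r
# Hypothetical values for illustration only.
rank_sums <- c(15, 22, 27, 31, 40, 18, 39)
cutoffs <- list(low = 17, high = 38)

# A statistic is flagged when it is strictly below the lower cutoff
# or strictly above the upper cutoff.
selected <- rank_sums < cutoffs$low | rank_sums > cutoffs$high

selected       # logical vector, analogous to selectSignificant
sum(selected)  # count of flagged rows, analogous to countSignificant
```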

Creating Objects

As usual, objects can be created by new, but better methods are available in the form of the MultiWilcoxonTest function. The inputs to this function are the same as those used for row-by-row statistical tests throughout the ClassComparison package; a detailed description can be found in the MultiTtest class.

The constructor computes row-by-row Wilcoxon rank-sum statistics on the input data, comparing the two groups defined by the classes argument. It also estimates the observed and theoretical (expected) density functions for the collection of rank-sum statistics.

The additional input argument, histsize, is usually best left at its default value. In certain pathological cases, we have found it necessary to use fewer bins; one suspects that the underlying model does not adequately capture the complexity of those situations.
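To make the row-by-row computation concrete, here is a minimal base R sketch of a rank-sum statistic computed for every row of a matrix. The helper row_rank_sum is hypothetical and is not part of ClassComparison; in particular, summing the ranks of the first factor level is an assumption, and the package may use a different convention.

```r
# Hypothetical helper: rank-sum statistic for each row of a numeric matrix.
row_rank_sum <- function(data, classes) {
  classes <- as.factor(classes)
  stopifnot(nlevels(classes) == 2, length(classes) == ncol(data))
  in_first <- classes == levels(classes)[1]
  # Rank the values within each row; ties receive average ranks.
  ranks <- t(apply(data, 1, rank))
  # Sum the ranks that belong to the first group.
  rowSums(ranks[, in_first, drop = FALSE])
}

set.seed(1)
toy <- matrix(rnorm(5 * 6), nrow = 5)
toy_class <- factor(rep(c("A", "B"), each = 3))
rs <- row_rank_sum(toy, toy_class)
rs
```

With three samples per group, each value lies between 6 and 15, and subtracting 6 gives the W statistic reported by wilcox.test for the same row.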

Slots

statistics:

numeric vector containing the computed rank-sum statistics.

xvals:

numeric vector, best thought of as the vector of possible rank-sum statistics given the sizes of the two groups.

theoretical.pdf:

numeric vector containing the theoretical density function evaluated at the points of xvals.

pdf:

numeric vector containing the empirical density function computed at the points of xvals.

unravel:

numeric vector containing a smoothed estimate (by Poisson regression using B-splines) of the empirical density function evaluated at xvals.

groups:

A vector containing the names of the groups defined by classes.

call:

object of class call representing the function call that created the object.
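The smoothing described for the unravel slot can be approximated in a few lines of R. This is a sketch under assumptions: the simulated null data, the variable names, and the choice of df = 5 are all illustrative, and the package's own fit may differ in its details.

```r
library(splines)

set.seed(7)
m <- 8; n <- 8
offset <- m * (m + 1) / 2
xvals <- seq(offset, offset + m * n)

# Simulated rank sums under the null, tabulated at every possible value.
u <- rwilcox(5000, m, n)
counts <- as.numeric(table(factor(u + offset, levels = xvals)))

# Poisson regression of the counts on a B-spline basis
# (df = 5 is an arbitrary choice for this sketch).
fit <- glm(counts ~ bs(xvals, df = 5), family = poisson)

# Fitted counts rescaled to a density over xvals.
smoothed <- fitted(fit) / sum(fitted(fit))
```

Fitting on the count scale with a log link keeps the smoothed density strictly positive, which matters when it is later used as a denominator.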

Methods

summary(object, prior=1, significance=0.9, ...)

Write out a summary of the object. For a given value of the prior probability of not being differentially expressed and a given significance cutoff on the posterior probability, reports the cutoffs and number of items in both tails of the distribution.

hist(x, xlab='Rank Sum', ylab='Prob(Different | Y)', main='', ...)

Plot a histogram of the rank-sum statistics, with overlaid curves representing the expected and observed distributions. Colors of the curves are controlled by oompaColor$EXPECTED and oompaColor$OBSERVED.

plot(x, prior=1, significance=0.9, ylim=c(-0.5, 1), xlab='Rank Sum', ylab='Prob(Different | Y)', ...)

Plots the posterior probability of being differentially expressed for given values of the prior. Horizontal lines are added at each specified significance level for the posterior probability.

cutoffSignificant(object, prior, significance, ...)

Determine cutoffs on the rank-sum statistic at the desired significance level.

selectSignificant(object, prior, significance, ...)

Compute a logical vector for selecting significant test results.

countSignificant(object, prior, significance, ...)

Count the number of significant test results.

probDiff(object, p0, ...)

Compute the probability that an observed value comes from the "unusual" part of the mixture distribution. This method is exported only so that it can be inherited by other classes.

Author(s)

Kevin R. Coombes krc@silicovore.com

References

Efron B, Tibshirani R.
Empirical Bayes methods and false discovery rates for microarrays.
Genet Epidemiol. 2002;23:70-86.

Pounds S, Morris SW.
Estimating the occurrence of false positives and false negatives in microarray studies by approximating and partitioning the empirical distribution of p-values.
Bioinformatics. 2003;19(10):1236-42.

See Also

Implementation is handled in part by the functions dwil and rankSum. The empirical Bayes results for alternative tests (such as MultiTtest) can be obtained using the beta-uniform mixture model in the Bum class.

Examples

library(ClassComparison)
showClass("MultiWilcoxonTest")
ng <- 10000
ns <- 15
nd <- 200
fake.class <- factor(rep(c('A', 'B'), each=ns))
fake.data <- matrix(rnorm(ng*ns*2), nrow=ng, ncol=2*ns)
fake.data[1:nd, 1:ns] <- fake.data[1:nd, 1:ns] + 2
fake.data[(nd+1):(2*nd), 1:ns] <- fake.data[(nd+1):(2*nd), 1:ns] - 2

a <- MultiWilcoxonTest(fake.data, fake.class)
hist(a)
plot(a)
plot(a, prior=0.85)
abline(h=0)

cutoffSignificant(a, prior=0.85, significance=0.95)
countSignificant(a, prior=0.85, significance=0.95)

ClassComparison documentation built on Sept. 11, 2024, 7:01 p.m.