The Bum class is used to fit a beta-uniform mixture model to a set of p-values.
Bum(pvals, ...)
## S4 method for signature 'Bum'
summary(object, tau=0.01, ...)
## S4 method for signature 'Bum'
hist(x, res=100, xlab='P Values', main='', ...)
## S4 method for signature 'Bum'
image(x, ...)
## S4 method for signature 'Bum'
cutoffSignificant(object, alpha, by='FDR', ...)
## S4 method for signature 'Bum'
selectSignificant(object, alpha, by='FDR', ...)
## S4 method for signature 'Bum'
countSignificant(object, alpha, by='FDR', ...)
likelihoodBum(object)

pvals 
numeric vector containing values between 0 and 1
object 
object of class Bum
tau 
numeric scalar between 0 and 1, used as a cutoff on the p-values
x 
object of class Bum
res 
positive integer scalar specifying the resolution at which to plot the fitted distribution curve 
xlab 
character string specifying the label for the x axis 
main 
character string specifying the graph title 
alpha 
Either the false discovery rate (if by = 'FDR') or the posterior probability (if by = 'EmpiricalBayes')
by 
character string denoting the method to use for determining cutoffs. Valid values are: 'FDR', 'FalseDiscovery', and 'EmpiricalBayes'

... 
extra arguments for generic or plotting routines 
The BUM method was introduced by Stan Pounds and Steve Morris, although it was simultaneously discovered by several other researchers. It is generally applicable to any analysis of microarray or proteomics data that performs a separate statistical hypothesis test for each gene or protein, where each test produces a p-value that would be valid if the analyst were only performing one statistical test. When performing thousands of statistical tests, however, those p-values no longer have the same interpretation as Type I error rates. The idea behind BUM is that, under the null hypothesis that none of the genes or proteins is interesting, the expected distribution of the set of p-values is uniform. By contrast, if some of the genes are interesting, then we should see an overabundance of small p-values (a spike in the histogram near zero). We can model the alternative hypothesis with a beta distribution, and view the set of all p-values as a mixture distribution.
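For orientation, the mixture described above can be written out explicitly. This is the density as given in the Pounds and Morris paper, not something stated on this page. Reading the symbols as the ahat, lhat, and pihat slots is an assumption based on the slot names alone.

```latex
\begin{align*}
f(p \mid a, \lambda) &= \lambda + (1 - \lambda)\, a\, p^{a-1},
  \qquad 0 < p \le 1,\quad 0 < a < 1,\quad 0 \le \lambda \le 1, \\
\hat{\pi} &= \hat{\lambda} + (1 - \hat{\lambda})\,\hat{a}
\end{align*}
```

The first term is the uniform component and the second is a Beta(a, 1) component. In the Pounds and Morris formulation, the quantity on the second line is the value of the fitted density at p = 1, which bounds from above the fraction of p-values that can be attributed to the null hypothesis.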
Fitting the BUM model is straightforward, using a nonlinear optimizer to compute the maximum likelihood parameters. After the model has been fit, one can easily determine cutoffs on the p-values that correspond to desired false discovery rates. Alternatively, the original Pounds and Morris paper shows that their results can be reinterpreted to recover the empirical Bayes method introduced by Efron and Tibshirani. Thus, one can also determine cutoffs by specifying a desired posterior probability of significance.
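The fitting step and the false discovery rate cutoff can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's own implementation: the optimizer (optim with L-BFGS-B), the starting values, the bounds, and the helper names negLogLik and fdrCutoff are all choices made for this example. The cutoff comes from solving pihat * tau / F(tau) = alpha for tau, where F is the fitted mixture's cumulative distribution function.

```r
# Sketch only: base R, independent of the package internals.
set.seed(1)
p <- c(runif(700), rbeta(300, 0.3, 1))   # simulated mixture, as in the Examples

# Negative log-likelihood of the beta-uniform mixture
# f(p) = lambda + (1 - lambda) * a * p^(a - 1)
negLogLik <- function(par, p) {
  a <- par[1]
  lambda <- par[2]
  -sum(log(lambda + (1 - lambda) * a * p^(a - 1)))
}

# Maximum likelihood fit; the bounds keep both parameters inside (0, 1)
fit <- optim(c(0.5, 0.5), negLogLik, p = p, method = "L-BFGS-B",
             lower = c(1e-4, 1e-4), upper = c(1 - 1e-4, 1 - 1e-4))
ahat <- fit$par[1]
lhat <- fit$par[2]
pihat <- lhat + (1 - lhat) * ahat   # upper bound on the uniform (null) fraction

# p-value cutoff tau at which the estimated FDR equals alpha
# (hypothetical helper name)
fdrCutoff <- function(alpha) {
  ((pihat - alpha * lhat) / (alpha * (1 - lhat)))^(1 / (ahat - 1))
}
fdrCutoff(0.05)
```

With the package loaded, cutoffSignificant(Bum(p), 0.05, by='FDR') is the documented way to obtain such a cutoff. The sketch only shows where a number of that kind can come from.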
Graphical functions (hist and image) invisibly return the object on which they were invoked.
The cutoffSignificant method returns a real number between zero and one. P-values below this cutoff are considered statistically significant at either the specified false discovery rate or the specified posterior probability.
The selectSignificant method returns a vector of logical values whose length is equal to the length of the vector of p-values that was used to construct the Bum object. TRUE values in the return vector mark the statistically significant p-values.
The countSignificant method returns an integer, the number of statistically significant p-values.
The summary method returns an object of class BumSummary.
Although objects can be created directly using new, the most common usage will be to pass a vector of p-values to the Bum function.
pvals: numeric vector of p-values used to construct the object.
ahat: Model parameter
lhat: Model parameter
pihat: Model parameter
summary(object, tau=0.01, ...): For each value of the p-value cutoff tau, computes estimates of the fraction of true positives (TP), false negatives (FN), false positives (FP), and true negatives (TN).
hist(x, res=100, xlab='P Values', main='', ...): Plots a histogram of the object, and overlays (1) a straight line to indicate the contribution of the uniform component and (2) the fitted beta-uniform distribution from the observed values. Colors in the plot are controlled by oompaColor$EXPECTED and oompaColor$OBSERVED.
image(x, ...): Produces four plots in a 2x2 layout: (1) the histogram produced by hist; (2) a plot of cutoffs against the desired false discovery rate; (3) a plot of cutoffs against the posterior probability of coming from the beta component; and (4) an ROC curve.
cutoffSignificant(object, alpha, by='FDR', ...): Computes the cutoff needed for significance, which in this case means arising from the beta component rather than the uniform component of the mixture. Significance is specified either by the false discovery rate (when by = 'FDR' or by = 'FalseDiscovery') or by the posterior probability (when by = 'EmpiricalBayes').
selectSignificant(object, alpha, by='FDR', ...): Uses cutoffSignificant to determine a logical vector that indicates which of the p-values are significant.
countSignificant(object, alpha, by='FDR', ...): Uses selectSignificant to count the number of significant p-values.
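The way the last three methods build on one another can be mimicked in a few lines of base R. This is an illustrative sketch, not the package code: the p-values and the cutoff of 0.01 are made up, and the strict comparison follows the statement that p-values below the cutoff are considered significant.

```r
# Hypothetical inputs for illustration only
pvals <- c(0.001, 0.2, 0.03, 0.8, 0.004)
cutoff <- 0.01               # stands in for the value cutoffSignificant would return

selected <- pvals < cutoff   # analogue of selectSignificant: one logical per p-value
count <- sum(selected)       # analogue of countSignificant: number of TRUE entries
```

Because each step is derived from the previous one, the three results stay mutually consistent for a given alpha and by.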
Kevin R. Coombes krc@silicovore.com
Pounds S, Morris SW. Estimating the occurrence of false positives and false negatives in microarray studies by approximating and partitioning the empirical distribution of p-values. Bioinformatics. 2003 Jul 1;19(10):1236-42.
Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J Roy Statist Soc B, 1995; 57: 289-300.
Efron B, Tibshirani R. Empirical bayes methods and false discovery rates for microarrays. Genet Epidemiol 2002, 23: 70-86.
Two classes that produce lists of p-values that can (and often should) be analyzed using BUM are MultiTtest and MultiLinearModel. Also see BumSummary.
showClass("Bum")
# simulate 700 null (uniform) p-values and 300 from a beta distribution
fake.data <- c(runif(700), rbeta(300, 0.3, 1))
a <- Bum(fake.data)
hist(a, res=200)
# p-value cutoffs as a function of the desired false discovery rate
alpha <- (1:25)/100
plot(alpha, cutoffSignificant(a, alpha, by='FDR'),
     xlab='Desired False Discovery Rate', type='l',
     main='FDR Control', ylab='Significant P Value')
# p-value cutoffs as a function of the posterior probability
GAMMA <- 5*(10:19)/100
plot(GAMMA, cutoffSignificant(a, GAMMA, by='EmpiricalBayes'),
     ylab='Significant P Value', type='l',
     main='Empirical Bayes', xlab='Posterior Probability')
# estimated TP, FN, FP, TN fractions over a grid of cutoffs
b <- summary(a, (0:100)/100)
be <- b@estimates
sens <- be$TP/(be$TP+be$FN)
spec <- be$TN/(be$TN+be$FP)
# ROC curve built from the estimated sensitivity and specificity
plot(1-spec, sens, type='l', xlim=c(0,1), ylim=c(0,1), main='ROC Curve')
points(1-spec, sens)
abline(0,1)
image(a)
countSignificant(a, 0.05, by='FDR')
countSignificant(a, 0.99, by='Emp')
