EOC computes and optionally plots the estimated operating characteristics for data from a microarray experiment with two groups of subjects. The false discovery rate (FDR) is estimated based on random permutations of the data and plotted against the cutoff level on the t-statistic; a curve for the classical sensitivity can be added. Different curves for different proportions of non-differentially expressed genes can be compared in the same plot, and the sample size per group can be varied between plots.
FDRp is the function that does the underlying hard work and requires package multtest.
xdat | the matrix of expression values, with genes as rows and samples as columns
grp | a grouping variable giving the class membership of each sample, i.e. each column in
p0 | if supplied, an estimate for the proportion of non-differentially expressed genes; if not supplied, the routine will estimate it, see Details.
paired | logical value indicating whether this is an independent sample situation (default) or a paired sample situation. Note that paired samples need to follow each other in the data matrix (as in 010101...) when paired=TRUE.
nperm | number of permutations for establishing the null distribution of the t-statistic
test | the type of test to use, see
seed | the random seed from which the permutations are started
plot | logical value indicating whether to do the plot
... | graphical parameters, passed to
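The column arrangement expected for paired data can be produced by interleaving the columns. The following is a minimal base R sketch, assuming the data arrive as two blocks (all first members of the pairs, then all second members, in matching order); the names xblock, interleave and xpaired are hypothetical and not part of the package.

```r
# Hypothetical input: 4 genes, 3 pairs, stored in block order A1 A2 A3 B1 B2 B3
set.seed(1)
xblock <- matrix(rnorm(4 * 6), nrow = 4,
                 dimnames = list(NULL, c("A1", "A2", "A3", "B1", "B2", "B3")))
npairs <- ncol(xblock) / 2

# Column index 1 4 2 5 3 6: the two members of each pair become neighbours
interleave <- as.vector(rbind(seq_len(npairs), npairs + seq_len(npairs)))
xpaired <- xblock[, interleave]

# Grouping variable that alternates in step with the new column order
grp <- rep(c("Sample A", "Sample B"), npairs)

colnames(xpaired)
```

The resulting xpaired and grp follow the alternating layout used in the paired example of this page.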
EOC is the empirical counterpart of the function TOC. It estimates the FDR and sensitivity for a given data set of expression values measured on subjects in two groups. The FDR is estimated locally based on the empirical Bayes approach outlined by Efron et al., see References. FDRp implements the details of this method; this requires, among other things, the permutation distribution of the t-statistic, which is calculated via a call to function mt.teststat of package multtest. This explains why both functions fail on missing values in the expression data.
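The idea of a local FDR built from a permutation null can be illustrated in base R. This is only a sketch under stated assumptions and not the code used by FDRp: it uses an equal-variance two-sample t-statistic, histogram density estimates on common bins, and a fixed p0; the helper row_t and all variable names are hypothetical.

```r
set.seed(42)
ngenes <- 200; n <- 10
x <- matrix(rnorm(ngenes * 2 * n), nrow = ngenes)
x[1:20, 1:n] <- x[1:20, 1:n] + 2          # 10 percent regulated genes
grp <- rep(c(0, 1), each = n)

# Reject missing values up front, since the statistics below cannot handle them
stopifnot(!anyNA(x))

# Equal-variance two-sample t-statistic for every gene (row); assumed test type
row_t <- function(x, grp) {
  a <- x[, grp == 0, drop = FALSE]
  b <- x[, grp == 1, drop = FALSE]
  na <- ncol(a); nb <- ncol(b)
  sp2 <- ((na - 1) * apply(a, 1, var) + (nb - 1) * apply(b, 1, var)) / (na + nb - 2)
  (rowMeans(b) - rowMeans(a)) / sqrt(sp2 * (1 / na + 1 / nb))
}

t_obs <- row_t(x, grp)

# Null distribution: one column of statistics per random relabelling of samples
nperm <- 50
t_perm <- replicate(nperm, row_t(x, sample(grp)))

# Histogram densities of observed (f) and permuted (f0) statistics on common bins
breaks <- seq(min(t_obs, t_perm) - 1e-8, max(t_obs, t_perm) + 1e-8, length.out = 31)
f  <- hist(t_obs, breaks = breaks, plot = FALSE)$density
f0 <- hist(as.vector(t_perm), breaks = breaks, plot = FALSE)$density

# Local FDR per gene: p0 * f0(t) / f(t), capped at 1; p0 is simply assumed here
p0 <- 0.9
bin <- cut(t_obs, breaks = breaks, labels = FALSE, include.lowest = TRUE)
fdr_local <- pmin(1, p0 * f0[bin] / f[bin])

head(round(fdr_local, 3))
```

Note that t_perm holds ngenes times nperm values at once, which mirrors the memory behaviour described for FDRp.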
Note that p0 is by default estimated from the data, as originally suggested by Efron et al., so as to make the ratio between the densities of the observed distribution of t-statistics and the permutation distribution smaller than 1; alternatively, the user can supply their own estimate of the proportion of non-differentially expressed genes in the data.
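One way to read this condition is as a bound: if the local FDR p0 * f0(t) / f(t) may not exceed 1 for any t, then p0 can be at most the minimum of f(t) / f0(t), where f is the density of the observed t-statistics and f0 the permutation density. The sketch below illustrates that bound with known normal densities in place of estimated ones; it is an assumption-laden illustration, not the estimator as coded in the package.

```r
# Known mixture: 90 percent null genes, 10 percent shifted by -3 or +3
p0_true <- 0.9
tgrid <- seq(-4, 4, by = 0.1)
f0 <- dnorm(tgrid)                                           # null density
f1 <- 0.5 * dnorm(tgrid, mean = -3) + 0.5 * dnorm(tgrid, mean = 3)
f  <- p0_true * f0 + (1 - p0_true) * f1                      # observed density

# Largest p0 that keeps p0 * f0 / f at or below 1 everywhere on the grid
p0_hat <- min(f / f0)
p0_hat
```

The bound is slightly conservative: it can only overestimate the true proportion, because the alternative density never vanishes completely.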
Note also that FDRp keeps all permutations in memory during computations. For a large number of genes, this will limit the number of possible permutations.
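A rough feel for this limit comes from simple arithmetic. The sketch below assumes one double-precision value (8 bytes) per gene and permutation and ignores any additional overhead; the function perm_memory_mb is hypothetical and not part of the package.

```r
# Approximate memory (in MB) for storing one t-statistic per gene and permutation
perm_memory_mb <- function(ngenes, nperm) ngenes * nperm * 8 / 2^20

perm_memory_mb(1000, 25)      # small example: about 0.19 MB
perm_memory_mb(50000, 1000)   # large array, many permutations: about 381 MB
```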
For EOC, an object of class FDR.result, which inherits from class data.frame. The three columns list for each gene its t-statistic, the estimated FDR (two-sided), and the estimated sensitivity. Additionally, the object carries an attribute param, which is a list with four entries: p0, the assumed proportion of non-differentially expressed genes used in calculating the FDR; p0.est, a logical value indicating whether p0 was estimated or user-supplied; statistic, which indicates how the t-statistic was computed, i.e. how its sign should be interpreted in terms of relative over- or under-expression; and paired, a logical flag indicating whether a paired t-statistic was used.
FDRp returns a list with essentially the same elements, plus the values of the observed and permuted distributions of the t-statistics for each gene.
Both the curve labels and the legend may be squashed if the plotting device is too small. Increasing the size of the device and re-plotting should improve readability.
Y. Pawitan and A. Ploner
Pawitan Y, Michiels S, Koscielny S, Gusnanto A, Ploner A (2005) False Discovery Rate, Sensitivity and Sample Size for Microarray Studies. Bioinformatics, 21, 3017-3024.
Efron B, Tibshirani R, Storey JD, Tusher V (2001) Empirical Bayes Analysis of a Microarray Experiment. JASA, 96(456), 1151-1160.
plot.FDR.result, OCshow, mt.teststat
# We simulate a small example with 5 percent regulated genes and
# a rather large effect size
set.seed(2003)
xdat = matrix(rnorm(50000), nrow=1000)
xdat[1:25, 1:25] = xdat[1:25, 1:25] - 2
xdat[26:50, 1:25] = xdat[26:50, 1:25] + 2
grp = rep(c("Sample A","Sample B"), c(25,25))
# The default, with legend
ret = EOC(xdat, grp, legend=TRUE)
# Look at the results: yes
ret[1:10,]
which(ret$FDR<0.05)
# Extra information
attr(ret,"param")
# Run the same data with different permutations: fairly stable, but with
# different p0
ret = EOC(xdat, grp, seed=2000)
which(ret$FDR<0.07)
# Misspecify the p0: not too bad here
ret = EOC(xdat, grp, p0=0.99)
which(ret$FDR<0.01)
# We simulate data in a paired setting
# Note the arrangement of the columns
set.seed(2004)
xdat = matrix(rnorm(50000), nrow=1000)
ndx1 = seq(1,50, by=2)
xdat[1:25, ndx1] = xdat[1:25, ndx1] - 2
xdat[26:50, ndx1] = xdat[26:50, ndx1] + 2
grp = rep(c("Sample A","Sample B"), 25)
ret = EOC(xdat, grp, paired=TRUE)
which(ret$FDR<0.05)