Computes and plots the operating characteristics for a two-group microarray experiment based on a theoretical model. The false discovery rate (FDR) is plotted against the cutoff level on the t-statistic. Optionally, curves for the classical significance level and sensitivity can be added. Different curves for different proportions of non-differentially expressed genes can be compared in the same plot, and the sample size per group can be varied between plots.
n, n1, n2 
number of samples per group, by default equal and specified via n
p0 
the proportion of not differentially expressed genes, may be vector valued 
sigma 
the standard deviation for the log expression values 
D 
assumed average log fold change (in units of sigma)
F0 
the distribution of the log2 expression values under the null hypothesis; by default, this is normal with mean zero and standard deviation sigma
F1 
the distribution of the log2 expression values under the alternative hypothesis; by default, this is an equal mixture of two normals with means D and -D
paired 
logical value indicating whether two distinct groups of observations or one group of paired observations are studied. 
plot 
logical value indicating whether the results should be plotted. 
local.show 
logical value indicating whether to show local or global false discovery rate (default: global). 
alpha.show 
logical value indicating whether to show the classical significance level for testing one hypothesis as a function of the cutoff level. 
sensitivity.show 
logical value indicating whether to show the classical sensitivity for testing one hypothesis as a function of the cutoff level. 
nplot 
number of points that are evaluated for the curves 
xlim 
the usual limits on the horizontal axis 
ylim 
the usual limits on the vertical axis 
main 
the main title of the plot 
legend.show 
logical value indicating whether to show a legend for the different types of curves in the plot. 
... 
the usual graphical parameters, passed to the underlying plotting function
This function plots the FDR as a function of the cutoff level when comparing the expression of multiple genes between two groups of subjects. We study a gene selection mechanism that declares all genes to be differentially expressed whose t-statistics have an absolute value greater than a specified cutoff value. The comparison is based on a two-sample t-statistic for equal variances, for either paired or unpaired observations.
The underlying model assumes that a proportion p0 of genes are not differentially expressed between groups, and that 1 - p0 are. The logarithmized gene expression values are assumed to be generated by mixtures of normal distributions. Both null and alternative hypotheses are specified through the means of the respective mixture components; these means can be interpreted as average log2 fold changes in units of the standard deviation sigma.
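The model and the selection rule can be tried out by simulation. The following is a minimal sketch in base R, not part of the package: all variable names and the chosen values (5000 genes, cutoff 3) are illustrative assumptions. It draws data under the default model, computes equal-variance two-sample t-statistics, and reports the proportion of selected genes that are in fact unregulated.

```r
set.seed(1)

# Illustrative settings (assumptions, not package defaults)
m      <- 5000   # number of genes
n      <- 10     # samples per group
p0     <- 0.95   # proportion of non-differentially expressed genes
D      <- 1      # log2 fold change in units of sigma
sigma  <- 1
cutoff <- 3      # genes with |t| > cutoff are declared regulated

# Regulated genes get a mean shift of +D or -D with equal probability
is_null <- runif(m) < p0
delta   <- ifelse(is_null, 0, sample(c(-D, D), m, replace = TRUE)) * sigma

# One row per gene, one column per sample
x <- matrix(rnorm(m * n, sd = sigma), m, n)
y <- matrix(rnorm(m * n, sd = sigma), m, n) + delta

# Two-sample t-statistic with pooled (equal) variance, computed row-wise
pooled_var <- (rowSums((x - rowMeans(x))^2) +
               rowSums((y - rowMeans(y))^2)) / (2 * n - 2)
tstat <- (rowMeans(y) - rowMeans(x)) / sqrt(pooled_var * 2 / n)

# Selection rule and the resulting empirical false discovery proportion
selected      <- abs(tstat) > cutoff
empirical_fdp <- sum(selected & is_null) / max(1, sum(selected))
empirical_fdp
```

Repeating the simulation for a range of cutoffs gives an empirical counterpart to the theoretical curve that the function plots.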
Note that the model does not assume that all genes have the same standard deviation sigma, only that the mean log2 fold change for all regulated genes is proportional to their individual variability (standard deviation). sigma generally does not need to be specified explicitly and can be left at its default value of one, so that D can be interpreted directly as the log2 fold change between groups.
The default null distribution of the log2 expression values is a single normal distribution with mean zero (and standard deviation sigma); the default alternative distribution is an equal mixture of two normals with means D and -D (and again standard deviation sigma). However, general mixtures of normals can be specified for both null and alternative distribution through F0 and F1, respectively; both are lists with two elements:
D
is the vector of means (i.e. log2 fold changes),
p
is the vector of mixing proportions for the means.
If present, p must be the same length as D; its elements do not need to be normalized, i.e. sum to one; if absent, equal mixing is assumed, see Examples. A wide (mixture) null hypothesis, or an empirical null hypothesis as outlined by Efron (2004), can be used if genes with log fold changes close to zero are thought to be of no biological interest, and are counted as effectively not regulated. Similarly, the alternative hypothesis can be any mixture of large and small effects, symmetric or nonsymmetric, depending on the expected regulation patterns, see Examples.
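The rules for p can be illustrated with a small base R sketch. The helper normalize_mixture is hypothetical and written only for this illustration; it is not the package's internal code, and the means used are illustrative values.

```r
# Hypothetical helper: fill in and normalize the mixing proportions of a
# mixture specification given as list(D = means, p = proportions).
normalize_mixture <- function(mix) {
  if (is.null(mix$p)) {
    mix$p <- rep(1, length(mix$D))      # absent p: equal mixing
  }
  stopifnot(length(mix$p) == length(mix$D))
  mix$p <- mix$p / sum(mix$p)           # rescale so the proportions sum to one
  mix
}

# A wide null: three components around zero, equal weights
wide_null <- normalize_mixture(list(D = c(-0.25, 0, 0.25)))
wide_null$p

# An asymmetric alternative with unnormalized weights 1:2:2
uneven <- normalize_mixture(list(D = c(-1, 0.5, 2), p = c(1, 2, 2)))
uneven$p
```

Unnormalized weights are convenient when the mixture is thought of in terms of relative frequencies, such as "twice as many small effects as large ones".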
As a consequence, both the null distribution of the t-statistics (for the unregulated genes) and their alternative distribution (for the regulated genes) are mixtures of (generally noncentral) t-distributions, see FDR.
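As a rough illustration of how such mixtures lead to an FDR value, the sketch below computes the global FDR at a given cutoff in base R. The helpers tail_prob and fdr_sketch are hypothetical names and not the package's implementation. The sketch assumes unpaired groups, an equal-variance t-test with n1 + n2 - 2 degrees of freedom, and a noncentrality parameter of D / sqrt(1/n1 + 1/n2) for a component with mean D (in units of sigma).

```r
# Hypothetical helper: P(|T| > cutoff) when T follows a mixture of
# noncentral t-distributions, one component per mean in D.
tail_prob <- function(cutoff, D, p, n1, n2) {
  dof <- n1 + n2 - 2                   # equal-variance two-sample t-test
  ncp <- D / sqrt(1 / n1 + 1 / n2)     # D is in units of sigma
  p   <- p / sum(p)                    # normalize mixing proportions
  sapply(cutoff, function(cc) {
    upper <- pt(cc, dof, ncp = ncp, lower.tail = FALSE)
    lower <- pt(-cc, dof, ncp = ncp)
    sum(p * (upper + lower))           # weighted two-sided tail probability
  })
}

# Hypothetical helper: global FDR among genes with |t| > cutoff
fdr_sketch <- function(cutoff, p0 = 0.95, D = 1, n1 = 10, n2 = n1,
                       F0 = list(D = 0, p = 1),
                       F1 = list(D = c(-D, D), p = c(1, 1))) {
  null_tail <- tail_prob(cutoff, F0$D, F0$p, n1, n2)
  alt_tail  <- tail_prob(cutoff, F1$D, F1$p, n1, n2)
  p0 * null_tail / (p0 * null_tail + (1 - p0) * alt_tail)
}

# FDR at a few cutoffs for the default-style model
fdr_sketch(c(1, 2, 3, 4))
```

With D = 0 the alternative coincides with the null, so the sketch returns p0 at every cutoff, which is a convenient sanity check.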
Sample size n and standard deviation sigma are atomic values, but multiple p0 can be specified, resulting in multiple curves. Additionally, the usual significance level and sensitivity for a classical one-hypothesis test can be displayed.
This function returns invisibly a data frame with nplot rows whose columns contain the information for the individual curves. The number of columns and their names will depend on the number and value of the p0 specified, and whether alpha and sensitivity are displayed. Additionally, the returned data frame has an attribute param, which is a list with all the non-plotting arguments to the function.
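The general shape of such a return value can be mimicked in base R. The sketch below is only an illustration under assumptions: the column names (cutoff, FDR_0.9, and so on), the cutoff grid, and the contents of the param list are invented here and will differ from what the function actually returns.

```r
# Two-sided tail probability of a (possibly noncentral) t-distribution
tail2 <- function(cutoff, dof, ncp) {
  pt(cutoff, dof, ncp = ncp, lower.tail = FALSE) + pt(-cutoff, dof, ncp = ncp)
}

# Illustrative settings (assumptions)
n <- 10; D <- 1; p0 <- c(0.90, 0.95, 0.99); nplot <- 100
dof    <- 2 * n - 2
ncp    <- D / sqrt(2 / n)
cutoff <- seq(0, 6, length.out = nplot)

alpha <- tail2(cutoff, dof, 0)      # classical significance level
sens  <- tail2(cutoff, dof, ncp)    # sensitivity; an equal mixture of -D and D
                                    # has the same two-sided tail as D alone

# One row per cutoff, one FDR column per value of p0
curves <- data.frame(cutoff = cutoff)
for (p in p0) {
  curves[[paste0("FDR_", p)]] <- p * alpha / (p * alpha + (1 - p) * sens)
}
curves$alpha       <- alpha
curves$sensitivity <- sens

# Keep the settings that generated the curves alongside the data
attr(curves, "param") <- list(n = n, p0 = p0, D = D)

dim(curves)
colnames(curves)
```

Storing the generating settings as an attribute keeps the data frame itself rectangular while still making the result self-describing.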
Both the curve labels and the legend may be squashed if the plotting device is too small. Increasing the size of the device and replotting should improve readability.
Y. Pawitan and A. Ploner
Pawitan Y, Michiels S, Koscielny S, Gusnanto A, Ploner A. (2005) False Discovery Rate, Sensitivity and Sample Size for Microarray Studies. Bioinformatics, 21, 3017-3024.
Efron, B. (2004) Large-Scale Simultaneous Hypothesis Testing: The Choice of a Null Hypothesis. JASA, 99, 96-104.
FDR, samplesize, EOC
library(OCplus)  # TOC is provided by the OCplus package

# Default null and alternative distributions, assuming different proportions
# of regulated genes
TOC(p0=c(0.90, 0.95, 0.99), legend.show=TRUE)
# The effect of sample size and effect size
par(mfrow=c(2,2))
TOC(p0=c(0.90, 0.95, 0.99), n=5, D=1)
TOC(p0=c(0.90, 0.95, 0.99), n=30, D=1)
TOC(p0=c(0.90, 0.95, 0.99), n=5, D=2)
TOC(p0=c(0.90, 0.95, 0.99), n=30, D=2)
# A wide null distribution that allows genes of small effect to be disregarded;
# an unspecified p means equal mixing proportions
ret = TOC(F0=list(D=c(-0.25,0,0.25)), main="Wide F0")
attr(ret,"param")$F0 # the null hypothesis
# An extended (and unsymmetric) alternative
ret = TOC(F1=list(D=c(-2,-1,1), p=c(1,2,2)), p0=0.95, main="Unsymmetric F1")
attr(ret,"param")$F1 # F1$p is normalized
# Unequal sample sizes
TOC(n1=10, n2=30)
# Curves for a paired t-test
TOC(paired=TRUE)
# The output contains all the x- and y-coordinates
ret = TOC(p0=c(0.90, 0.95, 0.99), main="Default settings")
dim(ret)
colnames(ret)
ret[1:10,]
# Additionally, the list of arguments that determine the experiment
attr(ret,"param")
