FDR computes the false discovery rate for comparing gene expression
between two groups of subjects when the distributions of the test statistic
under the null and alternative hypotheses are both mixtures of t-distributions.
CDF and CDFmix calculate these mixtures.
x | vector of quantiles (two-sample t-statistics)
n, n1, n2 | vector of sample sizes (as subjects per group)
pmix | the proportion of non-differentially expressed genes
D0 | vector of effect sizes for the null distribution
p0 | vector of mixing proportions for D0
D1 | vector of effect sizes for the alternative distribution
p1 | vector of mixing proportions for D1
D, p | generic vectors of effect sizes and mixing proportions as above
sigma | the standard deviation
These functions are designed for a simple experimental setup, where we wish to
compare gene expression between two groups of subjects of size n1 and n2 for an
unspecified number of genes, using an equal-variance t-statistic. 100*pmix % of
the genes are assumed not to be differentially expressed. The corresponding
t-statistics follow a mixture of t-distributions; this is more general than the
usual central t-distribution, because we may want to include genes with
biologically small effects under the null hypothesis (Pawitan et al., 2005).
The other 100*(1-pmix) % of the genes are assumed to be differentially
expressed; their t-statistics also follow a mixture of t-distributions.
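The two-component setup described above can be written out as follows. This is a sketch in notation introduced here for illustration ($F_0$, $F_1$, $P_0$, $P_1$ are not names used by the functions), and it assumes a two-sided rejection region $|T| > x$, which this page does not state explicitly.

```latex
% Overall distribution of the t-statistic: null and alternative parts,
% each of which is itself a mixture of t-distributions.
\[
  F(t) \;=\; \mathrm{pmix}\, F_0(t) \;+\; (1-\mathrm{pmix})\, F_1(t)
\]
% False discovery rate when all genes with |T| > x are declared significant
% (two-sided rejection region assumed).
\[
  \mathrm{FDR}(x) \;=\;
  \frac{\mathrm{pmix}\; P_0(|T| > x)}
       {\mathrm{pmix}\; P_0(|T| > x) \;+\; (1-\mathrm{pmix})\; P_1(|T| > x)}
\]
```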
The mixture proportions of the t-distributions under the null and alternative
hypotheses are specified via p0 and p1, respectively. The individual
t-distributions are specified via the means D0 and D1 and the standard
deviation sigma of the underlying data (instead of the mathematically more
obvious, but less intuitive, noncentrality parameters). As the underlying data
are the log-transformed expression values, D0 and D1 can be interpreted as
average log-fold changes between conditions, measured in units of sigma. See
Examples.
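To make the link between effect sizes and noncentrality parameters concrete, here is a minimal sketch based on the standard result for the equal-variance two-sample t-statistic (df = n1 + n2 - 2). The helper name ncp_two_sample is hypothetical and not one of the documented functions; whether the functions use exactly this expression internally is an assumption.

```r
# Hypothetical helper, for illustration only (not part of the documented API).
# Standard two-sample theory: a mean difference D with common standard
# deviation sigma gives a noncentral t-statistic with this noncentrality.
ncp_two_sample <- function(D, sigma, n1, n2) {
  D / (sigma * sqrt(1 / n1 + 1 / n2))
}

# Two-fold down- and up-regulation on the log2 scale, 10 subjects per group
ncp_example <- ncp_two_sample(D = c(-1, 1), sigma = 1, n1 = 10, n2 = 10)
print(ncp_example)
```

With n1 = n2 = 10 and sigma = 1, an effect of D = 1 corresponds to a noncentrality parameter of sqrt(5), about 2.24.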
CDF computes the cumulative distribution function for a mixture of
t-distributions based on means D and standard deviation sigma with mixture
proportions p. This function is the workhorse for CDFmix.
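The mixture calculation can be sketched with base R's pt alone. The functions mix_tail and fdr_sketch below are hypothetical re-implementations written for illustration, not the package source. They assume a two-sided rejection region |T| > x and the standard noncentrality parameter D / (sigma * sqrt(1/n1 + 1/n2)), so they work with tail probabilities rather than the cumulative distribution function itself.

```r
# Hypothetical illustration; not the package's own code.
# Tail probability P(|T| > x) under a mixture of noncentral t-distributions.
mix_tail <- function(x, n1, n2, D, p, sigma = 1) {
  stopifnot(length(D) == length(p), abs(sum(p) - 1) < 1e-8)
  df  <- n1 + n2 - 2
  ncp <- D / (sigma * sqrt(1 / n1 + 1 / n2))
  out <- numeric(length(x))
  for (i in seq_along(D)) {
    # Upper tail beyond x plus lower tail below -x, weighted by p[i]
    out <- out + p[i] * (pt(x, df, ncp = ncp[i], lower.tail = FALSE) +
                         pt(-x, df, ncp = ncp[i]))
  }
  out
}

# FDR as the null share of all genes with |T| > x (two-sided, assumed)
fdr_sketch <- function(x, n1, n2, pmix, D0, p0, D1, p1, sigma = 1) {
  null_part <- pmix * mix_tail(x, n1, n2, D0, p0, sigma)
  alt_part  <- (1 - pmix) * mix_tail(x, n1, n2, D1, p1, sigma)
  null_part / (null_part + alt_part)
}

fdr_vals <- fdr_sketch(1:6, n1 = 10, n2 = 10, pmix = 0.90, D0 = 0, p0 = 1,
                       D1 = c(-1, 1), p1 = c(0.5, 0.5), sigma = 1)
print(fdr_vals)
```

The call mirrors the first FDR example on this page, so its results can be compared against the output listed there. pt may emit precision warnings for noncentral arguments in the far tails.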
Note that the base functions (FDR, CDFmix, CDF) assume two groups of
experimental units; the .paired functions provide the same functionality for
one group of paired observations.
The distribution functions call pt for computation; correspondingly, the
quantiles x and all arguments that define degrees of freedom and noncentrality
parameters (n1, n2, D0, D1, sigma) can be vectors, and will be recycled as
necessary.
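The recycling behaviour comes from pt itself and can be seen in isolation with a small base R sketch. The values below are arbitrary and chosen only for illustration.

```r
# pt recycles shorter arguments to the length of the longest one.
x   <- c(1, 2, 3, 4)
ncp <- c(0, 2)                       # shorter than x, so it is recycled

recycled <- pt(x, df = 18, ncp = ncp)
explicit <- pt(x, df = 18, ncp = c(0, 2, 0, 2))

# Both calls evaluate the same four (quantile, ncp) pairs
print(all.equal(recycled, explicit))
```

Because recycling pairs elements by position, a vector-valued argument should normally have length 1 or the same length as x, unless the repeating pattern is intended.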
The appropriate vector of FDRs or probabilities.
Y. Pawitan and A. Ploner
Pawitan Y, Michiels S, Koscielny S, Gusnanto A, Ploner A. (2005) False Discovery Rate, Sensitivity and Sample Size for Microarray Studies. Bioinformatics, 21, 3017-3024.
# FDR for H0: 'log fold change is zero'
# vs. H1: 'log fold change is -1 or 1'
# (i.e. two-fold up- or down-regulation)
FDR(1:6, n1=10, n2=10, pmix=0.90, D0=0, p0=1,
    D1=c(-1,1), p1=c(0.5, 0.5), sigma=1)

# Include small log fold changes in the H0
# Naturally, this increases the FDR
FDR(1:6, n1=10, n2=10, pmix=0.90, D0=c(-0.25, 0, 0.25), p0=c(1/3, 1/3, 1/3),
    D1=c(-1,1), p1=c(0.5, 0.5), sigma=1)

# Consider an asymmetric alternative
# 10 percent of the regulated genes are assumed to be four-fold upregulated
FDR(1:6, n1=10, n2=10, pmix=0.90, D0=0, p0=1,
    D1=c(-1,1,2), p1=c(0.45, 0.45, 0.1), sigma=1)
[1] 0.76934483 0.47742012 0.21062498 0.09006926 0.04493146 0.02653787
[1] 0.79189874 0.56203058 0.31679204 0.17039513 0.10052234 0.06679318
[1] 0.767207012 0.461640657 0.175697695 0.052703880 0.015690880 0.005424564
Warning messages:
1: In pt(x, df = n1 + n2 - 2, ncp = ncp) :
full precision may not have been achieved in 'pnt{final}'
2: In pt(x, df = n1 + n2 - 2, ncp = ncp) :
full precision may not have been achieved in 'pnt{final}'
3: In pt(x, df = n1 + n2 - 2, ncp = ncp) :
full precision may not have been achieved in 'pnt{final}'
4: In pt(x, df = n1 + n2 - 2, ncp = ncp) :
full precision may not have been achieved in 'pnt{final}'