classifytestsF: Genewise Nested F-Tests
In limma: Linear Models for Microarray Data


For each gene, classify a series of related t-statistics as significantly up or down using nested F-tests.

classifyTestsF(object, cor.matrix = NULL, df = Inf, p.value = 0.01, fstat.only = FALSE)

`object`	numeric matrix of t-statistics or an `MArrayLM` object from which the t-statistics may be extracted.
`cor.matrix`	covariance matrix of each row of t-statistics. Will be extracted automatically from an `MArrayLM` object but otherwise defaults to the identity matrix.
`df`	numeric vector giving the degrees of freedom for the t-statistics. May have length 1 or length equal to the number of rows of `object`. Will be extracted automatically from an `MArrayLM` object but otherwise defaults to `Inf`.
`p.value`	numeric value between 0 and 1 giving the desired size of the test.
`fstat.only`	logical, if `TRUE` then return the overall F-statistic as for `FStat` instead of classifying the test results.

classifyTestsF implements the "nestedF" multiple testing option offered by decideTests. Users should generally use decideTests rather than calling classifyTestsF directly because, by itself, classifyTestsF does not incorporate any multiple testing adjustment across genes. Instead it simply tests across contrasts for each gene individually.

classifyTestsF uses a nested F-test approach giving particular attention to correctly classifying genes that have two or more significant t-statistics, i.e., which are differentially expressed in two or more conditions. For each row of t-statistics, the overall F-statistic is constructed from the t-statistics as for FStat. At least one contrast will be classified as significant if and only if the overall F-statistic is significant. If the overall F-statistic is significant, then the function makes a best choice as to which t-statistics contributed to this result. The methodology is based on the principle that any t-statistic should be called significant if the F-test is still significant for that row when all the larger t-statistics are set to the same absolute size as the t-statistic in question.
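The step-down rule described above can be sketched in base R for the special case where the t-statistics in each row are uncorrelated (an identity cor.matrix). The function name nested_f_sketch is hypothetical and this is not limma's own code. It only illustrates the principle under that simplifying assumption, where the overall F-statistic reduces to the mean of the squared t-statistics.

```r
# Hypothetical helper, not part of limma: nested F classification assuming
# the t-statistics in each row are uncorrelated (identity correlation matrix).
nested_f_sketch <- function(tstat, df = Inf, p.value = 0.01) {
  tstat <- as.matrix(tstat)
  ntests <- ncol(tstat)
  # Critical value for the overall F-statistic on (ntests, df) degrees of freedom
  crit <- rep_len(qf(p.value, ntests, df, lower.tail = FALSE), nrow(tstat))
  result <- matrix(0, nrow(tstat), ntests, dimnames = dimnames(tstat))
  for (i in seq_len(nrow(tstat))) {
    x <- tstat[i, ]
    if (anyNA(x)) {
      result[i, ] <- NA
      next
    }
    # With an identity correlation matrix, F is the mean squared t-statistic
    if (mean(x^2) <= crit[i]) next
    ord <- order(abs(x), decreasing = TRUE)
    # The largest t-statistic is always called when the overall F is significant
    result[i, ord[1]] <- sign(x[ord[1]])
    for (j in seq_along(ord)[-1]) {
      # Shrink all larger t-statistics to the absolute size of the current one
      bigger <- ord[seq_len(j - 1)]
      x[bigger] <- sign(x[bigger]) * abs(x[ord[j]])
      # Stop at the first t-statistic for which the F-test is no longer significant
      if (mean(x^2) <= crit[i]) break
      result[i, ord[j]] <- sign(x[ord[j]])
    }
  }
  result
}

tstat <- matrix(c(0,10,0, 0,5,0, -4,-4,4, 2,2,2), 4, 3, byrow = TRUE)
res <- nested_f_sketch(tstat, df = 20)
```

For this small matrix, the last row has an overall F-statistic of 4 on 3 and 20 degrees of freedom, which is not significant at 0.01, so none of its contrasts is called.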

Compared to conventional multiple testing methods, the nested F-test approach achieves better consistency between related contrasts. (For example, if B is judged to be different from C, then at least one of B or C should be different from A.) The approach was first used by Michaud et al. (2008). The nested F-test approach provides weak control of the family-wise error rate, i.e., it correctly controls the type I error rate of calling any contrast as significant if all the null hypotheses are true. In other words, it provides error rate control at the overall F-test level but does not provide strict error rate control at the individual contrast level.
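Weak control can be illustrated with a small base R simulation. This is a sketch under stated assumptions: the statistics are independent standard normal (so every null hypothesis is true), the correlation matrix is the identity, and df is infinite. All variable names are illustrative. Because at least one contrast is called significant exactly when the overall F-test is significant, the proportion of rows with any call should sit close to the nominal p.value.

```r
set.seed(1)
ngenes <- 100000
ntests <- 3
# Independent standard normal statistics: every null hypothesis is true
tnull <- matrix(rnorm(ngenes * ntests), ngenes, ntests)
# Overall F-statistic under an identity correlation matrix
fnull <- rowMeans(tnull^2)
# P-values on (ntests, Inf) degrees of freedom
pnull <- pf(fnull, df1 = ntests, df2 = Inf, lower.tail = FALSE)
# Proportion of rows in which at least one contrast would be called significant
any_call_rate <- mean(pnull < 0.01)
any_call_rate
```

This says nothing about error rates for individual contrasts once some null hypotheses are false, which is the case the nested F-test does not strictly control.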

Usually object is a limma linear model fitted object, from which a matrix of t-statistics can be extracted, but it can also be a numeric matrix of t-statistics. In either case, rows correspond to genes and columns to coefficients or contrasts. If object is a matrix, then it may be necessary to supply values for cor.matrix and df. The cor.matrix is the same as the correlation matrix of the coefficients from which the t-statistics were calculated and df is the degrees of freedom of the t-statistics. All statistics for the same gene must have the same degrees of freedom.
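To show the role of cor.matrix when object is a plain matrix, the following base R sketch computes an overall F-statistic per row as the quadratic form of the t-statistics in the inverse correlation matrix, divided by the number of contrasts. The helper name overall_f_sketch and the example correlation matrix are made up for illustration, and the sketch assumes a full-rank correlation matrix. limma's own computation may differ in detail, for example in how it handles rank-deficient matrices.

```r
# Hypothetical helper, not part of limma: overall F-statistic for each row,
# assuming cor.matrix is a full-rank correlation matrix of the t-statistics.
overall_f_sketch <- function(tstat, cor.matrix = diag(ncol(tstat))) {
  tstat <- as.matrix(tstat)
  r <- ncol(tstat)
  # Quadratic form t' R^{-1} t for each row, divided by the number of contrasts
  rowSums((tstat %*% solve(cor.matrix)) * tstat) / r
}

tstat <- matrix(c(0,10,0, 0,5,0, -4,-4,4, 2,2,2), 4, 3, byrow = TRUE)

# Identity correlation: F is the mean of the squared t-statistics
f_identity <- overall_f_sketch(tstat)

# Made-up correlation matrix with equal pairwise correlation 0.5
R <- matrix(0.5, 3, 3)
diag(R) <- 1
f_correlated <- overall_f_sketch(tstat, R)
```

Positively correlated contrasts that all point the same way carry less independent evidence, so the last row (2, 2, 2) gets a smaller F-statistic under R than under the identity matrix.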

If fstat.only=TRUE, then classifyTestsF just returns the vector of overall F-statistics for each gene.

If fstat.only=FALSE, then an object of class TestResults is returned. This is essentially a numeric matrix with elements -1, 0 or 1 depending on whether each t-statistic is classified as significantly negative, not significant or significantly positive respectively.

If fstat.only=TRUE, then a numeric vector of F-statistics is returned with attributes df1 and df2 giving the corresponding degrees of freedom.

Gordon Smyth

Michaud, J, Simpson, KM, Escher, R, Buchet-Poyau, K, Beissbarth, T, Carmichael, C, Ritchie, ME, Schutz, F, Cannon, P, Liu, M, Shen, X, Ito, Y, Raskind, WH, Horwitz, MS, Osato, M, Turner, DR, Speed, TP, Kavallaris, M, Smyth, GK, and Scott, HS (2008). Integrative analysis of RUNX1 downstream pathways and target genes. BMC Genomics 9, 363.

An overview of multiple testing functions is given in 08.Tests.

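# Four genes by three contrasts of t-statistics
TStat <- matrix(c(0,10,0, 0,5,0, -4,-4,4, 2,2,2), 4, 3, byrow=TRUE)
colnames(TStat) <- paste0("Contrast",1:3)
rownames(TStat) <- paste0("Gene",1:4)
# Classify each t-statistic as -1, 0 or 1 using nested F-tests
classifyTestsF(TStat, df=20)
# Return only the overall F-statistics and convert them to p-values
FStat <- classifyTestsF(TStat, df=20, fstat.only=TRUE)
P <- pf(FStat, df1=attr(FStat,"df1"), df2=attr(FStat,"df2"), lower.tail=FALSE)
data.frame(F.Statistic=FStat,P.Value=P)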

TestResults matrix
      Contrast1 Contrast2 Contrast3
Gene1         0         1         0
Gene2         0         1         0
Gene3        -1        -1         1
Gene4         0         0         0
      F.Statistic      P.Value
Gene1   33.333333 5.636548e-08
Gene2    8.333333 8.586213e-04
Gene3   16.000000 1.533944e-05
Gene4    4.000000 2.207700e-02