multtest: Multiple testing correction for the Global Test

In jellegoeman/globaltest: Testing Groups of Covariates/Features for Association with a Response Variable, with Applications to Gene Set Testing

Description Usage Arguments Details Value Note Author(s) References See Also Examples

A collection of multiple testing procedures for the Global Test: the focus level procedure of Goeman and Mansmann for graph-structured hypotheses, and the inheritance procedure based on Meinshausen for tree-structured hypotheses.

# The focus level method:
focusLevel (test, sets, focus, ancestors, offspring,
           stop = 1, atoms = TRUE, trace)

findFocus (sets, ancestors, offspring, maxsize = 10, atoms = TRUE)


# The inheritance method:
inheritance (test, sets, weights, ancestors, offspring, Shaffer,
            homogeneous = TRUE, trace)


# Utilities for focus level and inheritance method:
leafNodes (object, alpha=0.05, type = c("focuslevel","inheritance"))

draw (object, alpha = 0.05, type = c("focuslevel","inheritance"),
      names=FALSE, sign.only = FALSE, interactive = FALSE)

`object`	A `gt.object`, usually one in which more than one test was performed.
`test`	Either a `function` or a `gt.object`. If a function, it should take a vector of covariate labels as its argument and return a (raw) p-value. See the examples below. If a `gt.object`, the call to `gt` that created it must have had all the covariates of `sets` (below) in its `alternative` argument.
`sets`	A named `list` representing the covariate sets of the hypotheses of interest, for which adjusted p-values are to be calculated. If it is missing but `test` is a `gt.object`, the `subsets` slot of that object will be used. If used in `inheritance`, `sets` describes a tree structure of hypotheses. In this case, `sets` may also be an object of class `hclust` or `dendrogram`.
`focus`	The focus level of the focus level method. Must be a subset of `names(sets)`. Represents the level of the graph at which the method is focused, i.e. has most power.
`ancestors`	An environment or list that maps each set in `sets` to all its ancestors, i.e. its proper supersets. If missing, `ancestors` is determined from the input of `offspring`, or, if that is also missing, from the input of `sets` (time-consuming).
`offspring`	An environment or list that maps each set in `sets` to all its offspring, i.e. its proper subsets. If missing, `offspring` is determined from the input of `ancestors`, or, if that is also missing, from the input of `sets` (time-consuming).
`stop`	Determines when to stop the algorithm. If `stop` is set to a value smaller than or equal to 1, the algorithm only calculates familywise error rate corrected p-values of at most `stop`. If `stop` is set to a value greater than 1, the algorithm stops when it has rejected at least `stop` hypotheses. If set to exactly 1, the algorithm calculates all familywise error rate corrected p-values. Corrected p-values that are not calculated are reported as `NA`.
`atoms`	If set to `TRUE`, the focus level algorithm partitions the offspring of each focus level set into the smallest possible building blocks, called atoms. Doing this often greatly accelerates computation, but sometimes at the cost of some power.
`trace`	If set to `TRUE`, reports progress information. The default is obtained from `gt.options()$trace`. Alternatively, setting `trace = 2` gives much more extensive output (`focusLevel` only).
`maxsize`	Parameter to choose the height of the focus level. The focus level sets are chosen in such a way that the number of tests that is to be done for each focus level set is at most `2^maxsize - 1`.
`alpha`	The alpha level of familywise error control for the significant subgraph.
`Shaffer`	If set to `TRUE`, applies the Shaffer improvement. If `Shaffer` is `NULL` and `test` is a `gt.object`, the procedure checks whether `Shaffer=TRUE` is valid and sets the value accordingly.
`weights`	Optional weights vector for the leaf nodes. If it is missing but `test` is a `gt.object`, the result of `weights(object)` will be used. In all other cases `weights` is set to be uniform among all leaf nodes.
`homogeneous`	If set to `TRUE`, redistributes the alpha of rejected leaf node hypotheses homogeneously over the hypotheses under test, rather than to closest related hypotheses.
`type`	Argument for specifying which multiple testing correction method should be used. Only relevant if both the inheritance and the focuslevel procedures were performed on the same set of test results.
`names`	If set to `TRUE`, draws the graph with node names rather than numbers.
`sign.only`	If set to `TRUE`, draws only the subgraph corresponding to the significant nodes. If `FALSE`, draws the full graph with the non-significant nodes grayed out.
`interactive`	If set to `TRUE`, creates an interactive graph in which the user can see the node label by clicking on the node.

Multiple testing correction becomes important if the Global Test is performed on many covariate subsets.

If the hypotheses are structured in such a way that many of the tested subsets are subsets of other sets, more powerful procedures can be applied that take advantage of this structure to gain power. Two methods are implemented in the globaltest package: the inheritance method for tree-structured hypotheses and the focusLevel method for general directed acyclic graphs. For simple multiple testing that does not use such structure, see p.adjust.
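As a minimal sketch of the unstructured case, base R's `p.adjust` can be applied to a plain vector of raw p-values. The p-values and set names below are made up for illustration and do not come from a Global Test.

```r
# Hypothetical raw p-values for three covariate sets (illustrative only)
raw_p <- c(set1 = 0.01, set2 = 0.04, set3 = 0.03)

# Holm adjustment: familywise error rate control under any dependence
holm_p <- p.adjust(raw_p, method = "holm")

# Benjamini-Hochberg adjustment: targets the false discovery rate instead
bh_p <- p.adjust(raw_p, method = "BH")

# Side-by-side comparison of raw and adjusted p-values
print(rbind(raw = raw_p, holm = holm_p, BH = bh_p))
```

Neither adjustment uses any subset or superset relation between the sets, which is the information the structured procedures exploit.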

The focusLevel procedure makes use of the fact that some sets are subsets or supersets of each other, as specified by the user in the offspring and ancestors arguments. Viewing the subset and superset structure as a graph, the procedure starts testing at a focus level: a subset of the nodes of the graph. If the procedure finds significance at this focus level, it proceeds to find significant subsets and supersets of the focus level sets. Like Holm's procedure, the focus level procedure is valid regardless of the correlation structure between the test statistics.
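The subset and superset structure can be made concrete with a small base R sketch. It derives, for a toy list of sets, the kind of mapping that the `ancestors` and `offspring` arguments describe. The helper `is_proper_subset` and the toy sets are hypothetical, and this is not necessarily how the package computes the mappings internally.

```r
# Toy covariate sets (illustrative only)
sets <- list(a = "A", b = "B", c = "C",
             abc = c("A", "B", "C"), cde = c("C", "D", "E"),
             all = LETTERS[1:5])

# TRUE if x is a proper subset of y (hypothetical helper)
is_proper_subset <- function(x, y) {
  all(x %in% y) && length(unique(x)) < length(unique(y))
}

# For each set, the names of all its proper supersets (its ancestors)
ancestors <- lapply(sets, function(s) {
  names(sets)[vapply(sets, function(other) is_proper_subset(s, other), logical(1))]
})

# For each set, the names of all its proper subsets (its offspring)
offspring <- lapply(sets, function(s) {
  names(sets)[vapply(sets, function(other) is_proper_subset(other, s), logical(1))]
})

ancestors[["c"]]    # "abc" "cde" "all"
offspring[["abc"]]  # "a" "b" "c"
```

This brute-force comparison of every pair of sets is quadratic in the number of sets, which suggests why supplying a precomputed mapping can save time on large collections.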

The focus level method requires the choice of a “focus level” in the graph. The findFocus function is a utility function for automatically choosing a focus level. It chooses a collection of focus level sets in such a way that the number of tests to be done for each focus level node is at most 2^maxsize - 1. In practice this usually means that each focus level node has at most maxsize leaf nodes as offspring. Choosing focus level nodes with too many offspring nodes may result in excessively long computation times.
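The bound of `2^maxsize - 1` given for the `maxsize` argument grows quickly. The sketch below assumes that each test below a focus level node corresponds to a non-empty union of its smallest building blocks, so that `k` blocks give `2^k - 1` possible tests; the helper `n_unions` is hypothetical and only illustrates the arithmetic.

```r
# Number of non-empty subsets of k building blocks: 2^k - 1 (hypothetical helper)
n_unions <- function(k) 2^k - 1

# Check the formula by brute-force enumeration for a small k
k <- 4
blocks <- LETTERS[1:k]
enumerated <- sum(vapply(1:k, function(m) ncol(combn(blocks, m)), integer(1)))
stopifnot(enumerated == n_unions(k))

# Upper bound on the number of tests per focus level node for several maxsize values
bounds <- data.frame(maxsize = c(5, 8, 10, 15),
                     max_tests = n_unions(c(5, 8, 10, 15)))
print(bounds)
```

With the default `maxsize = 10` the bound is 1023 tests per focus level node, while `maxsize = 15` already allows 32767.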

The inheritance method is an alternative method for calculating familywise error rate corrected p-values. Like the focus level method, inheritance also makes use of the structure of the tested sets to gain power. In this case, however, the graph is restricted to a tree, as can be obtained for example if the tested subsets are obtained from a hierarchical clustering. The inheritance procedure is used in the covariates function. Like Holm's method and the focus level method, the inheritance procedure makes no assumptions on the joint distribution of the test statistics.
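To see why a hierarchical clustering yields a tree of hypotheses, the following base R sketch lists the covariate set attached to every node of a dendrogram. The random data and the helper `collect_sets` are hypothetical; `inheritance` accepts the `hclust` object directly, so this only illustrates the nested sets the tree implies, not the package's own extraction.

```r
# Random data with six named covariates (illustrative only)
set.seed(1)
X <- matrix(rnorm(120), 20, 6, dimnames = list(NULL, LETTERS[1:6]))

# Cluster the covariates (columns) and convert the result to a dendrogram
dend <- as.dendrogram(hclust(dist(t(X))))

# Recursively collect the covariate labels below each node (hypothetical helper)
collect_sets <- function(node) {
  here <- list(sort(labels(node)))
  if (is.leaf(node)) return(here)
  children <- lapply(seq_along(node), function(i) collect_sets(node[[i]]))
  c(here, unlist(children, recursive = FALSE))
}

tree_sets <- collect_sets(dend)

# A binary tree with 6 leaves has 2 * 6 - 1 = 11 nodes
length(tree_sets)
```

The first element is the root, containing all covariates, and every other set is nested within its parent, which is the tree structure the inheritance procedure requires.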

The leafNodes function extracts the leaf nodes of the significant subgraph after a focus level or inheritance procedure has been performed. As this graph is defined by its leaf nodes, this is the most efficient summary of the test result. Only implemented for gt.object input.

The draw function draws the graph, displaying the significant nodes. It either draws the full graph with the non-significant nodes grayed out (sign.only = FALSE, the default), or it draws only the subgraph corresponding to the significant nodes (sign.only = TRUE).

See the vignette for extensive applications.

The function multtest returns an object of class gt.object with an appropriate column added to the test results matrix.

The focusLevel and inheritance functions return a gt.object if a gt.object was given as input; otherwise they return a matrix with a column of raw p-values and a column of corrected p-values.

The function leafNodes returns a gt.object.

findFocus returns a character vector.

In the graph terminology of the focus level method, ancestor means superset, and offspring means subset.

The validity of the focus level procedure depends on certain assumptions on the null hypothesis that is tested for each set. See the paper by Goeman and Mansmann (cited below) for the precise assumptions. Similar assumptions are necessary for the Shaffer improvement of the inheritance procedure.

Jelle Goeman: j.j.goeman@lumc.nl; Livio Finos

The methods used by multtest:

Holm (1979) A simple sequentially rejective multiple test procedure. Scandinavian Journal of Statistics 6: 65-70.

Benjamini and Hochberg (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society, Series B 57: 289-300.

Benjamini and Yekutieli (2001) The control of the false discovery rate in multiple testing under dependency. Annals of Statistics 29 (4): 1165-1188.

The focus level method:

Goeman and Mansmann (2008) Multiple testing on the directed acyclic graph of gene ontology. Bioinformatics 24 (4): 537-544.

The inheritance method:

Meinshausen (2008) Hierarchical testing of variable importance. Biometrika 95 (2): 265-278.

For references related to applications of the test, see the vignette GlobalTest.pdf included with this package.

The gt function. The gt.object class and the useful functions associated with that object.

Many more examples in the vignette!

    # Simple examples with random data here
    # Real data examples in the Vignette
    library(globaltest)

    # Random data: covariates A,B,C are correlated with Y
    set.seed(1)
    Y <- rnorm(20)
    X <- matrix(rnorm(200), 20, 10)
    X[,1:3] <- X[,1:3] + Y
    colnames(X) <- LETTERS[1:10]

    # Some subsets of interest
    my.sets1 <- list(abc = LETTERS[1:3], cde  = LETTERS[3:5],
                     fgh = LETTERS[6:8], hij = LETTERS[8:10])
    res <- gt(Y, X, subsets = my.sets1)

    # Simple multiple testing
    p.adjust(res)
    p.adjust(res, "BH")

    # A whole structure of sets
    my.sets2 <- as.list(LETTERS[1:10])
    names(my.sets2) <- letters[1:10]
    my.sets3 <- list(all = LETTERS[1:10])
    my.sets <- c(my.sets2,my.sets1,my.sets3)

    # Do the focus level procedure
    # Choose a focus level by hand
    my.focus <- c("abc","cde","fgh","hij")
    # Or automated
    my.focus <- findFocus(my.sets, maxsize = 8)
    resF <- focusLevel(res, sets = my.sets, focus = my.focus)
    leafNodes(resF, alpha = .1)

    # Compare
    p.adjust(resF, "holm")

    # Focus level with a custom test
    Ftest <- function(set) anova(lm(Y~X[,set]))[["Pr(>F)"]][1]
    focusLevel(Ftest, sets=my.sets, focus=my.focus)

    # analyze data using inheritance procedure
    res <- gt(Y, X, subsets = list(colnames(X)))
    # define clusters on the covariates X
    hcl <- hclust(dist(t(X)))
    # Do inheritance procedure
    resI <- inheritance(res, sets = hcl)
    resI
    leafNodes(resI, alpha = .1)

    # inheritance procedure with a custom test
    inheritance(Ftest, sets = hcl, Shaffer=TRUE)

jellegoeman/globaltest documentation built on Dec. 29, 2021, 9:11 p.m.
