cluster-direction: Reporting cluster-level direction in 'csaw'

Description Counting the per-test directions Representative tests and their log-fold changes Determining the cluster-level direction Author(s) See Also

Description

An overview of the strategies used to obtain cluster-level summaries of the direction of change, based on the directionality information of individual tests. This is relevant to all functions that aggregate per-test statistics into a per-cluster summary, e.g., combineTests, minimalTests. It assumes that there are zero, one or many columns of log-fold changes in the data.frame of per-test statistics, typically specified using a fc.cols argument.

Counting the per-test directions

For each cluster, we will report the number of tests that are up (positive values) or down (negative) for each column of log-fold change values listed in fc.col. This provide some indication of whether the change is generally positive or negative - or both - across tests in the cluster. If a cluster contains non-negligble numbers of both up and down tests, this indicates that there may be a complex differential event within that cluster (see comments in mixedTests).

To count up/down tests, we apply a multiple testing correction to the p-values within each cluster. Only the tests with adjusted p-values no greater than fc.threshold are counted as being up or down. We can interpret this as a test of conditional significance; assuming that the cluster is interesting (i.e., contains at least one true positive), what is the distribution of the signs of the changes within that cluster? Note that this procedure has no bearing on the p-value reported for the cluster itself.

The nature of the per-test correction within each cluster varies with each function. In most cases, there is a per-test correction that naturally accompanies the per-cluster p-value:

Representative tests and their log-fold changes

For each combining procedure, we identify a representative test for the entire cluster. This is based on the observation that, in each method, there is often one test that is especially important for computing the cluster-level p-value.

The index of the associated test is reported in the output as the "rep.test" field along with its log-fold changes. For clusters with simple differences, the log-fold change for the representative is a good summary of the effect size for the cluster.

Determining the cluster-level direction

When only one log-fold change column is specified, we will try to determine which direction contributes to the combined p-value. This is done by tallying the directions of all tests with (weighted) p-values below that of the representative test. If all tests in a cluster have positive or negative log-fold changes, that cluster's direction is reported as "up" or "down" respectively; otherwise it is reported as "mixed". This is stored as the "direction" field in the returned data frame.

Assessing the contribution of per-test p-values to the cluster-level p-value is roughly equivalent to asking whether the latter would increase if all tests in one direction were assigned p-values of unity. If there is an increase, then tests changing in that direction must contribute to the combined p-value calculations. In this manner, clusters are labelled based on whether their combined p-values are driven by tests with only positive, negative or mixed log-fold changes. (Note that this interpretation is not completely correct for minimalTests due to equality effects from enforcing monotonicity in the Holm procedure, but this is of little practical consequence.)

Users should keep in mind that the label only describes the direction of change among the most significant tests in the cluster. Clusters with complex differences may still be labelled as changing in only one direction, if the tests changing in one direction have much lower p-values than the tests changing in the other direction (even if both sets of p-values are significant). More rigorous checks for mixed changes should be performed with mixedTests.

There are several functions for which the "direction" is set to a constant value:

Author(s)

Aaron Lun

See Also

combineTests, minimalTests, getBestTest, empiricalFDR annd mixedTests for the functions that do the work.


csaw documentation built on Nov. 12, 2020, 2:03 a.m.