mixedTests: Tests for mixed DB clusters

View source: R/mixedTests.R

Description

Intersects two one-sided tests to determine if a cluster contains tests with changes in both directions.

Usage

mixedTests(
  ids,
  tab,
  weights = NULL,
  pval.col = NULL,
  fc.col = NULL,
  fc.threshold = 0.05
)

mixedClusters(...)

Arguments

ids

An integer vector or factor containing the cluster ID for each test.

tab

A data.frame of results with PValue and at least one logFC field for each test.

weights

A numeric vector of weights for each test. Defaults to 1 for all tests.

pval.col

An integer scalar or string specifying the column of tab containing the p-values. Defaults to "PValue".

fc.col

An integer or string specifying the single column of tab containing the log-fold change.

fc.threshold

A numeric scalar specifying the FDR threshold to use within each cluster when counting the tests that change in each direction. See ?"cluster-direction" for more details.

...

Further arguments to pass to mixedTests.

Details

This function converts two-sided p-values to one-sided counterparts for each direction of log-fold change. For each direction, the corresponding one-sided p-values are combined by combineTests to yield a combined p-value for each cluster. Each cluster is thus associated with two combined p-values (one in each direction), which are intersected using Berger's intersection-union test (IUT).

The IUT p-value provides evidence against the null hypothesis that at least one direction is not significant. In short, a low p-value is only possible if there are significant changes in both directions. This formally identifies genomic regions containing complex DB events, i.e., where depletion in one subinterval of the bound/enriched region is accompanied by increased binding in another subinterval. Examples include swaps in adjacent TF binding locations between conditions or shifts in histone mark patterns in bidirectional promoters.
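The conversion, combination and intersection steps described above can be sketched in base R. This is a minimal illustration under stated assumptions, not csaw's implementation: iut_sketch is a hypothetical helper, the per-cluster combination is assumed to be an unweighted Simes procedure standing in for combineTests, and weights are ignored.

```r
# Hypothetical helper for illustration; this is not csaw's implementation.
iut_sketch <- function(ids, logfc, pval) {
    # One-sided p-values: halve the two-sided value in the observed
    # direction and take the complement in the opposite direction.
    p.up <- ifelse(logfc > 0, pval / 2, 1 - pval / 2)
    p.down <- ifelse(logfc < 0, pval / 2, 1 - pval / 2)

    # Unweighted Simes combination within a cluster (an assumption here).
    simes <- function(p) min(sort(p) * length(p) / seq_along(p))
    up <- vapply(split(p.up, ids), simes, numeric(1))
    down <- vapply(split(p.down, ids), simes, numeric(1))

    # IUT: both directions must be significant, so keep the larger p-value.
    pmax(up, down)
}

# Cluster 1 has strong changes in both directions; cluster 2 only goes up.
ids <- c(1, 1, 1, 2, 2, 2)
logfc <- c(2, -2, 0.1, 1.5, 1.2, 0.8)
pval <- c(0.001, 0.002, 0.9, 0.001, 0.01, 0.05)
iut <- iut_sketch(ids, logfc, pval)
print(iut)
```

In this toy input, only cluster 1 receives a small IUT p-value, because cluster 2 has no evidence for a decrease and its combined p-value in the down direction is close to 1.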

We expect that the p-values in pval.col are two-sided and independent of the sign of the log-fold change under the null hypothesis. This is true for likelihood ratio tests but may not be true for others (e.g., from glmTreat), so caution is required when supplying values in tab.
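To see why this assumption matters, consider a simple z-test, used here as a stand-in for the likelihood ratio tests mentioned above (the choice of test and all variable names are assumptions for illustration). When the two-sided p-value does not depend on the sign of the statistic, halving it in the observed direction and taking the complement otherwise recovers the exact one-sided p-value, which is uniform under the null.

```r
set.seed(100)
z <- rnorm(10000)                      # null z-statistics, symmetric around 0
two.sided <- 2 * pnorm(-abs(z))        # two-sided p-values, unaffected by sign

# Halve in the observed direction, take the complement otherwise.
p.up <- ifelse(z > 0, two.sided / 2, 1 - two.sided / 2)

# Compare against the exact upper-tail p-value for the same statistics.
exact.up <- pnorm(z, lower.tail = FALSE)
max(abs(p.up - exact.up))
```

If the two-sided p-values were instead tied to the sign or magnitude of the log-fold change in some other way, this equivalence would no longer be guaranteed and the converted one-sided p-values might not be valid.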

Value

A DataFrame with one row per cluster and various fields:

  • An integer field num.tests, specifying the total number of tests in each cluster.

  • Two integer fields num.up.* and num.down.* for each log-FC column in tab, containing the number of tests with log-FCs significantly greater or less than 0, respectively. See ?"cluster-direction" for more details.

  • A numeric field containing the cluster-level p-value. If pval.col=NULL, this column is named PValue; otherwise it takes the name of the column of tab specified by pval.col.

  • A numeric field FDR, containing the BH-adjusted cluster-level p-value.

  • A character field direction, set to "mixed" for all clusters. See ?"cluster-direction" for more details.

  • Two integer fields rep.up.test and rep.down.test, containing the row index (for tab) of representative tests with positive and negative sign, respectively, for each cluster. See ?"cluster-direction" for more details.

  • Two numeric fields rep.up.* and rep.down.* for each log-FC column in tab, containing the log-fold changes of the representative tests in the cluster. See ?"cluster-direction" for more details.

Each row is named according to the ID of the corresponding cluster.
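As a small illustration of the FDR field in the list above, the Benjamini-Hochberg adjustment of a set of cluster-level p-values can be reproduced with p.adjust from base R. The cluster names and p-values below are made up for the example.

```r
# Made-up cluster-level p-values, named by hypothetical cluster ID.
pvals <- c(A = 0.003, B = 0.9995, C = 0.04)

# BH adjustment across clusters; names are preserved.
fdr <- p.adjust(pvals, method = "BH")
print(fdr)
```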

Author(s)

Aaron Lun

References

Berger RL and Hsu JC (1996). Bioequivalence trials, intersection-union tests and equivalence confidence sets. Statist. Sci. 11, 283-319.

See Also

combineTests, for a more general-purpose method of combining tests.

Examples

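library(csaw)
# Simulate cluster IDs and per-test statistics for 100 tests.
ids <- round(runif(100, 1, 10))
tab <- data.frame(logFC=rnorm(100), logCPM=rnorm(100), PValue=rbeta(100, 1, 2))
# Compute the mixed-direction (IUT) p-value for each cluster.
mixed <- mixedTests(ids, tab)
head(mixed)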


LTLA/csaw documentation built on Dec. 21, 2024, 1:10 a.m.