mixedTests: Tests for mixed DB clusters
In csaw: ChIP-Seq Analysis with Windows


Intersects two one-sided tests to determine if a cluster contains tests with changes in both directions.

mixedTests(
  ids,
  tab,
  weights = NULL,
  pval.col = NULL,
  fc.col = NULL,
  fc.threshold = 0.05
)

mixedClusters(...)

`ids`	An integer vector or factor containing the cluster ID for each test.
`tab`	A data.frame of results with `PValue` and at least one `logFC` field for each test.
`weights`	A numeric vector of weights for each test. Defaults to 1 for all tests.
`pval.col`	An integer scalar or string specifying the column of `tab` containing the p-values. Defaults to `"PValue"`.
`fc.col`	An integer or string specifying the single column of `tab` containing the log-fold change.
`fc.threshold`	A numeric scalar specifying the FDR threshold to use within each cluster when counting the tests that change in each direction. See `?"cluster-direction"` for more details.
`...`	Further arguments to pass to `mixedTests`.

This function converts two-sided p-values to one-sided counterparts for each direction of log-fold change. For each direction, the corresponding one-sided p-values are combined by combineTests to yield a combined p-value for each cluster. Each cluster is thus associated with two combined p-values (one in each direction), which are intersected using Berger's intersection-union test (IUT).

The IUT p-value provides evidence against the null hypothesis that the change in at least one direction is not significant. In short, a low p-value is only possible if there are significant changes in both directions. This formally identifies genomic regions containing complex DB events, i.e., where depletion in one subinterval of the bound/enriched region is accompanied by increased binding in another subinterval. Examples include swaps in adjacent TF binding locations between conditions or shifts in histone mark patterns in bidirectional promoters.
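The procedure described above can be sketched in base R. This is a minimal illustration under stated assumptions, not csaw's internal code: the helper names `simes` and `mixed_sketch` are hypothetical, the within-cluster combination is assumed to be an unweighted Simes' method, and the IUT p-value is assumed to be the larger of the two combined p-values.

```r
# Hypothetical helper: unweighted Simes' combined p-value,
# i.e. the minimum over i of n * p_(i) / i, capped at 1.
simes <- function(p) {
    n <- length(p)
    min(1, min(n * sort(p) / seq_len(n)))
}

# Hypothetical sketch of the mixed-direction test for each cluster.
mixed_sketch <- function(ids, logFC, p) {
    # Two-sided -> one-sided: halve p when the sign of the log-fold
    # change agrees with the tested direction, otherwise use 1 - p/2.
    # (A log-fold change of exactly zero counts against both directions
    # in this sketch.)
    p.up <- ifelse(logFC > 0, p / 2, 1 - p / 2)
    p.down <- ifelse(logFC < 0, p / 2, 1 - p / 2)

    # Combine within each cluster, separately for each direction.
    com.up <- vapply(split(p.up, ids), simes, numeric(1))
    com.down <- vapply(split(p.down, ids), simes, numeric(1))

    # Intersection-union step: take the larger combined p-value.
    pmax(com.up, com.down)
}

# Cluster "A" only has strong increases; cluster "B" has strong
# changes in both directions.
ids <- c("A", "A", "A", "B", "B", "B")
logFC <- c(2, 1.5, 1, 2, -2, 0.1)
p <- c(1e-4, 1e-3, 0.5, 1e-4, 1e-4, 0.9)
iut <- mixed_sketch(ids, logFC, p)
print(iut)
```

In this toy input, cluster A receives an IUT p-value close to 1 because nothing in it decreases, while cluster B receives a small one because both directions are supported.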

We expect that the p-values in pval.col are two-sided and independent of the sign of the log-fold change under the null hypothesis. This is true for likelihood ratio tests but may not be true for others (e.g., from glmTreat), so caution is required when supplying values in tab.

A DataFrame with one row per cluster and various fields:

An integer field num.tests, specifying the total number of tests in each cluster.
Two integer fields num.up.* and num.down.* for each log-FC column in tab, containing the number of tests with log-FCs significantly greater or less than 0, respectively. See ?"cluster-direction" for more details.
A numeric field containing the cluster-level p-value. If pval.col=NULL, this column is named PValue, otherwise its name is set to colnames(tab[,pval.col]).
A numeric field FDR, containing the BH-adjusted cluster-level p-value.
A character field direction, set to "mixed" for all clusters. See ?"cluster-direction" for more details.
Two integer fields rep.up.test and rep.down.test, containing the row index (for tab) of representative tests with positive and negative sign, respectively, for each cluster. See ?"cluster-direction" for more details.
Two numeric fields rep.up.* and rep.down.* for each log-FC column in tab, containing the log-fold changes of the representative tests in the cluster. See ?"cluster-direction" for more details.

Each row is named according to the ID of the corresponding cluster.

Aaron Lun

Berger RL and Hsu JC (1996). Bioequivalence trials, intersection-union tests and equivalence confidence sets. Statist. Sci. 11, 283-319.

combineTests, for a more general-purpose method of combining tests.

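library(csaw)
# Simulate 100 tests with cluster IDs between 1 and 10.
ids <- round(runif(100, 1, 10))
tab <- data.frame(logFC=rnorm(100), logCPM=rnorm(100), PValue=rbeta(100, 1, 2))

# Compute the mixed-direction p-value for each cluster.
mixed <- mixedTests(ids, tab)
head(mixed)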