#' Reporting cluster-level direction in \pkg{csaw}
#'
#' An overview of the strategies used to obtain cluster-level summaries of the direction of change,
#' based on the directionality information of individual tests.
#' This is relevant to all functions that aggregate per-test statistics into a per-cluster summary,
#' e.g., \code{\link{combineTests}}, \code{\link{minimalTests}}.
#' It assumes that there are zero, one or many columns of log-fold changes in the data.frame of per-test statistics,
#' typically specified using a \code{fc.cols} argument.
#'
#' @section Counting the per-test directions:
#' For each cluster, we will report the number of tests that are up (positive values) or down (negative) for each column of log-fold change values listed in \code{fc.col}.
#' This provides some indication of whether the change is generally positive or negative - or both - across tests in the cluster.
#' If a cluster contains non-negligible numbers of both up and down tests, this indicates that there may be a complex differential event within that cluster (see comments in \code{\link{mixedTests}}).
#'
#' To count up/down tests, we apply a multiple testing correction to the p-values \emph{within} each cluster.
#' Only the tests with adjusted p-values no greater than \code{fc.threshold} are counted as being up or down.
#' We can interpret this as a test of conditional significance; assuming that the cluster is interesting (i.e., contains at least one true positive), what is the distribution of the signs of the changes within that cluster?
#' Note that this procedure has no bearing on the p-value reported for the cluster itself.
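#'
#' The counting step can be sketched in base R as follows.
#' This is only an illustration under stated assumptions, not the package's implementation:
#' the data are made up, the variable names (\code{ids}, \code{pval}, \code{logFC}) are hypothetical,
#' and the Benjamini-Hochberg correction is used as one possible within-cluster correction.
#'
#' ```r
#' # Hypothetical per-test results for two clusters (all values are made up).
#' ids   <- c(1, 1, 1, 1, 2, 2, 2)
#' pval  <- c(0.001, 0.004, 0.030, 0.600, 0.002, 0.500, 0.800)
#' logFC <- c(1.2, 0.8, -0.5, 0.1, -2.0, 0.3, -0.1)
#' fc.threshold <- 0.05
#'
#' # Correct the p-values within each cluster (assumption: BH correction).
#' adj <- ave(pval, ids, FUN = function(p) p.adjust(p, method = "BH"))
#'
#' # Only tests with adjusted p-values no greater than the threshold are counted.
#' keep <- adj <= fc.threshold
#' num.up   <- tapply(keep & logFC > 0, ids, sum)
#' num.down <- tapply(keep & logFC < 0, ids, sum)
#' ```
#'
#' In this toy example, the first cluster has tests counted in both directions,
#' which is the kind of pattern that may warrant a closer look with \code{\link{mixedTests}}.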
#'
#' The nature of the per-test correction within each cluster varies with each function.
#' In most cases, there is a per-test correction that naturally accompanies the per-cluster p-value:
#' \itemize{
#' \item For \code{\link{combineTests}}, the Benjamini-Hochberg correction is used.
#' \item For \code{\link{minimalTests}}, the Holm correction is used.
#' \item For \code{\link{getBestTest}} with \code{by.pval=TRUE}, the Holm correction is used.
#' We could also use the Bonferroni correction here, but Holm is uniformly more powerful, so we use that instead.
#' \item For \code{\link{getBestTest}} with \code{by.pval=FALSE},
#' all tests bar the one with the highest abundance are simply ignored,
#' which mimics the application of an independent filter.
#' No correction is applied as only one test remains.
#' \item For \code{\link{mixedTests}} and \code{\link{empiricalFDR}}, the Benjamini-Hochberg correction is used,
#' given that both functions just call \code{\link{combineTests}} on the one-sided p-values in each direction.
#' Here, the number of up tests is obtained using the one-sided p-values for a positive change;
#' similarly, the number of down tests is obtained using the one-sided p-values for a negative change.
#' }
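#'
#' The comparison between the Holm and Bonferroni corrections mentioned above can be checked with \code{p.adjust} from base R.
#' The p-values below are made up for illustration, and the 0.05 cut-off is an assumed value for \code{fc.threshold}.
#'
#' ```r
#' # Made-up p-values for the tests in a single cluster.
#' p <- c(0.010, 0.015, 0.020, 0.400)
#'
#' holm <- p.adjust(p, method = "holm")
#' bonf <- p.adjust(p, method = "bonferroni")
#'
#' # Holm-adjusted p-values never exceed their Bonferroni counterparts,
#' # so at least as many tests pass any given threshold.
#' n.holm <- sum(holm <= 0.05)
#' n.bonf <- sum(bonf <= 0.05)
#' ```
#'
#' Here, more tests would be counted as up or down under Holm than under Bonferroni at the same threshold.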
#'
#' @section Representative tests and their log-fold changes:
#' For each combining procedure, we identify a representative test for the entire cluster.
#' This is based on the observation that, in each method,
#' there is often one test that is especially important for computing the cluster-level p-value.
#' \itemize{
#' \item For \code{\link{combineTests}}, the representative is the test with the lowest BH-adjusted p-value before enforcing monotonicity.
#' This is because the p-value for this test is directly used as the combined p-value in Simes' method.
#' \item For \code{\link{minimalTests}}, the test with the \eqn{x}th-smallest p-value is used as the representative.
#' See the function's documentation for the definition of \eqn{x}.
#' \item For \code{\link{getBestTest}} with \code{by.pval=TRUE}, the test with the lowest p-value is used.
#' \item For \code{\link{getBestTest}} with \code{by.pval=FALSE}, the test with the highest abundance is used.
#' \item For \code{\link{mixedTests}}, two representative tests are reported, one in each direction.
#' The representative test in each direction is defined using \code{\link{combineTests}} as described above.
#' \item For \code{\link{empiricalFDR}}, the test is chosen in the same manner as described for \code{\link{combineTests}}
#' after converting all p-values to their one-sided counterparts in the \dQuote{desirable} direction,
#' i.e., up tests when \code{neg.down=TRUE} and down tests otherwise.
#' }
#'
#' The index of the associated test is reported in the output as the \code{"rep.test"} field along with its log-fold changes.
#' For clusters with simple differences, the log-fold change for the representative is a good summary of the effect size for the cluster.
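#'
#' For Simes' method, the choice of representative can be sketched in base R.
#' This is a simplified illustration that ignores weights and uses made-up values;
#' the variable names are hypothetical and the code is not taken from \code{\link{combineTests}}.
#'
#' ```r
#' # Made-up p-values and log-fold changes for the tests in one cluster.
#' pval  <- c(0.020, 0.001, 0.300, 0.004)
#' logFC <- c(0.5, 1.5, -0.2, 1.1)
#'
#' # BH-adjusted p-values before enforcing monotonicity: p * n / rank.
#' n <- length(pval)
#' unadj.bh <- pval * n / rank(pval, ties.method = "first")
#'
#' # The smallest of these values is the Simes combined p-value,
#' # and the test that attains it is taken as the representative.
#' rep.test  <- which.min(unadj.bh)
#' simes.p   <- unadj.bh[rep.test]
#' rep.logFC <- logFC[rep.test]
#' ```
#'
#' The combined p-value obtained in this way matches the minimum of \code{p.adjust(pval, method="BH")},
#' as the monotonicity step does not change the smallest adjusted value.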
#'
#' @section Determining the cluster-level direction:
#' When only one log-fold change column is specified, we will try to determine which direction contributes to the combined p-value.
#' This is done by tallying the directions of all tests with (weighted) p-values below that of the representative test.
#' If all of these tests have positive or negative log-fold changes, that cluster's direction is reported as \code{"up"} or \code{"down"} respectively; otherwise it is reported as \code{"mixed"}.
#' This is stored as the \code{"direction"} field in the returned data frame.
#'
#' Assessing the contribution of per-test p-values to the cluster-level p-value is roughly equivalent to asking whether the latter would increase if all tests in one direction were assigned p-values of unity.
#' If there is an increase, then tests changing in that direction must contribute to the combined p-value calculations.
#' In this manner, clusters are labelled based on whether their combined p-values are driven by tests with only positive, negative or mixed log-fold changes.
#' (Note that this interpretation is not completely correct for \code{\link{minimalTests}} due to equality effects from enforcing monotonicity in the Holm procedure, but this is of little practical consequence.)
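#'
#' The tallying logic can be sketched with a small helper.
#' The function \code{cluster_direction} is hypothetical and not part of \pkg{csaw};
#' it ignores weights and assumes that tests with p-values no greater than that of the representative
#' (including the representative itself) are the ones that drive the combined p-value.
#'
#' ```r
#' # Hypothetical helper that labels a single cluster (not a csaw function).
#' cluster_direction <- function(pval, logFC, rep.test) {
#'     # Tests at least as significant as the representative.
#'     driving <- pval <= pval[rep.test]
#'     has.up   <- any(logFC[driving] > 0)
#'     has.down <- any(logFC[driving] < 0)
#'     if (has.up && !has.down) {
#'         "up"
#'     } else if (has.down && !has.up) {
#'         "down"
#'     } else {
#'         "mixed"
#'     }
#' }
#'
#' # Only the most significant test drives the result; the down test is not considered.
#' d1 <- cluster_direction(c(0.001, 0.004, 0.030), c(1.2, 0.8, -0.5), rep.test = 1)
#'
#' # The representative is the second test, so both an up and a down test contribute.
#' d2 <- cluster_direction(c(0.002, 0.0025, 0.030), c(1.2, -0.8, -0.5), rep.test = 2)
#' ```
#'
#' In this sketch, a cluster whose driving tests all have log-fold changes of zero would also be labelled \code{"mixed"}, which is a simplification.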
#'
#' Users should keep in mind that the label only describes the direction of change among the most significant tests in the cluster.
#' Clusters with complex differences may still be labelled as changing in only one direction, if the tests changing in one direction have much lower p-values than the tests changing in the other direction (even if both sets of p-values are significant).
#' More rigorous checks for mixed changes should be performed with \code{\link{mixedTests}}.
#'
#' There are several functions for which the \code{"direction"} is set to a constant value:
#' \itemize{
#' \item For \code{\link{mixedTests}}, it is simply set to \code{"mixed"} for all clusters.
#' This reflects the fact that the reported p-value represents the evidence for mixed directionality in this function;
#' indeed, the field itself is simply reported for consistency, given that we already know we are looking for mixed clusters!
#' \item For \code{\link{empiricalFDR}}, it is set to \code{"up"} when \code{neg.down=TRUE} and \code{"down"} otherwise.
#' This is because the empirical FDR reflects the significance of changes in the desired direction.
#' }
#'
#' @author Aaron Lun
#'
#' @seealso
#' \code{\link{combineTests}}, \code{\link{minimalTests}}, \code{\link{getBestTest}},
#' \code{\link{empiricalFDR}} and \code{\link{mixedTests}} for the functions that do the work.
#' @name cluster-direction
NULL