pairwiseWilcox: Perform pairwise Wilcoxon rank sum tests
In scran: Methods for Single-Cell RNA-Seq Data Analysis

Description Usage Arguments Details Value Direction and magnitude of the effect Blocking on uninteresting factors Author(s) References See Also Examples

Perform pairwise Wilcoxon rank sum tests between groups of cells, possibly after blocking on uninteresting factors of variation.

pairwiseWilcox(x, ...)

## S4 method for signature 'ANY'
pairwiseWilcox(
  x,
  groups,
  block = NULL,
  restrict = NULL,
  exclude = NULL,
  direction = c("any", "up", "down"),
  lfc = 0,
  log.p = FALSE,
  gene.names = NULL,
  subset.row = NULL,
  BPPARAM = SerialParam()
)

## S4 method for signature 'SummarizedExperiment'
pairwiseWilcox(x, ..., assay.type = "logcounts")

## S4 method for signature 'SingleCellExperiment'
pairwiseWilcox(x, groups = colLabels(x, onAbsence = "error"), ...)

`x`	A numeric matrix-like object of normalized (and possibly log-transformed) expression values, where each column corresponds to a cell and each row corresponds to an endogenous gene. Alternatively, a SummarizedExperiment or SingleCellExperiment object containing such a matrix.
`...`	For the generic, further arguments to pass to specific methods. For the SummarizedExperiment method, further arguments to pass to the ANY method. For the SingleCellExperiment method, further arguments to pass to the SummarizedExperiment method.
`groups`	A vector of length equal to `ncol(x)`, specifying the group assignment for each cell. If `x` is a SingleCellExperiment, this is automatically derived from `colLabels`.
`block`	A factor specifying the blocking level for each cell.
`restrict`	A vector specifying the levels of `groups` for which to perform pairwise comparisons.
`exclude`	A vector specifying the levels of `groups` for which not to perform pairwise comparisons.
`direction`	A string specifying the direction of differences to be considered in the alternative hypothesis.
`lfc`	Numeric scalar specifying the minimum log-fold change for one observation to be considered to be “greater” than another.
`log.p`	A logical scalar indicating if log-transformed p-values/FDRs should be returned.
`gene.names`	Deprecated, use `row.data` in `findMarkers` instead to add custom annotation.
`subset.row`	See `?"scran-gene-selection"`.
`BPPARAM`	A BiocParallelParam object indicating whether and how parallelization should be performed across genes.
`assay.type`	A string specifying which assay values to use, usually `"logcounts"`.

This function performs Wilcoxon rank sum tests to identify differentially expressed genes (DEGs) between pairs of groups of cells. A list of tables is returned where each table contains the statistics for all genes for a comparison between each pair of groups. This can be examined directly or used as input to combineMarkers for marker gene detection.

The effect size for each gene in each comparison is reported as the area under the curve (AUC). Consider the distribution of expression values for gene X within each of two groups A and B. The AUC is the probability that a randomly selected cell in A has a greater expression of X than a randomly selected cell in B. (Ties are assumed to be randomly broken.) Concordance probabilities near 0 indicate that most observations in A are lower than most observations in B; conversely, probabilities near 1 indicate that most observations in A are higher than those in B. The Wilcoxon rank sum test effectively tests for significant deviations from an AUC of 0.5.

Wilcoxon rank sum tests are more robust to outliers and insensitive to non-normality, in contrast to t-tests in pairwiseTTests. However, they take longer to run, the effect sizes are less interpretable, and there are more subtle violations of its assumptions in real data. For example, the i.i.d. assumptions are unlikely to hold after scaling normalization due to differences in variance. Also note that we approximate the distribution of the Wilcoxon rank sum statistic to deal with large numbers of cells and ties.

If restrict is specified, comparisons are only performed between pairs of groups in restrict. This can be used to focus on DEGs distinguishing between a subset of the groups (e.g., closely related cell subtypes).

If exclude is specified, comparisons are not performed between groups in exclude. Similarly, if any entries of groups are NA, the corresponding cells are are ignored.

A list is returned containing statistics and pairs.

The statistics element is itself a list of DataFrames. Each DataFrame contains the statistics for a comparison between a pair of groups, including the AUCs, p-values and false discovery rates.

The pairs element is a DataFrame with one row corresponding to each entry of statistics. This contains the fields first and second, specifying the two groups under comparison in the corresponding DataFrame in statistics.

In each DataFrame in statistics, the AUC represents the probability of sampling a value in the first group greater than a random value from the second group.

If direction="any", two-sided Wilcoxon rank sum tests will be performed for each pairwise comparisons between groups of cells. Otherwise, one-sided tests in the specified direction will be used instead. This can be used to focus on genes that are upregulated in each group of interest, which is often easier to interpret.

To interpret the setting of direction, consider the DataFrame for group X, in which we are comparing to another group Y. If direction="up", genes will only be significant in this DataFrame if they are upregulated in group X compared to Y. If direction="down", genes will only be significant if they are downregulated in group X compared to Y. See ?wilcox.test for more details on the interpretation of one-sided Wilcoxon rank sum tests.

Users can also specify a log-fold change threshold in lfc to focus on genes that exhibit large shifts in location. This is equivalent to specifying the mu parameter in wilcox.test, with some additional subtleties depending on direction:

If direction="any", the null hypothesis is that the true shift is either -lfc or lfc with equal probability. A two-sided p-value is computed against this composite null. The AUC is computed by averaging the AUCs obtained when X's expression values are shifted to the left or right by lfc. This can be very roughly interpreted as considering an observation of Y to be tied with an observation of X if they differ by less than lfc.
If direction="up", the null hypothesis is that the true shift is lfc, and a one-sided p-value is computed. The AUC is computed after shifting X's expression values to the left by lfc.
If direction="down", the null hypothesis is that the true shift is -lfc, and a one-sided p-value is computed. The AUC is computed after shifting X's expression values to the right by lfc.

The fact that the AUC depends on lfc is necessary to preserve its relationship with the p-value.

If block is specified, Wilcoxon tests are performed between groups of cells within each level of block. For each pair of groups, the p-values for each gene across all levels of block are combined using Stouffer's Z-score method. The reported AUC is also a weighted average of the AUCs across all levels.

The weight for a particular level of block is defined as N_xN_y, where N_x and N_y are the number of cells in groups X and Y, respectively, for that level. This means that p-values from blocks with more cells will have a greater contribution to the combined p-value for each gene.

When combining across batches, one-sided p-values in the same direction are combined first. Then, if direction="any", the two combined p-values from both directions are combined. This ensures that a gene only receives a low overall p-value if it changes in the same direction across batches.

When comparing two groups, blocking levels are ignored if no p-value was reported, e.g., if there were insufficient cells for a group in a particular level. If all levels are ignored in this manner, the entire comparison will only contain NA p-values and a warning will be emitted.

Aaron Lun

Whitlock MC (2005). Combining probability from independent tests: the weighted Z-method is superior to Fisher's approach. J. Evol. Biol. 18, 5:1368-73.

Soneson C and Robinson MD (2018). Bias, robustness and scalability in single-cell differential expression analysis. Nat. Methods

wilcox.test, on which this function is based.

combineMarkers, to combine pairwise comparisons into a single DataFrame per group.

getTopMarkers, to obtain the top markers from each pairwise comparison.

library(scuttle)
sce <- mockSCE()
sce <- logNormCounts(sce)

# Any clustering method is okay.
kout <- kmeans(t(logcounts(sce)), centers=2) 

# Vanilla application:
out <- pairwiseWilcox(logcounts(sce), groups=kout$cluster)
out

# Directional and with a minimum log-fold change:
out <- pairwiseWilcox(logcounts(sce), groups=kout$cluster, 
    direction="up", lfc=0.2)
out