pairwiseWilcox | R Documentation |
Perform pairwise Wilcoxon rank sum tests between groups of cells, possibly after blocking on uninteresting factors of variation.
pairwiseWilcox(x, ...)
## S4 method for signature 'ANY'
pairwiseWilcox(
x,
groups,
block = NULL,
restrict = NULL,
exclude = NULL,
direction = c("any", "up", "down"),
lfc = 0,
log.p = FALSE,
gene.names = NULL,
subset.row = NULL,
BPPARAM = SerialParam()
)
## S4 method for signature 'SummarizedExperiment'
pairwiseWilcox(x, ..., assay.type = "logcounts")
## S4 method for signature 'SingleCellExperiment'
pairwiseWilcox(x, groups = colLabels(x, onAbsence = "error"), ...)
x |
A numeric matrix-like object of normalized (and possibly log-transformed) expression values, where each column corresponds to a cell and each row corresponds to an endogenous gene. Alternatively, a SummarizedExperiment or SingleCellExperiment object containing such a matrix. |
... |
For the generic, further arguments to pass to specific methods. For the SummarizedExperiment method, further arguments to pass to the ANY method. For the SingleCellExperiment method, further arguments to pass to the SummarizedExperiment method. |
groups |
A vector of length equal to |
block |
A factor specifying the blocking level for each cell. |
restrict |
A vector specifying the levels of |
exclude |
A vector specifying the levels of |
direction |
A string specifying the direction of differences to be considered in the alternative hypothesis. |
lfc |
Numeric scalar specifying the minimum log-fold change for one observation to be considered to be “greater” than another. |
log.p |
A logical scalar indicating if log-transformed p-values/FDRs should be returned. |
gene.names |
Deprecated, use |
subset.row |
See |
BPPARAM |
A BiocParallelParam object indicating whether and how parallelization should be performed across genes. |
assay.type |
A string specifying which assay values to use, usually |
This function performs Wilcoxon rank sum tests to identify differentially expressed genes (DEGs) between pairs of groups of cells.
A list of tables is returned where each table contains the statistics for all genes for a comparison between each pair of groups.
This can be examined directly or used as input to combineMarkers
for marker gene detection.
The effect size for each gene in each comparison is reported as the area under the curve (AUC). Consider the distribution of expression values for gene X within each of two groups A and B. The AUC is the probability that a randomly selected cell in A has a greater expression of X than a randomly selected cell in B. (Ties are assumed to be randomly broken.) Concordance probabilities near 0 indicate that most observations in A are lower than most observations in B; conversely, probabilities near 1 indicate that most observations in A are higher than those in B. The Wilcoxon rank sum test effectively tests for significant deviations from an AUC of 0.5.
Wilcoxon rank sum tests are more robust to outliers and insensitive to non-normality, in contrast to t-tests in pairwiseTTests
.
However, they take longer to run, the effect sizes are less interpretable, and there are more subtle violations of its assumptions in real data.
For example, the i.i.d. assumptions are unlikely to hold after scaling normalization due to differences in variance.
Also note that we approximate the distribution of the Wilcoxon rank sum statistic to deal with large numbers of cells and ties.
If restrict
is specified, comparisons are only performed between pairs of groups in restrict
.
This can be used to focus on DEGs distinguishing between a subset of the groups (e.g., closely related cell subtypes).
If exclude
is specified, comparisons are not performed between groups in exclude
.
Similarly, if any entries of groups
are NA
, the corresponding cells are are ignored.
A list is returned containing statistics
and pairs
.
The statistics
element is itself a list of DataFrames.
Each DataFrame contains the statistics for a comparison between a pair of groups,
including the AUCs, p-values and false discovery rates.
The pairs
element is a DataFrame with one row corresponding to each entry of statistics
.
This contains the fields first
and second
,
specifying the two groups under comparison in the corresponding DataFrame in statistics
.
In each DataFrame in statistics
, the AUC represents the probability of sampling a value in the first
group greater than a random value from the second
group.
If direction="any"
, two-sided Wilcoxon rank sum tests will be performed for each pairwise comparisons between groups of cells.
Otherwise, one-sided tests in the specified direction will be used instead.
This can be used to focus on genes that are upregulated in each group of interest, which is often easier to interpret.
To interpret the setting of direction
, consider the DataFrame for group X, in which we are comparing to another group Y.
If direction="up"
, genes will only be significant in this DataFrame if they are upregulated in group X compared to Y.
If direction="down"
, genes will only be significant if they are downregulated in group X compared to Y.
See ?wilcox.test
for more details on the interpretation of one-sided Wilcoxon rank sum tests.
Users can also specify a log-fold change threshold in lfc
to focus on genes that exhibit large shifts in location.
This is equivalent to specifying the mu
parameter in wilcox.test
with some additional subtleties depending on direction
:
If direction="any"
, the null hypothesis is that the true shift lies in [-lfc, lfc]
.
Two one-sided p-values are computed against the boundaries of this interval by shifting X's expression values to either side by lfc
,
and these are combined to obtain a (conservative) two-sided p-value.
If direction="up"
, the null hypothesis is that the true shift is less than or equal to lfc
.
A one-sided p-value is computed against the boundary of this interval.
If direction="down"
, the null hypothesis is that the true shift is greater than or equal to -lfc
.
A one-sided p-value is computed against the boundary of this interval.
The AUC is conveniently derived from the U-statistic used in the test, which ensures that it is always consistent with the reported p-value.
An interesting side-effect is that the reported AUC is dependent on both the specified lfc
and direction
.
If direction="up"
, the AUC is computed after shifting X's expression values to the left by the specified lfc
.
An AUC above 0.5 means that X's values are “greater” than Y's, even after shifting down the former by lfc
.
This is helpful as a large AUC tells us that X and Y are well-separated by at least lfc
.
However, an AUC below 0.5 cannot be interpreted as “X is lower than Y”, only that “X - lfc is lower than Y”.
If direction="down"
, the AUC is computed after shifting X's expression values to the right by the specified lfc
.
An AUC below 0.5 means that X's values are “lower” than Y's, even after shifting up the former by lfc
.
This is helpful as a low AUC tells us that X and Y are well-separated by at least lfc
.
However, an AUC above 0.5 cannot be interpreted as “X is greater than Y”, only that “X + lfc is greater than Y”.
If direction="any"
, the AUC is computed by averaging the AUCs obtained in each of the two one-sided tests, i.e., after shifting in each direction.
This considers an observation of Y to be tied with an observation of X if their absolute difference is less than lfc
.
(Technically, the test procedure should also consider these to be ties to be fully consistent, but we have not done so for simplicity.)
The AUC can be interpreted as it would be for lfc=0
, i.e., above 0.5 means that X is greater than Y and below 0.5 means that X is less than Y.
If block
is specified, Wilcoxon tests are performed between groups of cells within each level of block
.
For each pair of groups, the p-values for each gene across all levels of block
are combined using Stouffer's Z-score method.
The reported AUC is also a weighted average of the AUCs across all levels.
The weight for a particular level of block
is defined as N_xN_y
,
where N_x
and N_y
are the number of cells in groups X and Y, respectively, for that level.
This means that p-values from blocks with more cells will have a greater contribution to the combined p-value for each gene.
When combining across batches, one-sided p-values in the same direction are combined first.
Then, if direction="any"
, the two combined p-values from both directions are combined.
This ensures that a gene only receives a low overall p-value if it changes in the same direction across batches.
When comparing two groups, blocking levels are ignored if no p-value was reported, e.g., if there were insufficient cells for a group in a particular level.
If all levels are ignored in this manner, the entire comparison will only contain NA
p-values and a warning will be emitted.
Aaron Lun
Whitlock MC (2005). Combining probability from independent tests: the weighted Z-method is superior to Fisher's approach. J. Evol. Biol. 18, 5:1368-73.
Soneson C and Robinson MD (2018). Bias, robustness and scalability in single-cell differential expression analysis. Nat. Methods
wilcox.test
, on which this function is based.
combineMarkers
, to combine pairwise comparisons into a single DataFrame per group.
getTopMarkers
, to obtain the top markers from each pairwise comparison.
library(scuttle)
sce <- mockSCE()
sce <- logNormCounts(sce)
# Any clustering method is okay.
kout <- kmeans(t(logcounts(sce)), centers=2)
# Vanilla application:
out <- pairwiseWilcox(logcounts(sce), groups=kout$cluster)
out
# Directional and with a minimum log-fold change:
out <- pairwiseWilcox(logcounts(sce), groups=kout$cluster,
direction="up", lfc=0.2)
out
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.