getTopMarkers: Get top markers
In scran: Methods for Single-Cell RNA-Seq Data Analysis


Obtain the top markers for each pairwise comparison between clusters, or for each cluster.

getTopMarkers(
  de.lists,
  pairs,
  n = 10,
  pval.field = "p.value",
  fdr.field = "FDR",
  pairwise = TRUE,
  pval.type = c("any", "some", "all"),
  fdr.threshold = 0.05,
  ...
)

`de.lists`	A list-like object where each element is a data.frame or DataFrame. Each element should represent the results of a pairwise comparison between two groups/clusters, in which each row should contain the statistics for a single gene/feature. Rows should be named by the feature name in the same order for all elements.
`pairs`	A matrix, data.frame or DataFrame with two columns and number of rows equal to the length of `de.lists`. Each row should specify the pair of clusters being compared for the corresponding element of `de.lists`.
`n`	Integer scalar specifying the number of markers to obtain from each pairwise comparison, if `pairwise=TRUE`. Otherwise, the number of top genes to take from each cluster's combined marker set, see Details.
`pval.field`	String specifying the column of each DataFrame in `de.lists` to use to identify top markers. Smaller values are assigned higher rank.
`fdr.field`	String specifying the column containing the adjusted p-values.
`pairwise`	Logical scalar indicating whether top markers should be returned for every pairwise comparison. If `FALSE`, one marker set is returned for every cluster.
`pval.type`	String specifying how markers from pairwise comparisons are to be combined if `pairwise=FALSE`. This has the same effect as `pval.type` in `combineMarkers`.
`fdr.threshold`	Numeric scalar specifying the FDR threshold for filtering. If `NULL`, no filtering is performed on the FDR.
`...`	Further arguments to pass to `combineMarkers` if `pairwise=FALSE`.
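As a rough illustration of the input layout described above, the following base R sketch builds a hand-made `de.lists` and `pairs` and checks that they are consistent. The gene names, p-values, cluster labels and the helper `make_stats` are all made up for illustration, and the `first`/`second` column names are an assumption; in practice both objects would normally come from `pairwiseTTests` or a related function.

```r
# Hypothetical feature names, shared by every comparison and in the same order.
genes <- paste0("Gene", 1:4)

# Illustrative helper (not part of scran): one row of statistics per gene.
make_stats <- function(p) {
    data.frame(p.value = p, FDR = p.adjust(p, method = "BH"), row.names = genes)
}

# One element per pairwise comparison; the values are invented.
de.lists <- list(
    make_stats(c(0.001, 0.2, 0.03, 0.8)),  # cluster A vs cluster B
    make_stats(c(0.5, 0.004, 0.01, 0.9))   # cluster B vs cluster A
)

# Row i names the two clusters compared in de.lists[[i]].
pairs <- data.frame(first = c("A", "B"), second = c("B", "A"))

# Basic consistency checks implied by the argument descriptions.
same_rows <- vapply(de.lists, function(x) identical(rownames(x), genes), logical(1))
stopifnot(nrow(pairs) == length(de.lists), ncol(pairs) == 2, all(same_rows))
```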

This is a convenience utility that converts the results of pairwise comparisons into a marker list that can be used in downstream functions, e.g., as the marker sets in SingleR. By default, it returns a list of lists containing the top genes for every pairwise comparison, which is useful when selecting features that distinguish between closely related clusters. For each comparison, the top `n` genes with adjusted p-values below `fdr.threshold` are chosen.
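The per-comparison selection can be sketched in base R as follows. This is only an illustration of the idea under stated assumptions, not scran's implementation: the helper `top_for_pair` and the toy statistics are hypothetical, and treating the threshold as inclusive (`<=`) is an assumption. Because the adjusted p-values here are monotone in the raw p-values, filtering before or after taking the top `n` gives the same result.

```r
# Hypothetical stand-in for one element of `de.lists`: one row per gene.
stats <- data.frame(
    p.value = c(0.001, 0.20, 0.0004, 0.02, 0.9),
    row.names = paste0("Gene", 1:5)
)
stats$FDR <- p.adjust(stats$p.value, method = "BH")

# Illustrative helper (not part of scran): rank genes by p-value, keep the
# first `n`, then drop any whose adjusted p-value exceeds the threshold.
top_for_pair <- function(stats, n = 10, pval.field = "p.value",
                         fdr.field = "FDR", fdr.threshold = 0.05) {
    ranked <- stats[order(stats[[pval.field]]), , drop = FALSE]
    ranked <- head(ranked, n)
    if (!is.null(fdr.threshold)) {
        ranked <- ranked[ranked[[fdr.field]] <= fdr.threshold, , drop = FALSE]
    }
    rownames(ranked)
}

top_genes <- top_for_pair(stats, n = 4)                           # FDR-filtered
top_unfiltered <- top_for_pair(stats, n = 4, fdr.threshold = NULL) # no filtering
```

With these toy values, the fourth-ranked gene has an adjusted p-value above 0.05, so it is dropped from `top_genes` but kept in `top_unfiltered`.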

If `pairwise=FALSE`, `combineMarkers` is called on `de.lists` and `pairs` to obtain a per-cluster ranking of genes from all pairwise comparisons involving that cluster. If `pval.type="any"`, the genes with `Top` values no greater than `n` are retained; this is equivalent to taking the union of the top `n` genes from each pairwise comparison for each cluster. Otherwise, the top `n` genes with the smallest p-values are retained. In both cases, genes are further filtered by `fdr.threshold`.
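The equivalence stated for `pval.type="any"` can be checked on a toy example. The sketch below assumes that a gene's `Top` value behaves like its minimum rank across the comparisons involving a cluster; the p-value matrix and both helper functions are hypothetical and written in base R, so this is not how `combineMarkers` is implemented.

```r
# Hypothetical p-values for cluster A against clusters B and C (rows = genes).
pvals <- cbind(
    B = c(Gene1 = 0.001, Gene2 = 0.50, Gene3 = 0.02, Gene4 = 0.30),
    C = c(Gene1 = 0.40, Gene2 = 0.002, Gene3 = 0.01, Gene4 = 0.60)
)

# Genes whose best (minimum) rank across comparisons is no greater than n.
top_by_min_rank <- function(pvals, n) {
    ranks <- apply(pvals, 2, rank, ties.method = "first")
    rownames(ranks) <- rownames(pvals)
    min_rank <- apply(ranks, 1, min)
    names(min_rank)[min_rank <= n]
}

# Union of the top n genes taken separately from each comparison.
top_by_union <- function(pvals, n) {
    per_pair <- lapply(seq_len(ncol(pvals)), function(j) {
        names(sort(pvals[, j]))[seq_len(n)]
    })
    unique(unlist(per_pair))
}

# The two selections contain the same genes for any n.
same_for_1 <- setequal(top_by_min_rank(pvals, 1), top_by_union(pvals, 1))
same_for_2 <- setequal(top_by_min_rank(pvals, 2), top_by_union(pvals, 2))
```

Note that the union can contain more than `n` genes per cluster, since each comparison contributes its own top `n`.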

If `pairwise=TRUE`, a List of Lists of character vectors is returned. Each element of the outer list corresponds to cluster X, each element of the inner list corresponds to another cluster Y, and each character vector specifies the marker genes that distinguish X from Y.

If `pairwise=FALSE`, a List of character vectors is returned. Each element corresponds to a cluster X, and its character vector contains the marker genes that distinguish X from any, some or all other clusters, depending on `pval.type`.

Aaron Lun

pairwiseTTests and friends, to obtain de.lists and pairs.

combineMarkers, for another function that consolidates pairwise DE comparisons.

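library(scran)    # pairwiseTTests() and getTopMarkers()
library(scuttle)  # mockSCE() and logNormCounts()
sce <- mockSCE()
sce <- logNormCounts(sce)

# Any clustering method is okay.
kout <- kmeans(t(logcounts(sce)), centers=3)

# Pairwise comparisons between clusters, giving 'statistics' and 'pairs'.
out <- pairwiseTTests(logcounts(sce),
     groups=paste0("Cluster", kout$cluster))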

# Getting top pairwise markers:
top <- getTopMarkers(out$statistics, out$pairs)
top[[1]]
top[[1]][[2]]

# Getting top per-cluster markers:
top <- getTopMarkers(out$statistics, out$pairs, pairwise=FALSE)
top[[1]]