findMarkers | R Documentation |
Find candidate marker genes for groups of cells (e.g., clusters) by testing for differential expression between pairs of groups.
findMarkers(x, ...)
## S4 method for signature 'ANY'
findMarkers(
x,
groups,
test.type = c("t", "wilcox", "binom"),
...,
pval.type = c("any", "some", "all"),
min.prop = NULL,
log.p = FALSE,
full.stats = FALSE,
sorted = TRUE,
row.data = NULL,
add.summary = FALSE,
BPPARAM = SerialParam()
)
## S4 method for signature 'SummarizedExperiment'
findMarkers(x, ..., assay.type = "logcounts")
## S4 method for signature 'SingleCellExperiment'
findMarkers(x, groups = colLabels(x, onAbsence = "error"), ...)
x |
A numeric matrix-like object of expression values, where each column corresponds to a cell and each row corresponds to an endogenous gene. This is expected to be normalized log-expression values for most tests - see Details. Alternatively, a SummarizedExperiment or SingleCellExperiment object containing such a matrix. |
... |
For the generic, further arguments to pass to specific methods. For the ANY method:
Common arguments for all testing functions include
For the SummarizedExperiment method, further arguments to pass to the ANY method.
For the SingleCellExperiment method, further arguments to pass to the SummarizedExperiment method. |
groups |
A vector of length equal to |
test.type |
String specifying the type of pairwise test to perform -
a t-test with |
pval.type |
A string specifying how p-values are to be combined across pairwise comparisons for a given group/cluster. |
min.prop |
Numeric scalar specifying the minimum proportion of significant comparisons per gene.
Defaults to 0.5 when |
log.p |
A logical scalar indicating if log-transformed p-values/FDRs should be returned. |
full.stats |
A logical scalar indicating whether all statistics in |
sorted |
Logical scalar indicating whether each output DataFrame should be sorted by a statistic relevant to |
row.data |
A DataFrame containing additional row metadata for each gene in Alternatively, a list containing one such DataFrame per level of |
add.summary |
Logical scalar indicating whether statistics from |
BPPARAM |
A BiocParallelParam object indicating whether and how parallelization should be performed across genes. |
assay.type |
A string specifying which assay values to use, usually |
This function provides a convenience wrapper for marker gene identification between groups of cells, based on running pairwiseTTests or related functions and passing the result to combineMarkers.
All of the arguments above are supplied directly to one of these two functions; refer to the relevant function's documentation for more details.
If x contains log-normalized expression values generated with a pseudo-count of 1, it can be used in any of the pairwise testing procedures.
If x is scale-normalized but not log-transformed, it can be used with test.type="wilcox" and test.type="binom".
If x contains raw counts, it can only be used with test.type="binom".
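The pairing of input type and test described above can be sketched as follows. This is a minimal illustration on mock data, not a recommended workflow: the fixed group labels are made up, and the "normcounts" assay name assumes the default naming used by logNormCounts(log=FALSE) in scuttle.

```r
library(scuttle)
library(scran)

set.seed(100)
sce <- mockSCE()

# Hypothetical group labels, assigned cyclically for illustration only.
groups <- rep(c("A", "B", "C"), length.out = ncol(sce))

# Raw counts: only the binomial test is applicable.
binom.out <- findMarkers(sce, groups = groups,
    test.type = "binom", assay.type = "counts")

# Scale-normalized but not log-transformed values: Wilcoxon (or binomial) test.
sce <- logNormCounts(sce, log = FALSE)   # assumed to store a "normcounts" assay
wilcox.out <- findMarkers(sce, groups = groups,
    test.type = "wilcox", assay.type = "normcounts")

# Log-normalized values: any test, here the default t-test on "logcounts".
sce <- logNormCounts(sce)
t.out <- findMarkers(sce, groups = groups, test.type = "t")
```

Each call returns one DataFrame per group, so the three results can be compared group by group.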
Note that log.p only affects the combined p-values and FDRs.
If full.stats=TRUE, the p-values for each individual pairwise comparison will always be log-transformed, regardless of the value of log.p.
Log-transformed p-values and FDRs are reported using the natural base.
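As a small illustration of the natural-log convention, the sketch below uses made-up p-values (not output from findMarkers) to show how log-transformed values can be converted back to the original scale or re-expressed in base 10.

```r
# Hypothetical p-values, for illustration only.
p <- c(0.05, 1e-10, 1e-300)

# Natural-log transformation, the base used for log-transformed p-values/FDRs.
log.p <- log(p)

# Back-transform to the original scale.
p.back <- exp(log.p)

# Convert natural-log values to base 10, e.g., for plotting.
log10.p <- log.p / log(10)
```

Working on the log scale avoids underflow for very small p-values, which is the usual reason to request log-transformed output.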
The choice of pval.type determines whether the highly ranked genes are those that are DE between the current group and:
any other group ("any")
all other groups ("all")
some other groups ("some")
See ?combineMarkers for more details.
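To give a feel for how the choice of pval.type changes the ranking, the sketch below combines made-up p-values for a single gene in base R. It assumes the combining rules documented in ?combineMarkers, namely Simes' method for "any" and the maximum p-value for "all". The "some" setting is not shown. This is an illustration of the idea rather than the package's internal code.

```r
# Hypothetical p-values for one gene from three pairwise comparisons
# (current group versus each of three other groups).
p <- c(0.001, 0.2, 0.6)
n <- length(p)

# pval.type="any" (assumed: Simes' method).
# A single strongly significant comparison is enough for a small combined p-value.
p.any <- min(sort(p) * n / seq_len(n))

# pval.type="all" (assumed: maximum p-value).
# Every comparison must be significant for the combined p-value to be small.
p.all <- max(p)
```

Here the gene would rank highly under "any" but poorly under "all", because it is only DE against one of the other groups.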
A named list of DataFrames, each of which contains a sorted marker gene list for the corresponding group.
In each DataFrame, the top genes are chosen to enable separation of that group from all other groups.
See ?combineMarkers for more details on the output format.
If row.data is provided, the additional fields are added to the front of the DataFrame for each cluster.
If add.summary=TRUE, extra statistics for each cluster are also computed and added.
Any log-fold changes are reported as differences in average x between groups (usually in base 2, depending on the transformation applied to x).
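The definition of the log-fold change as a difference in average x can be illustrated with a toy matrix in base R. The values and group labels below are made up, and the sketch assumes the simplest case with no blocking factor.

```r
# Toy log2-expression matrix: 2 genes (rows) by 6 cells (columns).
x <- rbind(
    GeneA = c(5, 6, 7, 1, 2, 3),
    GeneB = c(2, 2, 2, 2, 2, 2)
)

# Hypothetical group labels: first three cells in "A", last three in "B".
groups <- rep(c("A", "B"), each = 3)

# Average log-expression per gene within each group.
mean.A <- rowMeans(x[, groups == "A", drop = FALSE])
mean.B <- rowMeans(x[, groups == "B", drop = FALSE])

# Log-fold change of A over B as a difference of averages.
# Because x is on the log2 scale, the result is a log2-fold change.
lfc <- mean.A - mean.B
```

If x had been transformed with a different base, the same difference would be a log-fold change in that base instead.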
Aaron Lun
pairwiseTTests, pairwiseWilcox, pairwiseBinom, for the underlying functions that compute the pairwise DE statistics.
combineMarkers, to combine pairwise statistics into a single marker list per cluster.
summaryMarkerStats, to incorporate additional summary statistics per cluster.
getMarkerEffects, to easily extract a matrix of effect sizes from each DataFrame.
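library(scuttle)
library(scran)
sce <- mockSCE()
sce <- logNormCounts(sce)
# Any clustering method is okay, only using k-means for convenience.
set.seed(100) # arbitrary seed so that the k-means clusters are reproducible
kout <- kmeans(t(logcounts(sce)), centers=4)
# Default settings: pairwise t-tests combined with pval.type="any".
out <- findMarkers(sce, groups=kout$cluster)
names(out)
out[[1]]
# More customization of the tests:
# Wilcoxon rank sum tests instead of t-tests.
out <- findMarkers(sce, groups=kout$cluster, test.type="wilcox")
out[[1]]
# Test for upregulation beyond a log-fold change threshold of 1.
out <- findMarkers(sce, groups=kout$cluster, lfc=1, direction="up")
out[[1]]
# Require genes to be DE against all other groups.
out <- findMarkers(sce, groups=kout$cluster, pval.type="all")
out[[1]]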