Finds markers (differentially expressed genes) for identity classes
FindMarkers(object, ...)
## Default S3 method:
FindMarkers(
object,
slot = "data",
counts = numeric(),
cells.1 = NULL,
cells.2 = NULL,
features = NULL,
reduction = NULL,
logfc.threshold = 0.25,
test.use = "wilcox",
min.pct = 0.1,
min.diff.pct = -Inf,
verbose = TRUE,
only.pos = FALSE,
max.cells.per.ident = Inf,
random.seed = 1,
latent.vars = NULL,
min.cells.feature = 3,
min.cells.group = 3,
pseudocount.use = 1,
...
)
## S3 method for class 'Seurat'
FindMarkers(
object,
ident.1 = NULL,
ident.2 = NULL,
group.by = NULL,
subset.ident = NULL,
assay = NULL,
slot = "data",
reduction = NULL,
features = NULL,
logfc.threshold = 0.25,
test.use = "wilcox",
min.pct = 0.1,
min.diff.pct = -Inf,
verbose = TRUE,
only.pos = FALSE,
max.cells.per.ident = Inf,
random.seed = 1,
latent.vars = NULL,
min.cells.feature = 3,
min.cells.group = 3,
pseudocount.use = 1,
...
)
object |
An object |
... |
Arguments passed to other methods and to specific DE methods |
slot |
Slot to pull data from; note that if |
counts |
Count matrix if using scale.data for DE tests. This is used for computing pct.1 and pct.2 and for filtering features based on fraction expressing |
cells.1 |
Vector of cell names belonging to group 1 |
cells.2 |
Vector of cell names belonging to group 2 |
features |
Genes to test. Default is to use all genes |
reduction |
Reduction to use in differential expression testing - will test for DE on cell embeddings |
logfc.threshold |
Limit testing to genes which show, on average, at least X-fold difference (log-scale) between the two groups of cells. Default is 0.25. Increasing logfc.threshold speeds up the function, but can miss weaker signals. |
test.use |
Denotes which test to use. Available options are:
|
min.pct |
Only test genes that are detected in a minimum fraction of min.pct cells in either of the two populations. Meant to speed up the function by not testing genes that are very infrequently expressed. Default is 0.1. |
min.diff.pct |
Only test genes that show a minimum difference in the fraction of detection between the two groups. Set to -Inf by default. |
verbose |
Print a progress bar once expression testing begins |
only.pos |
Only return positive markers (FALSE by default) |
max.cells.per.ident |
Downsample each identity class to a maximum number of cells. Not activated by default (set to Inf). |
random.seed |
Random seed for downsampling |
latent.vars |
Variables to test, used only when |
min.cells.feature |
Minimum number of cells expressing the feature in at least one of the two groups, currently only used for poisson and negative binomial tests |
min.cells.group |
Minimum number of cells in one of the groups |
pseudocount.use |
Pseudocount to add to averaged expression values when calculating logFC. 1 by default. |
ident.1 |
Identity class to define markers for; pass an object of class
|
ident.2 |
A second identity class for comparison; if |
group.by |
Regroup cells into a different identity class prior to performing differential expression (see example) |
subset.ident |
Subset a particular identity class prior to regrouping. Only relevant if group.by is set (see example) |
assay |
Assay to use in differential expression testing |
P-value adjustment is performed using Bonferroni correction based on the total number of genes in the dataset. Other correction methods are not recommended, as Seurat pre-filters genes using the arguments above, reducing the number of tests performed. Lastly, as Aaron Lun has pointed out, p-values should be interpreted cautiously, as the genes used for clustering are the same genes tested for differential expression.
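The effect of correcting against all genes rather than only the tested ones can be illustrated with a minimal base-R sketch. The gene names, p-values, and gene count below are made up for illustration; only `p.adjust` (from base R's `stats` package) is a real function here, and this is not the package's internal code.

```r
# Hypothetical toy numbers: 1000 genes in the dataset, of which only 4
# passed pre-filtering and were actually tested.
n_genes_total <- 1000
p_val <- c(GeneA = 1e-6, GeneB = 2e-4, GeneC = 0.01, GeneD = 0.3)

# Bonferroni correction scaled by all genes in the dataset
p_val_adj <- p.adjust(p_val, method = "bonferroni", n = n_genes_total)

# For comparison: correcting only for the genes that were tested is less strict
p_val_adj_tested <- p.adjust(p_val, method = "bonferroni")

print(data.frame(p_val, p_val_adj, p_val_adj_tested))
```

In this sketch the adjusted value is the raw p-value multiplied by the number of genes and capped at 1, so a gene that looks significant when only the tested genes are counted may no longer be once the full dataset is used as the denominator.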
data.frame with a ranked list of putative markers as rows, and associated statistics as columns (p-values, ROC score, etc., depending on the test used (test.use)). The following columns are always present:
avg_logFC: log fold-change of the average expression between the two groups. Positive values indicate that the gene is more highly expressed in the first group
pct.1: The percentage of cells where the gene is detected in the first group
pct.2: The percentage of cells where the gene is detected in the second group
p_val_adj: Adjusted p-value, based on Bonferroni correction using all genes in the dataset
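To make these columns concrete, here is a rough base-R sketch that derives them from a toy matrix. The matrix values and cell names are made up, detection is taken to mean expression greater than zero, and the fold-change formula (undo the log transform, average, add the pseudocount, take the log again) is an assumption about log-normalized data, not a statement of the package's exact internals. Detection is expressed as a fraction between 0 and 1.

```r
# Toy log-normalized expression matrix: 2 genes x 6 cells (values are made up)
expr <- matrix(
  c(2.0, 1.5, 0.0, 0.0, 0.0, 0.5,   # GeneA
    0.0, 0.0, 0.0, 1.0, 2.0, 1.5),  # GeneB
  nrow = 2, byrow = TRUE,
  dimnames = list(c("GeneA", "GeneB"), paste0("cell", 1:6))
)
cells.1 <- c("cell1", "cell2", "cell3")
cells.2 <- c("cell4", "cell5", "cell6")
pseudocount.use <- 1

# Fraction of cells in each group where the gene is detected (expression > 0)
pct.1 <- rowMeans(expr[, cells.1] > 0)
pct.2 <- rowMeans(expr[, cells.2] > 0)

# Assumed formula: undo the log1p, average, add the pseudocount, log again
mean_log <- function(x) log(mean(expm1(x)) + pseudocount.use)
avg_logFC <- apply(expr[, cells.1], 1, mean_log) -
  apply(expr[, cells.2], 1, mean_log)

print(data.frame(avg_logFC, pct.1, pct.2))
```

GeneA, which is expressed mostly in the first group, gets a positive avg_logFC, while GeneB, which is expressed only in the second group, gets a negative one.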
McDavid A, Finak G, Chattopadhyay PK, et al. Data exploration, quality control and testing in single-cell qPCR-based gene expression experiments. Bioinformatics. 2013;29(4):461-467. doi:10.1093/bioinformatics/bts714
Trapnell C, et al. The dynamics and regulators of cell fate decisions are revealed by pseudotemporal ordering of single cells. Nature Biotechnology volume 32, pages 381-386 (2014)
Andrew McDavid, Greg Finak and Masanao Yajima (2017). MAST: Model-based Analysis of Single Cell Transcriptomics. R package version 1.2.1. https://github.com/RGLab/MAST/
Love MI, Huber W and Anders S (2014). "Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2." Genome Biology. https://bioconductor.org/packages/release/bioc/html/DESeq2.html
# Find markers for cluster 2
markers <- FindMarkers(object = pbmc_small, ident.1 = 2)
head(x = markers)
# Take all cells in cluster 2, and find markers that separate cells in the 'g1' group (metadata
# variable 'groups')
suppressWarnings(markers <- FindMarkers(pbmc_small, ident.1 = "g1", group.by = 'groups', subset.ident = "2"))
head(x = markers)
# Pass 'clustertree' or an object of class phylo to ident.1 and
# a node to ident.2 as a replacement for FindMarkersNode
# pbmc_small <- BuildClusterTree(object = pbmc_small)
# markers <- FindMarkers(object = pbmc_small, ident.1 = 'clustertree', ident.2 = 5)
# head(x = markers)