getClassAUC implements one way to investigate clustering quality. It processes the output of sortGenes to obtain, for each cell cluster, a curve of gene specificity scores plotted against their rank in that cluster. The Area Under the Curve (AUC) can be used as a measure of clustering quality, in terms of how readily cell clusters can be identified using a few marker genes. See Details.
getClassAUC(gs, markers = NULL, plotCurves = TRUE, colors = NULL)
gs | A list containing
markers | A character vector of gene names to restrict this analysis to. See Details.
plotCurves | Should a plot be drawn? Default value is TRUE.
colors | Color palette for the plot.
Given the specificity scores for all genes in a certain cell cluster, we can assume that a well-separated, easily identified cell cluster will have a relatively small number of genes with very high specificity scores. By contrast, the top marker genes for a cluster that is poorly separated from other cell clusters will have average or low specificity scores. Sorting the genes for each cell cluster by their specificity scores and plotting the scaled scores in order creates a curve that should be far from the diagonal for well-separated clusters but close to the diagonal for poorly separated clusters. The AUC of this curve quantifies this intuition and serves as a clustering quality metric.
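The intuition above can be illustrated with a minimal sketch in base R. It is not the package's implementation: curveAUC is a hypothetical helper, the toy score vectors are made up, and the descending sort order, min-max scaling and trapezoidal rule are all assumptions chosen for illustration. The orientation and scaling used inside getClassAUC may differ.

```r
# Hypothetical helper (not part of the package): area under the curve of
# scaled, sorted scores plotted against scaled rank.
# Assumes the scores are not all equal, otherwise the scaling divides by zero.
curveAUC <- function(scores) {
  s <- sort(scores, decreasing = TRUE)
  s <- (s - min(s)) / (max(s) - min(s))    # scale scores to [0, 1]
  x <- seq(0, 1, length.out = length(s))   # scale ranks to [0, 1]
  # trapezoidal rule over consecutive points of the curve
  sum(diff(x) * (head(s, -1) + tail(s, -1)) / 2)
}

# Toy "well-separated" cluster: two genes with very high scores, the rest near zero
sharp <- c(1, 0.9, rep(0.01, 98))

# Toy "poorly separated" cluster: scores decay linearly with rank,
# so the curve lies on the diagonal
flat <- seq(1, 0, length.out = 100)

curveAUC(sharp)  # far from the diagonal: area well below 0.5
curveAUC(flat)   # on the diagonal: area of about 0.5
```

In this sketch the diagonal corresponds to an area of 0.5, so the distance of the area from 0.5 plays the role of the separation measure. How the package converts the curve into its reported AUC is not specified here.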
getClassAUC returns a numeric vector of length ncol($specScore) that contains the AUC for each cell cluster.
Mahmoud M Ibrahim <mmibrahim@pm.me>
getMarkers returns a cell cluster Shannon index that tends to correlate well with the AUC metric returned by getClassAUC.
# randomly generated expression matrix and cell clusters
set.seed(1234)
exp = matrix(sample(0:20, 1000, replace = TRUE), ncol = 20)
rownames(exp) = paste0("g", 1:50)
cellType = sample(c("cell type 1", "cell type 2"), 20, replace = TRUE)
sg = sortGenes(exp, cellType)
classAUC = getClassAUC(sg)

# "reasonably" separated clusters
data(sim)
sg = sortGenes(sim$exp, sim$cellType)
classAUC = getClassAUC(sg)

# real data with three well separated clusters
data(kidneyTabulaMuris)
sg = sortGenes(kidneyTabulaMuris$exp, kidneyTabulaMuris$cellType)
classAUC = getClassAUC(sg)