Description Usage Arguments Details Value See Also Examples
View source: R/03_exploreThresholds.R
Plots all the AUC histograms (per gene-set) and calculates several likely thresholds for each gene-set
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 | AUCell_exploreThresholds(
cellsAUC,
thrP = 0.01,
nCores = 1,
smallestPopPercent = 0.25,
plotHist = TRUE,
densAdjust = 2,
assignCells = FALSE,
nBreaks = 100,
verbose = TRUE
)
getThresholdSelected(aucellThresholds)
getAssignments(aucellThresholds)
|
cellsAUC |
AUC object returned by |
thrP |
Probability to determine outliers in some of the distributions (see 'details' section). By default it is set to 1% (thrP): if there are 3000 cells in the dataset, it is expected that approximately 30 cells are over this threshold if the AUC is normally distributed. |
nCores |
Number of cores to use for computation. |
smallestPopPercent |
Size (percentage) of the smallest population of cells expected. Used to calculate some of the thresholds. |
plotHist |
Whether to plot the AUC histograms. (TRUE / FALSE) |
densAdjust |
Parameter for the density curve. (See |
assignCells |
Return the list of cells that pass the automatically selected threshold? (TRUE/FALSE) |
nBreaks |
Number of bars to plot in the histograms. |
verbose |
Should the function show progress messages? (TRUE / FALSE) |
aucellThresholds |
For aux functions: Output from AUCell_exploreThresholds |
To ease the selection of an assignment theshold, this function adjusts the AUCs of each gene-set to several distributions and calculates possible thresholds:
minimumDens
(plot in Blue):
Inflection point of the density curve.
This is usually a good option for the ideal situation
with bimodal distributions.
To avoid false positives, by default this threshold will not be chosen if the second distribution is higher (i.e. the majority of cells have the gene-set "active").
L_k2
(plot in Red): Left distribution,
after adjusting the AUC to a mixture of two distributions.
The threshold is set to the right (prob: 1-(thrP/nCells)
).
Only available if 'mixtools' package is installed.
R_k3
(plot in Pink): Right distribution,
after adjusting the AUC to a mixture of three distributions.
The threshold is set to the left (prob: thrP
).
Only available if 'mixtools' package is installed.
Global_k1
(plot in Grey): "global" distribution
(i.e. mean and standard deviations of all cells).
The threshold is set to the right (prob: 1-(thrP/nCells)
).
The threshold based on the global distribution is ignored from the automatic selection unless the mixed models are overlapping.
Note: If assignCells=TRUE
, the highest threshold is used to select cells.
However, keep in mind that this function is only meant to ease the selection
of the threshold, and we highly recommend to look at the AUC histograms and
adjust the threshold manually if needed.
We recommend to be specially aware on gene-sets with few genes (10-15) and
thresholds that are set extremely low.
List with the following elements for each gene-set:
'aucThr' Thresholds calculated with each method (see 'details' section), and the number of cells that would be assigned using that threshold.
If assignCells=TRUE, the threshold selected automatically is the highest value (in most cases, excluding the global distribution).
'assignment' List of cells that pass the selected AUC threshold (if assignCells=TRUE
)
If plotHist=TRUE
the AUC histogram is also plot,
including the distributions calculated and the corresponding thresholds
in the same color (dashed vertical lines).
The threshold that is automatically selected is shown as a thicker
non-dashed vertical line.
Previous step in the workflow: AUCell_calcAUC
.
See the package vignette for examples and more details:
vignette("AUCell")
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 | # This example is run using a fake expression matrix.
# Therefore, the output will be meaningless.
############# Fake expression matrix #############
set.seed(123)
exprMatrix <- matrix(data=sample(c(rep(0, 5000), sample(1:3, 5000, replace=TRUE))),
nrow=20,
dimnames=list(paste("Gene", 1:20, sep=""),
paste("Cell", 1:500, sep="")))
dim(exprMatrix)
##################################################
######### Previous steps in the workflow #########
# Step 1.
cells_rankings <- AUCell_buildRankings(exprMatrix, plotStats=FALSE)
# Step 2.
# (Gene sets: random genes)
geneSets <- list(geneSet1=sample(rownames(exprMatrix), 10),
geneSet2=sample(rownames(exprMatrix), 5))
cells_AUC <- AUCell_calcAUC(geneSets, cells_rankings, aucMaxRank=5)
##################################################
############## Step 3: Assign cells ##############
# 1. Plot histograms and obtain some pre-computed thresholds
# (this example is only meant to show the interface/arguments of the function,
# see the vignette for meaningful examples)
set.seed(123)
par(mfrow=c(1,2)) # Plot is divided into one row and two columns
thresholds <- AUCell_exploreThresholds(cells_AUC, plotHist=TRUE)
thresholds$geneSet1$aucThr
# 2. Obtain cells over a given threshold:
names(which(getAUC(cells_AUC)["geneSet1",] > 0.19))
# Alternative: assign cells according to the 'automatic' threshold
cells_assignment <- AUCell_exploreThresholds(cells_AUC,
plotHist=FALSE, assignCells=TRUE)
# Cells assigned:
getAssignments(cells_assignment)
# Threshold applied:
getThresholdSelected(cells_assignment)
|
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.