autofilter | R Documentation |
Apply simple cutoffs and discover data-driven thresholds for poor quality cells in scRNAseq.
autofilter( sobj, min_num_UMI, min_num_Feature, max_perc_mito, max_perc_hemoglobin, loess_negative_residual_threshold, mad.score.threshold, globalfilter.complexity, globalfilter.mito, globalfilter.libsize )
sobj |
seurat object |
min_num_UMI |
numeric, default is 1000, if no filter is desired set to -Inf |
min_num_Feature |
numeric, default is 200, if no filter is desired set to -Inf |
max_perc_mito |
numeric, default is 25, if no filter is desired set to Inf |
max_perc_hemoglobin |
numeric, default is 25, if no filter is desired set to Inf |
loess_negative_residual_threshold |
numeric, cutoff for loess residuals applied in complexity filtering, default is -3. Setting it higher than -2 will probably remove many good cells. |
mad.score.threshold |
numeric, default is 2.5, threshold for median absolute deviation (MAD) filtering, i.e. cutoffs set to |
globalfilter.complexity |
T/F, default T, whether to filter cells with lower than expected number of genes given number of UMIs |
globalfilter.mito |
T/F, default T, whether to filter cells with higher than normal mito content |
globalfilter.libsize |
T/F, default T, whether to filter cells with lower than normal UMI content |
Simple cutoffs include minimum number of UMIs, minimum number of unique genes detected, maximum percent mito, and maximum percent hemoglobin. More complex cutoffs are learnt for lower than expected complexity (defined for each cell as num unique genes / num UMIs). Additionally, median absolute deviation is used to exclude remaining cells with high mito content or low UMI content.
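The two kinds of cutoff described above can be sketched in base R on a toy QC table. This is an illustration only, not the autofilter source: the column names (nUMI, nFeature, perc_mito, perc_hemo), the helper names, the "median +/- threshold * MAD" form of the cutoff, and the use of log10 library size are all assumptions made for the example.

```r
# Toy per-cell QC table; column names are hypothetical, not autofilter's own.
set.seed(1)
n <- 500
qc <- data.frame(
  barcodes  = paste0("cell", seq_len(n)),
  nUMI      = round(rlnorm(n, meanlog = 8.5, sdlog = 0.5)),
  nFeature  = round(rlnorm(n, meanlog = 7.3, sdlog = 0.4)),
  perc_mito = pmin(100, rgamma(n, shape = 2, rate = 0.4)),
  perc_hemo = pmin(100, rgamma(n, shape = 1, rate = 1))
)

# Simple cutoffs, using the documented default values.
# A minimum of -Inf or a maximum of Inf makes that test pass for every cell.
simple_fail <- function(qc, min_num_UMI = 1000, min_num_Feature = 200,
                        max_perc_mito = 25, max_perc_hemoglobin = 25) {
  qc$nUMI < min_num_UMI |
    qc$nFeature < min_num_Feature |
    qc$perc_mito > max_perc_mito |
    qc$perc_hemo > max_perc_hemoglobin
}

# MAD-based cutoffs (assumed form: median +/- threshold * MAD).
# stats::mad() scales by 1.4826 by default.
mad_upper <- function(x, threshold = 2.5) median(x) + threshold * mad(x)
mad_lower <- function(x, threshold = 2.5) median(x) - threshold * mad(x)

failed_simple <- simple_fail(qc)
kept <- qc[!failed_simple, ]

# Among the remaining cells, flag high mito content and low UMI content.
# Assumption: library size is compared on the log10 scale.
high_mito <- kept$perc_mito > mad_upper(kept$perc_mito)
low_umi   <- log10(kept$nUMI) < mad_lower(log10(kept$nUMI))

table(failed_simple)
table(high_mito, low_umi)
```

Passing -Inf for the minimums and Inf for the maximums to simple_fail() flags no cells, which mirrors how the documented arguments disable a filter.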
Specifically, for complexity, a two-part model is fit to log(num Genes) ~ log(num UMIs) across cells: a linear model and a loess model, both using this formula. Low-complexity outliers are cells that have a Cook's distance greater than 4/n in the linear model (where n is the number of cells) and a low residual in the loess model. The residual cutoff is -3 by default, which captures only very low complexity outlier cells.
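A minimal base R sketch of this two-part outlier call on simulated data follows. It is not the package's implementation. In particular, the documentation does not say how the loess residuals are scaled, so the example assumes they are converted to z-scores before being compared with the threshold; the simulated values and variable names are also made up for illustration.

```r
# Simulate log(UMI) and log(gene) counts with a tight linear relationship.
set.seed(2)
n <- 400
log_umi  <- rnorm(n, mean = 8.5, sd = 0.6)
log_gene <- 1.5 + 0.7 * log_umi + rnorm(n, sd = 0.08)

# Plant 5 low-complexity cells: far fewer genes than expected for their UMIs.
bad <- 1:5
log_umi[bad]  <- seq(8.8, 9.4, length.out = 5)
log_gene[bad] <- 1.5 + 0.7 * log_umi[bad] - 1.0

# Part 1: linear model, outliers by Cook's distance > 4/n.
fit_lm <- lm(log_gene ~ log_umi)
cooks  <- unname(cooks.distance(fit_lm))

# Part 2: loess model, outliers by strongly negative residuals.
# Assumption: residuals are standardized to z-scores before thresholding.
fit_loess <- loess(log_gene ~ log_umi)
res_z     <- as.numeric(scale(residuals(fit_loess)))

loess_negative_residual_threshold <- -3

# A cell is called low complexity only if it fails both parts.
low_complexity <- cooks > 4 / n & res_z < loess_negative_residual_threshold
which(low_complexity)
```

Requiring both conditions keeps the call conservative: a cell that is merely influential in the linear fit, or merely below the loess curve, is not removed on that basis alone.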
a list object.
'cellstatus' = data.frame with cells, filtered out (T/F), filter reason, and other information.
'filtersummary' = small data.frame summarizing the cellstatus$filterreason information.
'allcommands' = commands passed to the autofilter function
'baseline_qc_summary' = summarizes distributions of key QC variables
'globalfilter.complexity' = summarizes the complexity filtering with plots and number cells removed
'globalfilter.libsize' = summarizes the libsize filtering with plots and number cells removed
'globalfilter.mito' = summarizes the mito filtering with plots and number cells removed
# identify outliers
af <- autofilter(sobj)

# remove outliers
goodcells <- af$cellstatus[af$cellstatus$filteredout == FALSE, "barcodes"]
sobj <- sobj[, goodcells]