| ZetaSuitSC | R Documentation |
This function evaluates the quality of cells detected in single-cell RNA-seq data by calculating a zeta score for each cell. The zeta score is based on the distribution of gene expression across different expression thresholds. A cutoff value is automatically determined using a two-component Gaussian mixture model to separate high-quality cells from low-quality or damaged cells.
ZetaSuitSC(countMatSC, binNum = 10, filter = TRUE)
countMatSC |
A matrix of single-cell RNA-seq count data where rows represent cells and columns represent genes. |
binNum |
The number of bins for zeta score calculation. Default is 10. The function creates expression thresholds from 0 to the 80th percentile of non-zero expression values, divided into binNum intervals. |
filter |
Logical. Whether to filter out cells with total read counts less than 100. Default is TRUE. This helps remove extremely low-quality cells before analysis. |
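The read-count filter described for the filter argument can be illustrated with a minimal base R sketch. The toy matrix and the names counts, min_reads and filtered are hypothetical and are not taken from the package source; only the threshold of 100 reads and the cells-in-rows orientation come from the descriptions above.

```r
# Hypothetical toy count matrix: rows are cells, columns are genes
counts <- rbind(
  cellA = rep(5L, 40),  # 200 reads in total
  cellB = rep(1L, 40),  #  40 reads in total
  cellC = rep(3L, 40)   # 120 reads in total
)

min_reads <- 100  # threshold stated in the filter argument description

# Keep cells whose total read count is at least min_reads
keep <- rowSums(counts) >= min_reads
filtered <- counts[keep, , drop = FALSE]

rownames(filtered)
```

Here cellB has fewer than 100 reads and is dropped, while cellA and cellC are retained.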
The function works as follows:
Filters cells based on total read count if filter=TRUE
Samples a subset of cells and genes for computational efficiency
Creates expression thresholds (bins) from 0 to the 80th percentile of non-zero expression values
For each cell, counts how many genes exceed each threshold
Calculates the zeta score as a weighted sum of these counts
Fits a two-component Gaussian mixture model to log10-transformed zeta scores
Determines an optimal cutoff to separate high-quality from low-quality cells
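The threshold, counting and weighted-sum steps above can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's implementation: the documentation does not say which weights are used, so the sketch assumes each threshold is weighted by the bin width. The toy matrix and all variable names are hypothetical.

```r
# Hypothetical toy matrix: rows are cells, columns are genes
counts <- rbind(
  good = c(0, 2, 4, 6, 8, 10, 12, 14, 16, 18),
  poor = c(0, 0, 0, 0, 0, 1, 1, 2, 2, 20)
)
bin_num <- 4

# Thresholds run from 0 to the 80th percentile of the non-zero values,
# giving bin_num intervals (bin_num + 1 threshold values)
nonzero <- counts[counts > 0]
upper <- unname(quantile(nonzero, probs = 0.8))
thresholds <- seq(0, upper, length.out = bin_num + 1)

# For each cell, count the genes whose expression exceeds each threshold
# (result: one row per cell, one column per threshold)
genes_above <- sapply(thresholds, function(t) rowSums(counts > t))

# ASSUMPTION: weight every threshold by the bin width; the real weights
# used by ZetaSuitSC are not given in this documentation
weights <- rep(diff(thresholds)[1], length(thresholds))
zeta <- drop(genes_above %*% weights)

zeta
```

In this toy example the cell with many moderately expressed genes receives a higher score than the cell dominated by a single highly expressed gene.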
A list containing:
zetaData |
A data frame with two columns: 'Cell' (cell identifiers) and 'Zeta' (calculated zeta scores) |
p_cutoff |
A ggplot object showing the distribution of log10-transformed zeta scores with fitted Gaussian mixture components and the determined cutoff threshold |
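To show how a two-component Gaussian mixture can yield a cutoff of the kind drawn in p_cutoff, the sketch below fits the mixture to simulated log10 zeta scores with a hand-written EM loop in base R. It is only an illustration: the data are simulated, all names are hypothetical, and the cutoff rule (the point between the two means where the weighted component densities are equal) is an assumption, since the documentation does not state how ZetaSuitSC chooses its cutoff or which fitting routine it uses.

```r
# Simulated log10 zeta scores (hypothetical): low- and high-quality groups
set.seed(42)
x <- c(rnorm(100, mean = 1, sd = 0.3), rnorm(300, mean = 3, sd = 0.3))

# Initialise the two components at the extremes of the data
mu <- range(x)
sigma <- rep(sd(x), 2)
lambda <- c(0.5, 0.5)  # mixing proportions

for (iter in 1:200) {
  # E-step: responsibility of component 1 for each observation
  d1 <- lambda[1] * dnorm(x, mu[1], sigma[1])
  d2 <- lambda[2] * dnorm(x, mu[2], sigma[2])
  resp <- d1 / (d1 + d2)

  # M-step: update proportions, means and standard deviations
  lambda <- c(mean(resp), 1 - mean(resp))
  mu <- c(sum(resp * x) / sum(resp),
          sum((1 - resp) * x) / sum(1 - resp))
  sigma <- c(sqrt(sum(resp * (x - mu[1])^2) / sum(resp)),
             sqrt(sum((1 - resp) * (x - mu[2])^2) / sum(1 - resp)))
}

# ASSUMED cutoff rule: where the weighted densities of the two components cross
gap <- function(z) {
  lambda[1] * dnorm(z, mu[1], sigma[1]) - lambda[2] * dnorm(z, mu[2], sigma[2])
}
cutoff <- uniroot(gap, interval = mu)$root

cutoff
```

Cells whose log10 zeta score falls above such a cutoff would be treated as high quality under this assumed rule.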
Yajing Hao, Shuyang Zhang, Junhui Li, Guofeng Zhao, Xiang-Dong Fu
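# Load the example single-cell count matrix
data(countMatSC)
# Score every cell using 50 bins, dropping cells with fewer than 100 total reads
zetaDataSC <- ZetaSuitSC(countMatSC, binNum=50, filter=TRUE)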