classifySex | R Documentation |
This function will predict the sex for each cell in scRNA-seq data. The classifier is based on logistic regression models that have been trained on mouse and human single cell RNA-seq data.
classifySex(x, genome = NULL, qc = TRUE)
x |
counts matrix, rows correspond to genes and columns correspond to cells. Row names must be gene symbols. |
genome |
the genome the data arises from. Current options are human: genome = "Hs" or mouse: genome = "Mm". |
qc |
logical, indicates whether to perform quality control before classification. With qc = TRUE, only cells that pass quality control are classified; cells that are filtered out are not classified. With qc = FALSE, every cell is classified except those with zero counts for both *XIST/Xist* and the sum of the Y genes. Default is TRUE. |
For bulk RNA-seq, checking the sex of the samples for mouse and human experiments is trivial as we can simply check the expression of *Xist/XIST*. It is not as simple for single cell RNA-seq data as the number of counts measured per gene and per cell is often quite low. Simply relying on cut-offs on the expression of genes like *Xist* means that many cells are unable to be classified. Hence we have developed a classifier based on a combination of X- and Y-linked genes in order to accurately predict the sex of each cell.
Cells with zero counts on Xist and the sum of the Y chromosome genes will not be classified as there is simply not enough information to accurately classify as Male/Female, and NAs will be returned. In addition, the user has the option to perform quality control on the data first, by specifying qc=TRUE, which will not classify cells that are deemed low-quality.
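To make the zero-count rule above concrete, the following base R sketch flags cells that have no counts for XIST and no counts across a set of Y-linked genes. It is an illustration only: the toy matrix and the `y_genes` vector are assumptions made for this example, and the set of Y genes that `classifySex` uses internally may differ.

```r
# Hypothetical Y-linked gene set for illustration; not necessarily the
# set used internally by classifySex().
y_genes <- c("DDX3Y", "EIF1AY", "RPS4Y1")

# Toy counts matrix: rows are gene symbols, columns are cells.
counts <- matrix(
  c(5, 0, 0, 0,   # XIST
    0, 3, 0, 1,   # DDX3Y
    0, 1, 0, 0,   # EIF1AY
    0, 2, 0, 0,   # RPS4Y1
    7, 4, 9, 2),  # ACTB (autosomal, not used below)
  nrow = 5, byrow = TRUE,
  dimnames = list(c("XIST", y_genes, "ACTB"), paste0("cell", 1:4))
)

# Sum the counts over the Y genes that are present in the matrix.
y_sum <- colSums(counts[intersect(y_genes, rownames(counts)), , drop = FALSE])

# A cell is uninformative when both XIST and the Y-gene sum are zero.
uninformative <- counts["XIST", ] == 0 & y_sum == 0

# Names of the cells that carry no sex-specific signal.
names(which(uninformative))
```

In this toy matrix only `cell3` has zero counts for both XIST and the Y genes, so it is the kind of cell for which an NA would be expected.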
a data frame with the predicted sex label for each cell
Xinyi Jin
library(speckle)
library(SingleCellExperiment)
library(CellBench)
library(org.Hs.eg.db)

sc_data <- load_sc_data()
sc_10x <- sc_data$sc_10x
counts <- counts(sc_10x)

ann <- select(org.Hs.eg.db, keys=rownames(sc_10x),
              columns=c("ENSEMBL","SYMBOL"), keytype="ENSEMBL")
m <- match(rownames(counts), ann$ENSEMBL)
rownames(counts) <- ann$SYMBOL[m]

sex <- classifySex(counts, genome="Hs")
table(sex$prediction)
boxplot(counts["XIST",]~sex$prediction)