| classifySex | R Documentation |
This function will predict the sex for each cell in scRNA-seq data. The classifier is based on logistic regression models that have been trained on mouse and human single cell RNA-seq data.
classifySex(x, genome = NULL, qc = TRUE)
x |
counts matrix, rows correspond to genes and columns correspond to cells. Row names must be gene symbols. |
genome |
the genome the data arises from. Current options are human: genome = "Hs" or mouse: genome = "Mm". |
qc |
logical, indicates whether to perform quality control or not. qc = TRUE will predict cells that pass quality control only and the filtered cells will not be classified. qc = FALSE will predict every cell except the cells with zero counts on *XIST/Xist* and the sum of the Y genes. Default is TRUE. |
For bulk RNA-seq, checking the sex of the samples for mouse and human experiments is trivial as we can simply check the expression of *Xist/XIST*. It is not as simple for single cell RNA-seq data as the number of counts measured per gene and per cell is often quite low. Simply relying on cut-offs on the expression of genes like *Xist* means that many cells are unable to be classified. Hence we have developed a classifier based on a combination of X- and Y-linked genes in order to accurately predict the sex of each cell.
Cells with zero counts on Xist and the sum of the Y chromosome genes will
not be classified as there is simply not enough information to accurately
classify as Male/Female, and NAs will be returned. In addition, the user has
the option to perform quality control on the data first, by specifying
qc=TRUE, which will not classify cells that are deemed low-quality.
a dataframe with predicted labels for each cell
Xinyi Jin
library(speckle)
library(SingleCellExperiment)
library(CellBench)
library(org.Hs.eg.db)
sc_data <- load_sc_data()
sc_10x <- sc_data$sc_10x
counts <- counts(sc_10x)
ann <- select(org.Hs.eg.db, keys=rownames(sc_10x),
columns=c("ENSEMBL","SYMBOL"), keytype="ENSEMBL")
m <- match(rownames(counts), ann$ENSEMBL)
rownames(counts) <- ann$SYMBOL[m]
sex <- classifySex(counts, genome="Hs")
table(sex$prediction)
boxplot(counts["XIST",]~sex$prediction)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.