UCell: Robust and scalable single-cell gene signature scoring

UCell R Documentation

UCell: Robust and scalable single-cell gene signature scoring

Description

UCell is an R package for scoring gene signatures in single-cell datasets. UCell scores, based on the Mann-Whitney U statistic, are robust to dataset size and heterogeneity, and computing them requires relatively less time and memory than most other methods, which makes it possible to process large datasets (> 10^5 cells). UCell can be applied to any gene vs. cell data matrix, and includes functions that interact directly with Seurat and SingleCellExperiment objects.
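To make the idea of a rank-based score concrete, here is a minimal base-R sketch for a single cell. It is not the package's implementation: the helper name u_score_sketch, the rank cap max_rank, and the rescaling 1 - U / (n * max_rank) are assumptions chosen for illustration.

```r
# Hypothetical helper, not part of the UCell API.
# expr:      named numeric vector of expression values for ONE cell
# signature: character vector of gene names
# max_rank:  assumed cap; ranks above it are set to max_rank + 1
u_score_sketch <- function(expr, signature, max_rank = 1500) {
  # Rank genes so that the most highly expressed gene gets rank 1
  ranks <- rank(-expr, ties.method = "average")
  # Cap the ranks of lowly expressed genes
  ranks[ranks > max_rank] <- max_rank + 1

  # Keep only signature genes that are present in the data
  sig_ranks <- ranks[intersect(signature, names(expr))]
  n <- length(sig_ranks)
  if (n == 0) return(NA_real_)

  # Mann-Whitney U statistic of the signature genes
  u <- sum(sig_ranks) - n * (n + 1) / 2

  # Assumed rescaling: higher values mean the signature genes rank near the top
  1 - u / (n * max_rank)
}

# Toy cell with five genes (made-up values)
expr <- c(CD8A = 5, CD8B = 4, CD4 = 0, FOXP3 = 0, MS4A1 = 1)
u_score_sketch(expr, c("CD8A", "CD8B"), max_rank = 3)   # top-ranked genes
u_score_sketch(expr, c("CD4", "FOXP3"), max_rank = 3)   # unexpressed genes
```

Because the score uses only the within-cell ranks, rescaling all expression values of a cell by a positive constant leaves it unchanged.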

UCell functions

  • ScoreSignatures_UCell: Calculates module enrichment scores from single-cell data. Given a gene vs. cell matrix (either as a sparse matrix or stored in a SingleCellExperiment object), it calculates module/signature enrichment scores. These scores depend only on the gene activity ranks within each individual cell, and are therefore robust across datasets.

  • AddModuleScore_UCell: A wrapper for UCell that interacts directly with Seurat objects. Given a Seurat object and a set of signatures, it calculates enrichment scores at the single-cell level and stores them in the meta.data of the input Seurat object.

  • StoreRankings_UCell: Calculates and stores gene rankings for a single-cell dataset. Given a gene vs. cell matrix, it calculates the expression rankings of all genes in each cell. The stored rankings can then be passed to ScoreSignatures_UCell to evaluate gene signatures on the gene expression ranks of individual cells.

  • SmoothKNN: Smooths signature scores using a weighted average of the scores of the first k nearest neighbors (kNN). It can be useful to 'impute' scores from neighboring cells and partially correct for data sparsity. While this function was designed to smooth UCell scores, it can be applied to any numerical metadata contained in SingleCellExperiment or Seurat objects.
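The kNN smoothing idea can be sketched in base R as follows. This is an illustration only, not the SmoothKNN implementation: the helper name smooth_knn_sketch, the use of Euclidean distances on a coordinate matrix, and the geometric decay weights are all assumptions.

```r
# Hypothetical helper, not part of the UCell API.
# scores: numeric vector with one score per cell
# coords: numeric matrix, one row per cell (e.g. a low-dimensional embedding)
# k:      number of nearest neighbors to average over (needs k < number of cells)
# decay:  assumed down-weighting per neighbor rank
smooth_knn_sketch <- function(scores, coords, k = 2, decay = 0.1) {
  d <- as.matrix(dist(coords))   # pairwise Euclidean distances between cells
  w <- (1 - decay)^(0:k)         # weight 1 for the cell itself, then decaying
  vapply(seq_along(scores), function(i) {
    # The cell itself (distance 0) followed by its k nearest neighbors;
    # assumes no two cells share identical coordinates
    nn <- order(d[i, ])[seq_len(k + 1)]
    sum(w * scores[nn]) / sum(w)
  }, numeric(1))
}

# Toy data: two pairs of neighboring cells (made-up values)
coords <- matrix(c(0, 0,
                   0, 1,
                   5, 5,
                   5, 6), ncol = 2, byrow = TRUE)
scores <- c(1, 0, 0.2, 0.4)
smoothed <- smooth_knn_sketch(scores, coords, k = 1, decay = 0.5)
```

In this toy example each cell's score is pulled toward the score of its single nearest neighbor, which is how smoothing can partially compensate for dropout in sparse data.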

Gene signatures

UCell evaluates the strength of gene signatures (or gene sets) in individual cells of your dataset. You may specify positive and negative (up- or down-regulated) genes in signatures. See the examples below:

markers <- list()
markers$Tcell_CD4 <- c("CD4","CD40LG")
markers$Tcell_CD8 <- c("CD8A","CD8B")
markers$Tcell_Treg <- c("FOXP3","IL2RA")
markers$Tcell_gd <- c("TRDC+", "TRGC1+", "TRGC2+", 
                      "TRDV1+","TRAC-","TRBC1-","TRBC2-")
markers$Tcell_NK <- c("FGFBP2+", "SPON2+", "KLRF1+",
                      "FCGR3A+", "CD3E-","CD3G-")

If you don't specify +/- for genes, they are all assumed to belong to the positive set. The UCell score is calculated as:

U = max(0, U^+ - w_{neg} * U^-)

where U^+ and U^- are the UCell scores for the positive and negative sets, respectively, and w_{neg} is a weight on the negative set. When no negative set of genes is present, U = U^+.
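As an illustration of the signed-signature convention and the formula above, the following base-R sketch splits a signature into positive and negative sets and combines two already computed scores. The helper names and the default w_neg = 1 are assumptions made for this example, not the package's internals.

```r
# Hypothetical helpers, not part of the UCell API.

# Split a signature into positive and negative gene sets based on a
# trailing "+" or "-"; genes without a suffix are treated as positive.
split_signature_sketch <- function(genes) {
  is_neg <- grepl("-$", genes)
  clean  <- sub("[+-]$", "", genes)   # strip only the trailing sign
  list(pos = clean[!is_neg], neg = clean[is_neg])
}

# Combine positive- and negative-set scores: U = max(0, U^+ - w_neg * U^-)
combine_scores_sketch <- function(u_pos, u_neg = 0, w_neg = 1) {
  pmax(0, u_pos - w_neg * u_neg)
}

sig <- split_signature_sketch(c("TRDC+", "TRGC1+", "TRAC-", "CD4"))
sig$pos   # "TRDC" "TRGC1" "CD4"
sig$neg   # "TRAC"

combine_scores_sketch(u_pos = 0.75, u_neg = 0.25)   # 0.5
combine_scores_sketch(u_pos = 0.25, u_neg = 0.75)   # 0, clipped at zero
combine_scores_sketch(u_pos = 0.75)                 # 0.75, no negative set
```

Clipping at zero keeps the combined score non-negative even when the negative set scores higher than the positive set.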

Author(s)

Maintainer: Massimo Andreatta <massimo.andreatta@unil.ch>

References

UCell: robust and scalable single-cell gene signature scoring. Massimo Andreatta & Santiago J Carmona (2021) CSBJ https://doi.org/10.1016/j.csbj.2021.06.043

carmonalab/UCell documentation built on Nov. 4, 2024, 5:32 p.m.