ClassifyManual: Manual cell type classification by applying gates

View source: R/ClassifyManual.R

ClassifyManual R Documentation

Manual cell type classification by applying gates

Description

This function performs a cell type classification by gating.

Usage

ClassifyManual(
  object,
  gates,
  metadata.name = "celltype",
  assay = DefaultAssay(object),
  layer = "data",
  restrict.ident = "Default",
  restrict.to = as.character(unique(Idents(object))),
  unclassified.name = "None",
  naming.mode = "replace"
)

Arguments

object

Seurat object. Must have metadata columns containing signature scores, such as those generated by ScoreSignatures, with names matching any gating filter supplied.

gates

character(1). Gate definitions, with the gates for different cell types separated by commas (,). Each gate gives the name of the cell type first, followed by an equals sign (=) and one or more conditions joined by the AND operator (&). Example: "celltype_a = sign_a > value_aa & sign_b < value_ab, celltype_b = sign_b > value_ba & sign_a < value_ab" where sign_a and sign_b are the names of metadata columns and value_xy are numeric values.
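To make the gate syntax concrete, the sketch below splits a gate string into one condition per cell type using base R only. The helper name parse_gates is hypothetical and is not part of the package; how ClassifyManual parses the string internally is not described here, so treat this as an illustration of the format rather than the actual implementation.

```r
# Hypothetical helper (not a burgertools function): split a gate string
# into a named character vector holding one condition string per cell type.
parse_gates <- function(gates) {
  # Gates for different cell types are separated by commas.
  parts <- trimws(strsplit(gates, ",", fixed = TRUE)[[1]])
  parts <- parts[nzchar(parts)]
  # The first "=" separates the cell type name from its conditions.
  eq <- regexpr("=", parts, fixed = TRUE)
  conditions <- trimws(substring(parts, eq + 1))
  names(conditions) <- trimws(substring(parts, 1, eq - 1))
  conditions
}

gates <- "TC = TCsign > 1.2 & EPsign < 0.2,
          EP = EPsign > 1.1 & TCsign < 0.3"
parsed <- parse_gates(gates)
print(parsed)
```

The line break inside the string is harmless in this sketch because surrounding whitespace is trimmed from each gate.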

metadata.name

character(1). The name of the metadata column in which the cell type annotations are stored. Default: "celltype".

assay

character(1). If some signatures used for classification are stored as assay features, which assay should take priority. Default: DefaultAssay(object).

layer

character(1). If some signatures used for classification are stored as assay features, which layer should be used (Default: "data").

restrict.ident

character(1). The name of a metadata column containing celltype annotations, which will be used to define the subset of cells that should be classified. Default: current Idents(object).

restrict.to

character(n). Which celltypes (as defined in the restrict.ident metadata column) should be classified. Default: All cells.

unclassified.name

character(1). Which name should be given to the cells which do not pass any gate. Can be set to "keep" to keep the existing names for cells which don't pass any gate. Default: "None".

naming.mode

character(1). Whether the new cell type names replace the existing cell type names or are added to them as a prefix or suffix. Possible values: "replace", "prefix", "suffix". Default: "replace". When adding a prefix or suffix, unclassified.name typically needs to be set to the empty string "".

Details

Cell types (celltype_a, celltype_b, ...) are defined by applying gates (value_xy cutoffs) to cell scores (sign_x). The scores can be signatures from ScoreSignatures, genetic similarities to reference genotypes, normalized hashtag oligo (HTO) values, gene expression, QC parameters, etc. The gates have the form:

"celltype_a = sign_a > value_aa & sign_b < value_ab, celltype_b = sign_b > value_ba & sign_a < value_ab ...etc..."

The gating can be done iteratively by restricting to one or several cell types from a given identity (restrict.to and restrict.ident respectively). The gates can also keep existing celltype names and add a suffix or prefix.
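The restriction logic can be pictured on a plain data.frame, as in the base R sketch below. The data values are made up, and one point is an assumption rather than documented behaviour: cells outside the restricted subset are shown keeping their existing label, which may differ from what the function writes into a newly created metadata column.

```r
# Illustration only: a toy data.frame standing in for the Seurat metadata.
cells <- data.frame(
  celltype = c("TC", "TC", "EP", "TC"),
  CD8A = c(1.0, 0.1, 0.9, 0.2),
  CD4  = c(0.1, 0.9, 0.8, 0.1),
  stringsAsFactors = FALSE
)
restrict.ident <- "celltype"  # column defining the existing identities
restrict.to <- "TC"           # only these identities are gated
in.scope <- cells[[restrict.ident]] %in% restrict.to

# Assumption: cells outside the subset keep their existing label.
sub <- cells[[restrict.ident]]
sub[in.scope] <- "None"  # in-scope cells start as unclassified
sub[in.scope & cells$CD8A > 0.5 & cells$CD4 < 0.5] <- "CD8"
sub[in.scope & cells$CD4 > 0.5 & cells$CD8A < 0.5] <- "CD4"
cells$TC_subcelltype <- sub
print(cells)
```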

Scores are looked up in the metadata first, and in the assays otherwise. If a feature is found in several assays, use the assay parameter to set which assay takes priority. To combine features from several assays in the same gates, prefix each feature name with the name of its assay, separated by an underscore, as in "assay_feature".

Since single-cell data tends to be noisy and to contain many dropouts, imputation usually combines well with gating strategies, much as Gaussian filtering is applied before thresholding in most image analysis workflows. For this, see Impute() and CrossImpute().
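The image analysis analogy can be illustrated with a toy one-dimensional signal in base R. This sketch only shows why smoothing before thresholding recovers values lost to dropouts; it is not how Impute() or CrossImpute() work, the numbers are made up, and the kernel weights are an arbitrary Gaussian-like choice.

```r
# Toy signal: the first six positions are truly "high" (value 2) but
# some were lost to dropouts (0); the last six are truly "low".
x <- c(2, 0, 2, 2, 0, 2, 0, 0, 0, 0, 0, 0)
threshold <- 0.75

# Thresholding the raw signal misses the dropout positions.
raw.pass <- x > threshold

# Smooth with a small Gaussian-like kernel, repeating the edge values
# as padding so that the ends of the signal are defined.
padded <- c(x[1], x, x[length(x)])
smoothed <- as.numeric(stats::filter(padded, c(0.25, 0.5, 0.25), sides = 2))
smoothed <- smoothed[2:(length(x) + 1)]

# After smoothing, the dropout positions are above the threshold again.
smooth.pass <- smoothed > threshold
print(rbind(raw = raw.pass, smoothed = smooth.pass))
```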

Value

A Seurat object with an additional metadata column containing the cell type annotations, and with Idents() set to these annotations. Cells that pass no gate are given the name set by unclassified.name ("None" by default), and cells that pass several gates are labelled "Multiplet".
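The labelling rule described above can be sketched on a plain data.frame of scores. The function classify_by_gates is hypothetical and only mimics the documented outcome (one label per passed gate, a fallback name when no gate is passed, "Multiplet" when several are passed); the package's internal implementation may differ. The gates in this toy example deliberately overlap so that a multiplet appears.

```r
# Hypothetical re-implementation for illustration (not a burgertools function).
# scores: data.frame with one row per cell and one column per score.
# conditions: named character vector, one condition string per cell type.
classify_by_gates <- function(scores, conditions, unclassified.name = "None") {
  # Evaluate each condition with the score columns as variables.
  passed <- vapply(
    conditions,
    function(cond) as.logical(eval(parse(text = cond), envir = scores)),
    logical(nrow(scores))
  )
  passed <- matrix(passed, nrow = nrow(scores),
                   dimnames = list(NULL, names(conditions)))
  n.passed <- rowSums(passed)
  labels <- rep(unclassified.name, nrow(scores))
  # Cells passing exactly one gate take that gate's cell type name.
  for (ct in names(conditions)) {
    labels[passed[, ct] & n.passed == 1] <- ct
  }
  # Cells passing several gates are flagged, as described in the Value section.
  labels[n.passed > 1] <- "Multiplet"
  labels
}

scores <- data.frame(TCsign = c(1.5, 0.1, 0.5, 1.5),
                     EPsign = c(0.1, 1.4, 0.5, 1.4))
conditions <- c(TC = "TCsign > 1.2", EP = "EPsign > 1.1")
labels <- classify_by_gates(scores, conditions)
print(labels)
```

Adding mutually exclusive conditions to each gate, as in the Examples section, is one way to reduce the number of cells that fall into several gates.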

Examples

# MySeuratObject should contain an RNA assay with a log-normalized "data" layer.
# In this example cells are classified as T cells (TC) or Epithelial cells (EP) based on manual signatures.
# Note that cells that validate no gate will have cell type "None", and cells which validate several will be classified as "Multiplets".
# It's recommended to first do a signature scatter plot in order to choose threshold values for the gates.
MySeuratObject <- Impute(MySeuratObject)
signature.list <- list(TCsign=c("CD3D","CD3E","CD3G","CD4","CD8A","CD8B","GZMA","IFNG"),
                       EPsign=c("EPCAM","TFF3","CDX2","CLDN3","KRT8","KRT19","KRT18","OLFM4"))
MySeuratObject <- ScoreSignatures(MySeuratObject,signature.list)
gates <- "TC = TCsign > 1.2 & EPsign < 0.2,
          EP = EPsign > 1.1 & TCsign < 0.3"
MySeuratObject <- ClassifyManual(MySeuratObject,gates)
DimPlot(MySeuratObject) # Plot the current Idents, which were set to "celltype".
# Subclassify T cells and store the results in the metadata column "TC_subcelltype".
# The same approach could also be used to overwrite the identities of TC in an existing metadata column without altering the identities of other cells.
gates <- "CD8 = CD8A > 0.5 & CD4 < 0.5, CD4 = CD4 > 0.5 & CD8A < 0.5, DoublePositive = CD8A > 0.5 & CD4 > 0.5"
MySeuratObject <- ClassifyManual(MySeuratObject, gates, "TC_subcelltype", restrict.ident="celltype", restrict.to="TC")
# Mark dividing cells by adding a suffix to their existing cell type names:
MySeuratObject <- ClassifyManual(MySeuratObject, gates="Dividing = PCNA > 1 & MKI67 > 1", naming.mode = "suffix", unclassified.name="")

nbroguiere/burgertools documentation built on Jan. 30, 2024, 3:48 a.m.