all_times <- list()  # store the time for each chunk
knitr::knit_hooks$set(time_it = local({
  now <- NULL
  function(before, options) {
    if (before) {
      now <<- Sys.time()
    } else {
      res <- difftime(Sys.time(), now, units = "secs")
      all_times[[options$label]] <<- res
    }
  }
}))
knitr::opts_chunk$set(
  tidy = TRUE,
  tidy.opts = list(width.cutoff = 95),
  message = FALSE,
  warning = FALSE,
  time_it = TRUE,
  error = TRUE
)

Developed in collaboration with the Technology Innovation Group at NYGC, Cell Hashing uses oligo-tagged antibodies against ubiquitously expressed surface proteins to place a "sample barcode" on each single cell, enabling different samples to be multiplexed together and run in a single experiment. For more information, please refer to this paper.

This vignette gives a brief demonstration of how to work with Cell Hashing data in Seurat. Applied to two datasets, the approach successfully demultiplexes cells to their sample of origin and identifies cross-sample doublets.

The demultiplexing function `HTODemux()` implements the following procedure:
  • We perform a k-medoid clustering on the normalized HTO values, which initially separates cells into K + 1 clusters, where K is the number of samples.
  • We calculate a 'negative' distribution for HTO. For each HTO, we use the cluster with the lowest average value as the negative group.
  • For each HTO, we fit a negative binomial distribution to the negative cluster. We use the 0.99 quantile of this distribution as a threshold.
  • Based on these thresholds, each cell is classified as positive or negative for each HTO.
  • Cells that are positive for more than one HTO are annotated as doublets.
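
The steps above can be sketched on simulated data in a few lines of R. This is a minimal illustration under stated assumptions, not Seurat's implementation: the toy counts, the helper `clr()`, and all variable names are hypothetical, `cluster::pam()` stands in for the k-medoid step, and `MASS::fitdistr()` is used for the negative binomial fit.

```r
library(cluster)  # pam() for k-medoid clustering
library(MASS)     # fitdistr() for the negative binomial fit

set.seed(42)

# Toy data (assumption): 3 HTOs, 100 singlets per HTO, 20 A+B doublets
n_hto <- 3
n_per <- 100
n_dbl <- 20
n_cells <- n_hto * n_per + n_dbl

# Background counts for every HTO in every cell
counts <- matrix(rnbinom(n_hto * n_cells, size = 5, mu = 10), nrow = n_hto)
rownames(counts) <- paste0("HTO_", LETTERS[seq_len(n_hto)])

# Each singlet gets a strong signal for its own HTO
truth <- rep(rownames(counts), each = n_per)
for (i in seq_along(truth)) {
  counts[truth[i], i] <- rnbinom(1, size = 5, mu = 500)
}

# Doublets carry a strong signal for both HTO_A and HTO_B
dbl_idx <- (n_hto * n_per + 1):n_cells
counts[c("HTO_A", "HTO_B"), dbl_idx] <- rnbinom(2 * n_dbl, size = 5, mu = 500)

# Centered log-ratio style normalization of each HTO across cells
clr <- function(x) log1p(x / exp(sum(log1p(x[x > 0])) / length(x)))
norm <- t(apply(counts, 1, clr))

# Step 1: k-medoid clustering of cells into K + 1 clusters
cl <- pam(t(norm), k = n_hto + 1)$clustering

# Steps 2-3: per HTO, take the cluster with the lowest mean as the negative
# group, fit a negative binomial to its raw counts, and use the 0.99 quantile
thresholds <- sapply(rownames(counts), function(h) {
  avg <- tapply(norm[h, ], cl, mean)
  neg_cells <- cl == as.integer(names(which.min(avg)))
  fit <- suppressWarnings(fitdistr(counts[h, neg_cells], "negative binomial"))
  qnbinom(0.99, size = fit$estimate[["size"]], mu = fit$estimate[["mu"]])
})

# Step 4: positive/negative call per cell and HTO
# (thresholds recycle down the rows, so row i is compared to thresholds[i])
positive <- counts > thresholds

# Step 5: more than one positive HTO means doublet, none means negative
n_pos <- colSums(positive)
first_pos <- rownames(counts)[apply(positive, 2, which.max)]
call <- ifelse(n_pos == 0, "Negative", ifelse(n_pos > 1, "Doublet", first_pos))

print(table(call))
```

Because the threshold is the 0.99 quantile of the fitted background, a small fraction of true singlets is expected to exceed it for a second HTO and be labeled as doublets. In Seurat itself, the corresponding cutoff is controlled by the `positive.quantile` argument of `HTODemux()`.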
# 8-HTO dataset from human PBMCs
Dataset description:
  • Data represent peripheral blood mononuclear cells (PBMCs) from eight different donors.
  • Cells from each donor are uniquely labeled, using CD45 as a hashing antibody.
  • Samples were subsequently pooled and run on a single lane of the 10X Chromium v2 system.
  • You can download the count matrices for RNA and HTO [here](https://www.dropbox.com/sh/ntc33ium7cg1za1/AAD_8XIDmu4F7lJ-5sp-rGFYa?dl=0), or the FASTQ files from [GEO](https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE108313).
## Basic setup

Load packages

wzxhzdk:1

Read in data

wzxhzdk:2

Setup Seurat object and add in the HTO data

wzxhzdk:3

## Adding HTO data as an independent assay

You can read more about working with multi-modal data [here](multimodal_vignette.html)

wzxhzdk:4

## Demultiplex cells based on HTO enrichment

Here we use the Seurat function `HTODemux()` to assign single cells back to their sample origins.

wzxhzdk:5

## Visualize demultiplexing results

Output from running `HTODemux()` is saved in the object metadata. We can visualize how many cells are classified as singlets, doublets and negative/ambiguous cells.

wzxhzdk:6

Visualize enrichment for selected HTOs with ridge plots

wzxhzdk:7

Visualize pairs of HTO signals to confirm mutual exclusivity in singlets

wzxhzdk:8

Compare number of UMIs for singlets, doublets and negative cells

wzxhzdk:9

Generate a two dimensional tSNE embedding for HTOs. Here we are grouping cells by singlets and doublets for simplicity.

wzxhzdk:10

Create an HTO heatmap, based on Figure 1C in the Cell Hashing paper.

wzxhzdk:11

Cluster and visualize cells using the usual scRNA-seq workflow, and examine for the potential presence of batch effects.

wzxhzdk:12

wzxhzdk:13

# 12-HTO dataset from four human cell lines
Dataset description:
  • Data represent single cells collected from four cell lines: HEK, K562, KG1 and THP1
  • Each cell line was further split into three samples (12 samples in total).
  • Each sample was labeled with a hashing antibody mixture (CD29 and CD45), pooled, and run on a single lane of 10X.
  • Based on this design, we should be able to detect doublets both across and within cell types
  • You can download the count matrices for RNA and HTO [here](https://www.dropbox.com/sh/c5gcjm35nglmvcv/AABGz9VO6gX9bVr5R2qahTZha?dl=0); the data are also available on GEO [here](https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE108313).
## Create Seurat object, add HTO data and perform normalization

wzxhzdk:14

## Demultiplex data

wzxhzdk:15

## Visualize demultiplexing results

Distribution of selected HTOs grouped by classification, displayed by ridge plots

wzxhzdk:16

Visualize HTO signals in a heatmap

wzxhzdk:17

## Visualize RNA clustering
  • Below, we cluster the cells using our standard scRNA-seq workflow. As expected, we see four major clusters, corresponding to the cell lines.
  • In addition, we see small clusters in between, representing mixed transcriptomes that are correctly annotated as doublets.
  • We also see within-cell-type doublets, which are (perhaps unsurprisingly) intermixed with singlets of the same cell type.

wzxhzdk:18

wzxhzdk:19

**Session Info**

wzxhzdk:20


    satijalab/seurat documentation built on May 11, 2024, 4:04 a.m.