knitr::opts_chunk$set(
    collapse = TRUE,
    comment = "#>"
)
knitr::include_graphics("citeviz-logo-light-cropped.png")

Installation

Make sure you have R version 4.2.0 or later installed. We recommend installing CITEViz through the RStudio IDE using the following commands:

devtools::install_github("https://github.com/maxsonBraunLab/CITEViz.git")

Data Input Format

CITEViz accepts files in the RDS (.rds) format. The preferred RDS file should include a Seurat object or a SingleCellExperiment object. If a list of a single Seurat object is used, only the object labeled "integrated" will be used.

The input Seurat or SingleCellExperiment object must contain cell embeddings data for at least one dimensional reduction method (e.g. PCA, UMAP, tSNE, etc.). If the object contains data for more than one reduction, each reduction must have a unique name. For a SingleCellExperiment object, the main experiment and any alternative experiments in the SingleCellExperiment object must contain assay count data stored as "logcounts" (log-transformed, normalized counts) or "normcounts" (normalized counts). Please refer to the SingleCellExperiment reference manual on Bioconductor for help on how to determine the types of count data present in your SCE object. By default, CITEViz will use the logcounts for plotting expression data. If logcounts are not available, then CITEViz will use normcounts. If using a SingleCellExperiment object that was generated from Seurat-processed data or converted from a Seurat object, then the logcounts in the SingleCellExperiment object are analogous to the counts in the "data" slot for a given assay in a Seurat object.

Sample Data

In this vignette, we used a downsampled version of the PBMC dataset from Hao et. al. 2021. The sample PBMC dataset for this vignette can be downloaded as a 10K cell subset from Google Drive here.

Running CITEViz

To run CITEViz, use the following commands:

# load package
library(CITEViz)

# run CITEViz
run_app()

Uploading Data into CITEViz

After calling run_app() to start CITEViz, a file can be uploaded using the file upload box at the top of any page in CITEViz. After clicking the file upload box, a file explorer window containing files in the user's local file system will open. The user can select an RDS file containing CITE-seq data to upload, and the same dataset will be retained for exploration across all tabs.

knitr::include_graphics("file_upload_box.png")

Quality Control

The first step of analysis is to assess the quality of sequencing data. Here we provide QC plots that display data for common metrics such as gene or antibody-derived tag (ADT) counts per assay, number of unique ADTs, and mitochondrial expression, which can be visualized by any categorical metadata in the user’s Seurat or SingleCellExperiment object. Dotted lines in each QC plot represent the values at which 50%, 75%, and 95% of the data falls at or below that value.

knitr::include_graphics("qc-unique-adt.png")

Clustering

The Clustering page allows the user to view cell clusters in two- and three-dimensional space. These clusters can be colored by any categorical metadata, and the user can select dimensionality reductions (e.g., UMAP, PCA, etc.) to view from a dropdown menu.

knitr::include_graphics("clustering_page.png")

When the user’s cursor hovers over the 2D reduction plot, a Plotly toolbox with labeled options will appear. From this toolbox, the user can zoom, pan, download, and reset a plot by selecting an option. The user can also use the box or lasso selection tool to select specific cells in a plot. The metadata for selected cells appears in the scrollable, interactive data table below the plots, and the user can print or copy this data to their clipboard.

Feature Expression

CITEViz supports RNA and ADT feature expression visualizations in both one-dimensional and two-dimensional formats. On the feature expression tabs of CITEViz, cell clusters are displayed in a dimensional reduction plot. These cell clusters can be colored by expression levels of selected RNA and/or ADT features. Similar to the clustering page, the user can select the type of dimensional reduction to view.

One-Dimensional (1D) Feature Expression

In 1D feature expression, cells in a dimensional reduction plot are colored by expression levels for one RNA or ADT feature. The user can select a specific RNA or ADT feature from a dropdown menu.

knitr::include_graphics("feature_expression_CD8.png")

Two-Dimensional (2D) Feature Expression

Co-expression of features greatly facilitates a holistic view of single-cell multi-omic datasets. In 2D feature expression, cells in a dimensional reduction plot are colored by expression levels for two gene and/or ADT features simultaneously. The user can select specific gene and/or ADT features from dropdown menus. CITEViz can visualize co-expression of features from the same assay (i.e. two genes or two ADTs), as shown below, or two features from different assays (i.e. one gene and one ADT).

knitr::include_graphics("feature_expression_CD14_CD16.png")

Gating

A key feature of CITEViz is that users can iteratively gate (subset) cells using antibody markers, and the selected cells are immediately highlighted in the dimensional reduction space (e.g. UMAP, PCA, tSNE, etc.). Users can subset cells using one or more gates to explore specific cell populations similar to flow cytometry.

General Workflow

  1. Select plot settings: From dropdown menus in the sidebar panel, the user can select an assay (e.g. ADT) and two assay features (e.g. CD4, CD8) for which to plot normalized expression levels. The user can also select a reduction (e.g. UMAP) for which to plot dimensional reduction embeddings, and can color the cells in the reduction plot by a discrete metadata category (e.g. cell type).
knitr::include_graphics("gating_dropdown_menus.png")
  1. Select cells: When the user’s cursor hovers over each plot on the gating page, a Plotly toolbox with labeled options will appear. From this toolbox, the user can choose a cell selection tool, zoom, pan, download, or reset a plot by selecting an option. The user can use the box or lasso selection tool to select specific cells in the feature scatter plot. When cells in the feature scatter plot are selected, the corresponding cells in the reduction plot will be highlighted.
  2. Label the selected cells: To input a name for a subset of cells to be gated, use the text-input field of the sidebar panel prior to clicking the gate button.
  3. Click the gate button: After gating a subset of cells, information about the newly-created gate will be displayed in a scrollable, interactive data table below the plots.
  4. Label the selected cells: To input a name for a subset of cells to be gated, use the text-input field of the sidebar panel prior to clicking the gate button.
  5. Click the gate button: After gating a subset of cells, information about the newly-created gate will be displayed in a scrollable, interactive data table below the plots.

Additional Features

knitr::include_graphics("gate_datatable_renaming.png")

Example

In the following example, we demonstrate a 2-layer gate to isolate cells with specific levels of CD11b-1 and CD45-1 (mixture of myeloid and lymphoid cells) from the whole cell population, followed by the isolation of CD8+ CD4- cells from this first subset to view the CD8+ T-cell subpopulation:

  1. Select cells with relatively low levels of CD11b-1 and CD45-1.
knitr::include_graphics("gating_CD8_T_Cells.png")
  1. From the cells in the first gate, select cells with low levels of CD4 and high levels of CD8.
knitr::include_graphics("gating_CD8_T_Cells2.png")

Exporting Data from CITEViz

CITEViz uses interactive data visualization and exploration packages such as Plotly and DT. Plotly plots can be exported as PNG files, and DT datatables can be copied, printed, or exported as files for downstream analysis.

After gating for cell populations of interest (see [Gating][Gating] section of this vignette), the metadata for each gate is saved in an internal Gate object. These metadata include cell barcodes that can facilitate downstream analysis such as differential gene expression. To download this gating data as a list of gates in RDS format, the user can click the "List (.rds)" button at the bottom of the gating page.

Back-Gating

Users can back-gate on a selection of cells in a reduction plot and highlight them in feature space. On the back-gating page, users can highlight cells in feature space from a “labels-first” or “top-down” workflow. For example, users can select a range of cells from the reduction plot (e.g. UMAP, PCA etc.) and locate them in the feature scatter plot for more extensive data exploration.

General Workflow

Plot settings, cell population selection and naming methods are analogous to the gating page. However, back-gating requires the selection of a region of interest in the dimension reduction space. These cells will be highlighted in the feature scatter plot, where different the axes can be configured to the user's preferences.

Example

In the following example, we back-gate on a selection of B-cells in the WNN.UMAP reduction plot. These selected cells are then highlighted in black in the feature scatter plot.

knitr::include_graphics("backgate-b-cells.png")

Gated monocytes followed by differential expression

The gating data obtained from CITEViz can be read back into Seurat to facilitate differential expression. In the following example, we find differentially expressed genes between gated CD14 and CD16 monocytes. Gated cells can be found in the following Google Drive link.

library(Seurat)

# import original data
pbmc = readRDS("~/Downloads/small_pbmc.rds")

# import gate information
cd14_gate = readRDS("~/Downloads/CD14-Monocytes.rds")
cd16_gate = readRDS("~/Downloads/CD16-Monocytes.rds")

# extract cell barcodes in each gate
cd14_barcodes <- cd14_gate$gate_1@subset_cells[[1]]
cd16_barcodes <- cd16_gate$gate_1@subset_cells[[1]]
# make sure no overlapping cells between gates
cd14_barcodes <- setdiff(cd14_barcodes, cd16_barcodes)

# differential expression
diff_exp <- FindMarkers(pbmc, cells.1 = cd14_barcodes, cells.2 = cd16_barcodes)

diff_exp will contain results like the following:

| p_val | avg_log2FC | pct.1 | pct.2 | p_val_adj | |----------------------|------------|-------|-------|---------------| | FCGR3A 3.551593e-133 | -3.4302631 | 0.157 | 0.907 | 7.362098e-129 | | CDKN1C 5.972331e-123 | -2.6071919 | 0.088 | 0.729 | 1.238005e-118 | | HES4 1.920188e-107 | -1.4098503 | 0.083 | 0.682 | 3.980357e-103 | | CKB 6.424118e-102 | -0.8753508 | 0.018 | 0.372 | 1.331655e-97 | | RHOC 7.989362e-95 | -1.5387006 | 0.139 | 0.775 | 1.656115e-90 | | ADA 1.289123e-93 | -0.8313632 | 0.072 | 0.612 | 2.672222e-89 | | PTP4A3 3.844392e-82 | -0.6077669 | 0.049 | 0.496 | 7.969039e-78 | | MS4A4A 1.986837e-74 | -1.0672453 | 0.124 | 0.682 | 4.118514e-70 | | CD79B 6.349774e-69 | -0.8747941 | 0.075 | 0.527 | 1.316245e-64 | | CTSL 3.869449e-66 | -1.0800221 | 0.150 | 0.690 | 8.020980e-62 |

Gating metadata for each monocytic gate (gate counter, upstream gates, x and y selection coordinates, etc.) are found in the cd14_barcodes and cd16_barcodes variables. While differential expression may be a relatively simple example to re-use gated cells, users can use CITEViz to facilitate more sophisticated downstream analyses e.g. batch correction or pseudo-time inference of specific cell types of interest.

Session Info

sessionInfo()


maxsonBraunLab/CITE-Viz documentation built on Oct. 26, 2023, 9:52 p.m.