Here we will illustrate the first Scriabin workflow: generating and analyzing a full dataset cell-cell interaction matrix. We choose the panc8 dataset of pancreas islets that can be easily installed via SeuratData.
Because one dimension of the cell-cell interaction matrix is the (number of cells in the dataset)^2, it is impractical to generate and analyze this matrix for medium-to-large datasets. In these situations, we recommend trying the other two workflows of Scriabin. Namely, identification of cells whose magnitude of communication
Load libraries
library(Seurat) library(SeuratData) library(scriabin) library(tidyverse)
To install the panc8 dataset:
if (!requireNamespace("panc8.SeuratData", quietly = TRUE)) install.packages("https://seurat.nygenome.org/src/contrib/panc8.SeuratData_3.0.2.tar.gz", repos = NULL, type = "source") library(panc8.SeuratData) panc8 <- LoadData("panc8")
Analyze the fluidigmc1 dataset for the single dataset tutorial
panc_fl <- subset(panc8, cells = colnames(panc8)[panc8$tech=="fluidigmc1"]) panc_fl <- SCTransform(panc_fl, verbose = F) %>% RunPCA(verbose = F) %>% RunUMAP(dims = 1:30, verbose = F) DimPlot(panc_fl, group.by = "celltype", label = T, repel = T) + NoLegend()
We have shown that the sparsity and technical noise characteristic of many scRNA-seq platforms is propagated into the CCIM and can result in less robust single-cell CCC inferences. Therefore, when using data from sparser platforms, like common droplet-based methods, as well as non-UMI methods that are more likely to be zero inflated, we highly recommend applying denoising techniques to your data, for example, using ALRA which is available in SeuratWrappers
. The underlying data used in this vignette (from the Fluidigm C1 platform) is not subject to these considerations as it is several orders of magnitude less sparse than droplet-based methods, and we proceed without applying denoising algorithms.
Let's comprehensively examine the communicative heterogeneity that exists between beta cells as senders and all other cell types as receivers.
First we generate a cell-cell interaction matrix object. Then we scale, find variable features, run PCA and UMAP, find neighbor graphs and clusters, treating this matrix just like we would a gene expression matrix.
panc_fl_ccim <- GenerateCCIM(panc_fl, senders = colnames(panc_fl)[panc_fl$celltype=="beta"], receivers = colnames(panc_fl)[panc_fl$celltype %notin% "beta"]) panc_fl_ccim <- ScaleData(panc_fl_ccim) %>% FindVariableFeatures() %>% RunPCA() %>% RunUMAP(dims = 1:10) %>% FindNeighbors(dims = 1:10) %>% FindClusters(resolution = 0.2) DimPlot(panc_fl_ccim, group.by = "receiver_celltype") DimPlot(panc_fl_ccim, label = T, repel = T) + NoLegend()
First, there's some interesting heterogeneity here: beta-alpha sender-receiver pairs do not all appear to interact in the same way. Further, we visually see strong separation between cell-cell pairs with different sender-receiver cell type combinations.
Some clusters of cell-cell pairs with the same sender annotation contain receivers of different annotations (eg. cluster 3), perhaps indicating these sender cells use similar patterns of ligand expression to communicate with both cell types.
Next, we can identify what ligand-receptor pairs are differentially expressed between these clusters of CCC by calling FindMarkers. Clusters 3 and 10 look particularly curious, where in both clusters alpha cells are the receivers.
cluster3_10.edges <- FindMarkers(panc_fl_ccim, ident.1 = "3", ident.2 = "10") cluster3_10.edges %>% top_n(40,wt = abs(avg_log2FC))
Scriabin also provides a utility function CCIMFeaturePlot to visualize the expression of ligands or receptors in gene expression space superimposed on cell-cell pairs.
CCIMFeaturePlot(panc_fl_ccim, seu = panc_fl, features = c("EFNA1"), type_plot = "sender") CCIMFeaturePlot(panc_fl_ccim, seu = panc_fl, features = c("RET"), type_plot = "receiver")
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.