This vignette shows how to annotate the PBMC3k dataset from the Seurat tutorial.
library(CellAnnotatoR)
library(Seurat)
library(dplyr)
library(ggplot2)
theme_set(theme_bw())
First, read the data. Set the path to your local copy of the data here:
pbmc_data <- Read10X("~/mh/Data/10x/pbmc3k/filtered_gene_bc_matrices/hg19/")
pbmc <- CreateSeuratObject(pbmc_data, project="pbmc3k", min.cells=3, min.features=200)
pbmc[["percent.mt"]] <- PercentageFeatureSet(pbmc, pattern = "^MT-")
pbmc <- subset(pbmc, subset = nFeature_RNA > 200 & nFeature_RNA < 2500 & percent.mt < 5) %>%
  NormalizeData(verbose=F) %>% FindVariableFeatures(selection.method = "vst", nfeatures=2000, verbose=F) %>%
  ScaleData(verbose=F) %>% RunPCA(features=VariableFeatures(.), verbose=F) %>%
  RunTSNE(dims=1:10)
To run the annotation, we also need to estimate a cell neighbor graph and a clustering, which improves annotation quality. In general, a high clustering resolution is required to detect small populations.
pbmc <- FindNeighbors(pbmc, dims=1:10, verbose=F) %>% FindClusters(resolution=5, verbose=F)
DimPlot(pbmc, reduction = "tsne", label=T) + NoLegend()
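A quick way to see how finely a given resolution splits the cells is to tabulate cluster sizes. The sketch below uses a toy factor in place of the real labels so that it runs on its own; in this vignette the labels would come from `pbmc$seurat_clusters`.

```r
# Toy cluster labels; in the vignette this would be pbmc$seurat_clusters
toy_clusters <- factor(c("0", "0", "0", "1", "1", "2"))

# Number of cells per cluster
cluster_sizes <- table(toy_clusters)
print(cluster_sizes)

# Number of clusters and size of the smallest one
n_clusters <- length(cluster_sizes)
min_size <- min(cluster_sizes)
```

Many very small clusters may suggest that the resolution is higher than needed, while a few large clusters may hide small populations.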
Next, we extract the required information from the Seurat object. The exact fields may vary depending on the preprocessing options you used.
cm <- pbmc@assays$RNA@counts
cm_norm <- Matrix::t(pbmc@assays$RNA@data)
graph <- pbmc@graphs$RNA_snn
emb <- pbmc@reductions$tsne@cell.embeddings
clusters <- setNames(pbmc@meta.data$seurat_clusters, rownames(pbmc@meta.data))
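The extracted objects have different orientations: `cm` has cells as columns, while `cm_norm` (transposed) and `emb` have cells as rows, and `clusters` is a named vector. Assuming the downstream functions match cells by name, it may be worth checking that all objects refer to the same cells. The sketch below uses a hypothetical helper, `check_cell_names`, which is not part of CellAnnotatoR, and toy data in place of the real objects.

```r
# Hypothetical helper (not part of CellAnnotatoR): checks that all inputs
# refer to the same set of cells and stops with an error otherwise.
check_cell_names <- function(cm, cm_norm, emb, clusters) {
  cells <- colnames(cm)                      # counts: genes x cells
  stopifnot(
    setequal(cells, rownames(cm_norm)),      # normalized: cells x genes
    setequal(cells, rownames(emb)),          # embedding: cells x dims
    setequal(cells, names(clusters))         # named cluster vector
  )
  invisible(TRUE)
}

# Toy data standing in for the objects extracted above
cell_ids <- c("c1", "c2", "c3")
toy_cm <- matrix(1:6, nrow=2, dimnames=list(c("g1", "g2"), cell_ids))
toy_norm <- t(toy_cm)
toy_emb <- matrix(0, nrow=3, ncol=2, dimnames=list(cell_ids, c("tSNE_1", "tSNE_2")))
toy_clusters <- setNames(factor(c("0", "1", "0")), cell_ids)

check_cell_names(toy_cm, toy_norm, toy_emb, toy_clusters)
```

With the real data, the call would be `check_cell_names(cm, cm_norm, emb, clusters)`.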
Now we can run the annotation:
marker_path <- "../markers/pbmc.md"
clf_data <- getClassificationData(cm, marker_path)
ann_by_level <- assignCellsByScores(graph, clf_data, clusters=clusters)
And plot the annotation:
Idents(pbmc) <- ann_by_level$annotation$l1
DimPlot(pbmc, reduction = "tsne", label=T) + NoLegend()
Let’s also plot marker expression per cluster to ensure that everything is correct:
plotSubtypeMarkers(emb, cm_norm, parent.type="root", clf.data=clf_data, n.col=3)