knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) if (.Platform$OS.type == "windows") knitr::opts_chunk$set(crop=NULL)
CITE-seq data provide RNA and surface protein counts for the same cells. There are different workflows to analyse these data in R such as with Seurat or with CiteFuse. This tutorial shows how such data stored in MuData (H5MU) files can be read and integrated with Seurat-based workflows.
The most recent MuDataSeurat
build can be installed from GitHub:
library(remotes) remotes::install_github("PMBio/MuDataSeurat")
library(MuDataSeurat) library(Seurat)
For this tutorial, we will use a subset of cells and features from the output of the CITE-seq integration in muon.
This example dataset can be downloaded as a file in the H5MU format, which is then deserialised to create a Seurat object:
# Download file into a temporary directory fdir <- tempdir() mdata_path <- file.path(fdir, "minipbcite.h5mu") download.file(url="https://github.com/gtca/h5xx-datasets/blob/main/datasets/minipbcite.h5mu?raw=true", destfile=mdata_path, mode="wb") pbmc <- ReadH5MU(mdata_path) pbmc
Here, pbmc
is a full-featured Seurat object that can be used in downstream analysis workflows.
Importantly, data can now be plotted with Seurat.
Idents(pbmc) <- "celltype" DimPlot(pbmc, reduction = "WNN_UMAP")
Cells can be coloured by expression level and plotted in a selected latent space:
DefaultAssay(pbmc) <- "rna" DimPlot(pbmc, reduction = "MOFA_UMAP", label = TRUE, repel = TRUE) + NoLegend() + FeaturePlot(pbmc, features = "MS4A1", reduction = "MOFA_UMAP")
RNA expression in different celltypes (genes in the subset were pre-selected to be celltype markers):
rna_features <- c("IRF8", "FCGR3A", "CD14", "MS4A1", "IGHD", "KLRC2", "NKG7", "CCL5", "CD8B", "IL2RA", "IL7R") DotPlot(pbmc, features = rna_features) + RotatedAxis()
Naïve/memory T cell surface protein expression:
DefaultAssay(pbmc) <- "prot" prot_features <- c("CD3-TotalSeqB", "CD45RA-TotalSeqB", "CD45RO-TotalSeqB") t_cells <- c("CD4+ naïve T", "CD4+ memory T", "Treg", "CD8+ naïve T", "CD8+ memory T") VlnPlot(pbmc[,pbmc@meta.data$celltype %in% t_cells], features = prot_features, ncol = 3)
Since pbmc
is a Seurat object, we can use corresponding processing and analysis functions. To provide an example, we will re-calculate cell neighbourhoods using the previously computed (on the full dataset) MOFA factors and then compute a non-linear embedding of cells.
pbmc <- FindNeighbors(pbmc, reduction = "MOFA", dims = 1:30) pbmc <- RunUMAP(pbmc, reduction = "MOFA", dims = 1:30, reduction.key = "rUMAP_", reduction.name = "rUMAP") DimPlot(pbmc, reduction = "rUMAP")
mudata (Python) documentation
muon documentation and tutorials
sessionInfo()
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.