knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>"
)

if (.Platform$OS.type == "windows")
    knitr::opts_chunk$set(crop=NULL)

Introduction

CITE-seq data provide RNA and surface protein counts for the same cells. There are different workflows to analyse these data in R such as with Seurat or with CiteFuse. This tutorial shows how such data stored in MuData (H5MU) files can be read and integrated with Seurat-based workflows.

Installation

The most recent MuDataSeurat build can be installed from GitHub:

library(remotes)
remotes::install_github("PMBio/MuDataSeurat")

Loading libraries

library(MuDataSeurat)
library(Seurat)

Loading data

For this tutorial, we will use a subset of cells and features from the output of the CITE-seq integration in muon.

This example dataset can be downloaded as a file in the H5MU format, which is then deserialised to create a Seurat object:

# Download file into a temporary directory
fdir <- tempdir()
mdata_path <- file.path(fdir, "minipbcite.h5mu")
download.file(url="https://github.com/gtca/h5xx-datasets/blob/main/datasets/minipbcite.h5mu?raw=true", destfile=mdata_path, mode="wb")

pbmc <- ReadH5MU(mdata_path)
pbmc

Here, pbmc is a full-featured Seurat object that can be used in downstream analysis workflows.

Visualising data

Importantly, data can now be plotted with Seurat.

Idents(pbmc) <- "celltype"
DimPlot(pbmc, reduction = "WNN_UMAP")

Markers

Cells can be coloured by expression level and plotted in a selected latent space:

DefaultAssay(pbmc) <- "rna"
DimPlot(pbmc, reduction = "MOFA_UMAP", label = TRUE, repel = TRUE) + NoLegend() + 
FeaturePlot(pbmc, features = "MS4A1", reduction = "MOFA_UMAP")

RNA expression in different celltypes (genes in the subset were pre-selected to be celltype markers):

rna_features <- c("IRF8", "FCGR3A", "CD14", "MS4A1", "IGHD", 
                  "KLRC2", "NKG7", "CCL5", "CD8B", "IL2RA", "IL7R")
DotPlot(pbmc, features = rna_features) + RotatedAxis()

Naïve/memory T cell surface protein expression:

DefaultAssay(pbmc) <- "prot"
prot_features <- c("CD3-TotalSeqB", "CD45RA-TotalSeqB", "CD45RO-TotalSeqB")
t_cells <- c("CD4+ naïve T", "CD4+ memory T", "Treg", 
             "CD8+ naïve T", "CD8+ memory T")
VlnPlot(pbmc[,pbmc@meta.data$celltype %in% t_cells], 
        features = prot_features, ncol = 3)

Downstream analysis

Since pbmc is a Seurat object, we can use corresponding processing and analysis functions. To provide an example, we will re-calculate cell neighbourhoods using the previously computed (on the full dataset) MOFA factors and then compute a non-linear embedding of cells.

pbmc <- FindNeighbors(pbmc, reduction = "MOFA", dims = 1:30)

pbmc <- RunUMAP(pbmc, reduction = "MOFA", dims = 1:30, reduction.key = "rUMAP_", reduction.name = "rUMAP")

DimPlot(pbmc, reduction = "rUMAP")

References

Session Info

sessionInfo()


PMBio/MuDataSeurat documentation built on Nov. 21, 2023, 7:31 a.m.