knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
CITE-seq data provide RNA and surface protein counts for the same cells. This tutorial shows how MuData can be integrated into with Bioconductor workflows to analyse CITE-seq data.
The most recent dev build can be installed from GitHub:
library(remotes) remotes::install_github("ilia-kats/MuData")
Stable version of MuData
will be available in future bioconductor versions.
library(MuData) library(SingleCellExperiment) library(MultiAssayExperiment) library(SingleCellMultiModal) library(scater) library(rhdf5)
We will use CITE-seq data accessible with the
SingleCellMultiModal
Bioconductor package,
which was originally described in
Stoeckius et al., 2017.
mae <- CITEseq( DataType="cord_blood", modes="*", dry.run=FALSE, version="1.0.0" ) mae
We see two modalities in the object — scRNAseq
and scADT
, the latter
providing counts for antibody-derived tags. Notably, each experiment is a matrix.
While CITE-seq analysis workflows such as CiteFuse should be consulted for more details, below we exemplify simple data transformation in order to demonstrate how their output can be saved to an H5MU file later on.
For ADT counts, we will apply CLR transformation following Hao et al., 2020:
# Define CLR transformation as in the Seurat workflow clr <- function(data) t( apply(data, 1, function(x) log1p( x / (exp(sum(log1p(x[x > 0]), na.rm = TRUE) / length(x))) )) )
We will make the ADT modality a SingleCellExperiment
object and add an assay
with CLR-transformed counts:
adt_counts <- mae[["scADT"]] mae[["scADT"]] <- SingleCellExperiment(adt_counts) assay(mae[["scADT"]], "clr") <- clr(adt_counts)
We will also generate reduced dimensions taking advantage of the functionality
in the scater
package:
mae[["scADT"]] <- runPCA( mae[["scADT"]], exprs_values = "clr", ncomponents = 20 )
plotReducedDim(mae[["scADT"]], dimred = "PCA", by_exprs_values = "clr", colour_by = "CD3") plotReducedDim(mae[["scADT"]], dimred = "PCA", by_exprs_values = "clr", colour_by = "CD14")
We can write the contents of the MultiAssayExperiment object into an H5MU file:
writeH5MU(mae, "cord_blood_citeseq.h5mu")
We can check that both modalities were written to the file, whether it was a
matrix
for RNA or SingleCellExperiment
for ADT:
h5 <- rhdf5::H5Fopen("cord_blood_citeseq.h5mu") h5ls(H5Gopen(h5, "mod"), recursive = FALSE)
... both assays for ADT — raw counts are stored in X
and CLR-transformed
counts are in the corresponding layer:
h5ls(H5Gopen(h5, "mod/scADT"), recursive = FALSE) h5ls(H5Gopen(h5, "mod/scADT/layers"), recursive = FALSE)
... as well as reduced dimensions (PCA):
h5ls(H5Gopen(h5, "mod/scADT/obsm"), recursive = FALSE) # There is an alternative way to access groups: # h5&'mod'&'scADT'&'obsm' rhdf5::H5close()
mudata (Python) documentation
muon documentation and web page
Stoeckius, M., Hafemeister, C., Stephenson, W., Houck-Loomis, B., Chattopadhyay, P.K., Swerdlow, H., Satija, R. and Smibert, P., 2017. Simultaneous epitope and transcriptome measurement in single cells. Nature methods, 14(9), pp.865-868.
Hao, Y., Hao, S., Andersen-Nissen, E., Mauck III, W.M., Zheng, S., Butler, A., Lee, M.J., Wilk, A.J., Darby, C., Zager, M. and Hoffman, P., 2021. Integrated analysis of multimodal single-cell data. Cell.
sessionInfo()
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.