knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>"
)
options(timeout=180)

Introduction

Multimodal data format — MuDatahas been introduced to address the need for cross-platform standard for sharing large-scale multimodal omics data. Importantly, it develops ideas of and is compatible with AnnData standard for storing raw and derived data for unimodal datasets.

In R, multimodal datasets can be stored in Seurat objects. This MuDataSeurat package demonstrates how data can be read from MuData files (H5MU) into Seurat objects as well as how information from Seurat objects can be saved into H5MU files.

Installation

The most recent MuDataSeurat build can be installed from GitHub:

library(remotes)
remotes::install_github("PMBio/MuDataSeurat")

For the purpose of this tutorial, we will use SeuratData to obtain data in the form of Seurat objects:

devtools::install_github('satijalab/seurat-data')

Loading libraries

library(MuDataSeurat)
library(Seurat)
suppressWarnings(library(SeuratData))

library(hdf5r)

Writing H5MU files

We'll use a Seurat object distributed via SeuratData:

suppressWarnings(InstallData("cbmc"))
data("cbmc")
cbmc <- UpdateSeuratObject(cbmc)
cbmc

First, we make variable names unique across modalities:

# Append -ADT to feature names in the ADT assay
adt_counts <- cbmc[["ADT"]]@counts
rownames(adt_counts) <- paste(rownames(adt_counts), "ADT", sep = "-")
adt_data <- cbmc[["ADT"]]@data
rownames(adt_data) <- rownames(adt_counts)

adt <- CreateAssayObject(counts = adt_counts)
adt@data <- adt_data

cbmc_u <- CreateSeuratObject(cbmc[["RNA"]])
cbmc_u[["ADT"]] <- adt
DefaultAssay(cbmc_u) <- "ADT"
cbmc_u

We can then use WriteH5MU() to write the contents of the cbmc object to an H5MU file:

WriteH5MU(cbmc_u, "cbmc.h5mu")

Reading H5MU files

We can manually check the top level of the structure of the file:

h5 <- H5File$new("cbmc.h5mu", mode = "r")
h5

Or dig deeper into the file:

h5[["mod"]]
h5$close()

Creating Seurat objects from H5MU files

This package provides ReadH5MU to create an object with data from an H5MU file. Since H5MU structure has been designed to accommodate more structured information than Seurat, only some data will be read. For instance, Seurat has no support for loading multimodal embeddings or pairwise graphs.

cbmc_r <- ReadH5MU("cbmc.h5mu")
cbmc_r

Importantly, we recover the information from the original Seurat object:

head(cbmc_u@meta.data[,1:4])
head(cbmc_r@meta.data[,1:4])

H5AD files

If a Seurat object contains a single modality (assay), it can be written to an H5AD file.

For demonstration, we'll use a Seurat object with scRNA-seq counts distributed via SeuratDisk:

suppressWarnings(InstallData("pbmc3k"))
data("pbmc3k")
pbmc3k <- UpdateSeuratObject(pbmc3k)
pbmc3k

We can use WriteH5AD() to write the contents of the pbmc3k object to an H5AD file since this dataset contains a single modality (assay):

WriteH5AD(pbmc3k, "pbmc3k.h5ad")

This data can be retrieved from an H5AD file with ReadH5AD:

pbmc3k_r <- ReadH5AD("pbmc3k.h5ad")
pbmc3k_r

References

Session Info

sessionInfo()


PMBio/MuDataSeurat documentation built on Nov. 21, 2023, 7:31 a.m.