knitr::opts_chunk$set(
    collapse = TRUE,
    comment = "#>",
    crop = NULL ## Related to https://stat.ethz.ch/pipermail/bioc-devel/2020-April/016656.html
)
library(MerfishData)
library(ExperimentHub)
library(ggplot2)
library(grid)
Spatial transcriptomics protocols based on in situ sequencing or multiplexed RNA fluorescent hybridization can reveal detailed tissue organization. However, distinguishing the boundaries of individual cells in such data is challenging. Current segmentation methods typically approximate cell positions using nuclei stains.
Petukhov et al., 2021, describe Baysor, a segmentation method that optimizes 2D or 3D cell boundaries by considering the joint likelihood of transcriptional composition and cell morphology. Baysor can also perform segmentation based on the detected transcripts alone.
Petukhov et al., 2021, compare the results of Baysor segmentation (mRNA-only) to the results of a deep learning-based segmentation method called Cellpose from Stringer et al., 2021. Cellpose applies a machine learning framework for the segmentation of cell bodies, membranes and nuclei from microscopy images.
Petukhov et al., 2021 apply Baysor and Cellpose to MERFISH data from cryosections of mouse ileum. The MERFISH encoding probe library was designed to target 241 genes, including previously defined markers for the majority of gut cell types.
Def. ileum: the final and longest segment of the small intestine.
Samples were also stained with anti-Na+/K+-ATPase primary antibodies, oligo-labeled secondary antibodies and DAPI. MERFISH measurements across multiple fields of view and nine z planes were performed to provide a volumetric reconstruction of the distribution of the targeted mRNAs, the cell boundaries marked by Na+/K+-ATPase IF and cell nuclei stained with DAPI.
The data was obtained from the Dryad data publication.
This vignette demonstrates how to obtain the MERFISH mouse ileum dataset from Petukhov et al., 2021 from Bioconductor's ExperimentHub.
eh <- ExperimentHub()
AnnotationHub::query(eh, c("MerfishData", "ileum"))
mRNA molecule data: 820k observations for 241 genes
mol.dat <- eh[["EH7543"]]
dim(mol.dat)
head(mol.dat)
length(unique(mol.dat$gene))
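As a quick sanity check on a molecule table of this shape, one could tabulate the number of detected molecules per gene. The sketch below uses a small made-up data frame (`toy.mol`) in place of `mol.dat`, so that it runs without downloading anything. Only the column names `gene`, `x_pixel`, and `y_pixel` are taken from this vignette; the values are illustrative.

```r
# Toy stand-in for mol.dat (values are made up for illustration)
toy.mol <- data.frame(
    gene = c("Cd3e", "Cd3e", "Acsl1", "Cd8a", "Acsl1", "Cd3e"),
    x_pixel = c(10, 12, 40, 55, 41, 11),
    y_pixel = c(5, 7, 30, 62, 33, 6)
)

# Number of molecules per gene, most abundant first
n.per.gene <- sort(table(toy.mol$gene), decreasing = TRUE)
n.per.gene

# Number of distinct genes in the table
n.genes <- length(unique(toy.mol$gene))
n.genes
```

The same two calls should carry over to `mol.dat` unchanged, assuming its gene column is named `gene` as used elsewhere in this vignette.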
Image data:
dapi.img <- eh[["EH7544"]]
dapi.img
plot(dapi.img, all = TRUE)
plot(dapi.img, frame = 1)
While total poly(A) and DAPI staining can provide feature-rich costains suitable for segmentation in cell-sparse tissues such as the brain, such stains are not as useful for segmentation in cell-dense tissues. To address this challenge, Petukhov et al., 2021 developed protocols to combine immunofluorescence (IF) of a pan-cell-type cell surface marker, the Na+/K+-ATPase, with MERFISH.
mem.img <- eh[["EH7545"]]
mem.img
plot(mem.img, all = TRUE)
plot(mem.img, frame = 1)
It is also possible to obtain the data in a SpatialExperiment, which integrates the segmented experimental data and cell metadata, and provides designated accessors for the spatial coordinates and the image data.
Obtain dataset segmented with Baysor:
spe.baysor <- MouseIleumPetukhov2021(segmentation = "baysor")
spe.baysor
Inspect dataset:
assay(spe.baysor, "counts")[1:5, 1:5]
assay(spe.baysor, "molecules")["Acsl1", 5]
colData(spe.baysor)
head(spatialCoords(spe.baysor))
imgData(spe.baysor)
Obtain dataset segmented with Cellpose:
spe.cellpose <- MouseIleumPetukhov2021(segmentation = "cellpose",
                                       use.images = FALSE)
spe.cellpose
Inspect dataset:
assay(spe.cellpose, "counts")[1:5, 1:5]
colData(spe.cellpose)
head(spatialCoords(spe.cellpose))
Here we inspect the difference in cell counts between the two segmentation methods, stratified by the cell type label obtained from Leiden clustering and annotation by marker gene expression:
seg <- rep(c("baysor", "cellpose"), c(ncol(spe.baysor), ncol(spe.cellpose)))
ns <- table(seg, c(spe.baysor$leiden_final, spe.cellpose$leiden_final))
df <- as.data.frame(ns, responseName = "n_cells")
colnames(df)[2] <- "leiden_final"
ggplot(df, aes(
    reorder(leiden_final, n_cells), n_cells, fill = seg)) +
    geom_bar(stat = "identity", position = "dodge") +
    xlab("") +
    ylab("Number of cells") +
    theme_bw() +
    theme(
        panel.grid.minor = element_blank(),
        axis.text.x = element_text(angle = 45, hjust = 1))
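The tabulation step that feeds the bar plot can be tried in isolation. The following minimal sketch uses two made-up label vectors (`labels.a`, `labels.b`) in place of the `leiden_final` columns; the cell type names and counts are illustrative and not taken from the dataset.

```r
# Toy cell type labels for two segmentations (made up for illustration)
labels.a <- c("Enterocyte", "Enterocyte", "Goblet", "B cell")
labels.b <- c("Enterocyte", "Goblet", "Goblet", "Goblet", "B cell")

# One segmentation tag per cell, in the same order as the combined labels
seg <- rep(c("a", "b"), c(length(labels.a), length(labels.b)))
ns <- table(seg, c(labels.a, labels.b))

# Long format: one row per (segmentation, cell type) combination.
# The second table dimension is unnamed, so its column gets a default
# name and is renamed explicitly here.
df <- as.data.frame(ns, responseName = "n_cells")
colnames(df)[2] <- "cell_type"
df
```

The long format with one count per row is what `ggplot()` expects for a dodged bar plot with one bar per segmentation and cell type.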
For visualization, the following sections focus on the first z-plane of the membrane staining image.
mem.img <- imgRaster(spe.baysor, image_id = "membrane")
Overlay cell type annotation as in Figure 6 of the publication.
spe.list <- list(Baysor = spe.baysor, Cellpose = spe.cellpose)
plotTabset(spe.list, mem.img)
We can also overlay the individual molecules of selected marker genes, such as the cluster of differentiation (CD) genes assayed in the experiment:
gs <- grep("^Cd", unique(mol.dat$gene), value = TRUE)
ind <- mol.dat$gene %in% gs
rel.cols <- c("gene", "x_pixel", "y_pixel")
sub.mol.dat <- mol.dat[ind, rel.cols]
colnames(sub.mol.dat)[2:3] <- sub("_pixel$", "", colnames(sub.mol.dat)[2:3])
plotXY(sub.mol.dat, "gene", mem.img)
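The gene selection and column renaming above rely on two small regular expressions. The sketch below shows their behavior on a made-up gene vector; the gene names are hypothetical examples and are not claimed to be part of the 241-gene panel. It also shows a stricter pattern, offered only as a possible alternative, since `"^Cd"` matches any symbol starting with "Cd".

```r
# Hypothetical gene symbols (not necessarily in the MERFISH panel)
genes <- c("Cd3e", "Cd8a", "Cdk1", "Acsl1", "Ccd1", "Cd79a")

# Pattern used above: every symbol that starts with "Cd"
loose <- grep("^Cd", genes, value = TRUE)
loose

# Possible stricter alternative: "Cd" followed directly by a digit
strict <- grep("^Cd[0-9]", genes, value = TRUE)
strict

# Strip the "_pixel" suffix from coordinate column names
new.names <- sub("_pixel$", "", c("x_pixel", "y_pixel"))
new.names
```

Whether the stricter pattern makes a difference depends on the genes actually present in `mol.dat`, so it is worth inspecting `gs` before plotting.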
Here, we illustrate segmentation borders for the first z-plane:
poly <- metadata(spe.baysor)$polygons
poly <- as.data.frame(poly)
poly.z1 <- subset(poly, z == 1)
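To make the z-plane subsetting concrete, here is a minimal sketch on a made-up polygon table (`toy.poly`). The column names `cell`, `z`, `x`, and `y` follow their use in this vignette; the coordinates are illustrative, and the real polygon table may contain additional columns.

```r
# Toy polygon vertices: two cells in z-plane 1, one cell in z-plane 2
toy.poly <- data.frame(
    cell = c(1, 1, 1, 2, 2, 2, 1, 1, 1),
    z = c(1, 1, 1, 1, 1, 1, 2, 2, 2),
    x = c(0, 1, 0, 2, 3, 2, 0, 1, 0),
    y = c(0, 0, 1, 2, 2, 3, 0, 0, 1)
)

# Keep only the vertices belonging to the first z-plane
toy.z1 <- subset(toy.poly, z == 1)
nrow(toy.z1)

# Number of distinct cells with a polygon in each z-plane
n.cells <- tapply(toy.poly$cell, toy.poly$z, function(x) length(unique(x)))
n.cells
```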
We add holes to the cell polygons:
poly.z1 <- addHolesToPolygons(poly.z1)
Plot over membrane image:
p <- plotRasterImage(mem.img)
p <- p + geom_polygon(
    data = poly.z1,
    aes(x = x, y = y, group = cell, subgroup = subid),
    fill = "lightblue")
p + theme_void()
The MERFISH mouse ileum dataset is part of the gallery of publicly available MERFISH datasets.
This gallery consists of dedicated iSEE and Vitessce instances, published on Posit Connect, that enable the interactive exploration of different segmentations, the expression of marker genes, and overlay of cell metadata on a spatial grid or a microscopy image.
sessionInfo()