For the spatial transcriptomics data, we have segmentation masks telling us where the cells are located. The masks themselves are not particularly usable -- it's hard to reason about raw pixels. However, they give information about cell-ecosystem heterogeneity and composition, and this has scientific and clinical meaning. This script produces some intermediate outputs that move us from raw pixel information into these more ecosystem-related (and human consumable) formats,
Here are the packages we'll use. Of note are raster
for working with spatial
images, sf
, spdep
, and stars
for manipulating polygon data, and are
igraph
for network processing,
knitr::opts_chunk$set(message = FALSE, warning = FALSE)
library("SingleCellExperiment") library("dplyr") library("forcats") library("ggplot2") library("igraph") library("raster") library("readr") library("reshape2") library("sf") library("spdep") library("stars") library("stringr") library("BIRSBIO2020.scProteomics.embeddings") theme_set(theme_bw() + theme(panel.grid=element_blank()))
First, we'll download all the data, if they're not already present in the
$HOME/Data
folder. We'll also crop to $250 \times 250$ subsets of the raw
samples, so that the vignette can run quickly.
data_dir <- file.path(Sys.getenv("HOME"), "Data") data_paths <- download_data(data_dir) loaded_ <- load_mibi(data_dir) subsample <- spatial_subsample(loaded_$tiffs, loaded_$mibi, 250) mibi_sce <- subsample$exper sample_id <- params$sample_id im <- subsample$ims[[as.character(sample_id)]] cell_data <- colData(mibi_sce) %>% as.data.frame() %>% tidyr::unite(scell, SampleID, cellLabelInImage, remove = FALSE) %>% mutate( cell_type = cell_type(mibi_sce), cell_group = fct_lump(cell_type, prop = 0.05) ) %>% filter(SampleID == sample_id)
It can be helpful to work directly with polygon geometries, rather than the original raster image. You can then use any logic you'd have used for manipulating geographic shapefiles, for example. The block below converts the
polys <- polygonize(im) %>% inner_join(cell_data) %>% group_by(cellLabelInImage) %>% # some regions get split into two adjacent polys --> merge inner_join(cell_data) %>% group_by(cellLabelInImage) %>% summarise_all(function(x) { x[1] })
With these shapefiles, we can use geographic plotting tools. For example, this plots the immune group for each geometry.
ggplot(polys) + geom_sf(aes(fill=cell_group)) + scale_x_continuous(expand = c(0, 0)) + scale_y_continuous(expand = c(0, 0)) + theme(legend.position = "bottom")
Sometimes we care less about the locations and sizes of individual cells than their connectivity and mixing structure. For example, in the plot below, nodes are located at the original cell positions, and contiguity and knn edges are blue and red, respectively -- for some analysis, it might be enough to work with this induced network alone.
nb_contig <- poly2nb(polys, snap=1) # contiguity graph coords <- polys %>% .[["geometry"]] %>% st_centroid() %>% st_coordinates() nb_knn <- knn2nb(knearneigh(coords, k = 5)) # knn graph plot(polys$geometry[seq_len(nrow(polys))]) plot(nb_knn, coords, add=T, col="#476a79") plot(nb_contig, coords, add=T, col="#97293f")
We can turn this into a more familiar igraph
object, for access to the usual
graph manipulation routines. The steps above have been wrapped in a small helper
function extract_graph
, so we can apply it on many patients easily.
G <- extract_graph(polys) plot(G, vertex.size=3, edge.arrow.size=0, vertex.label=NA)
With these new data structures, we can compute some cell-level summaries. First, let's look at features defined on windows centered around individual cells.
The function below applys some cell-specific function to the buffered window
around the cell with label cell_id
. For example, we can use it to get the
proportion of each type of immune group within a window around that cell. This
can be used to define a localized cell heterogeneity for use in downstream
analysis. We plot each of the image subwindows surrounding a single cell.
For example, below, we show the proportion of each cell's image neighborhood that belongs to background, effectively measuring the local density of cells in the tissue.
cell_ids <- unique(polys$cellLabelInImage) loop_stats(cell_ids[1:10], "raster", im, polys, background_prop)
We can use the same wrapper to study the local cell mixing proportions.
loop_stats(cell_ids[1:10], "raster", im, polys, type_props)
Alternatively, we could ignore the spatial information entirely, and summarize cells according to the properties of their neighbors. Here are the same two statistics, computed by looking at the graph alone, and ignoring cell size entirely.
loop_stats(cell_ids[1:10], "graph", G, polys, background_prop) loop_stats(cell_ids[1:10], "graph", G, polys, type_props)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.