For the spatial transcriptomics data, we have segmentation masks telling us where the cells are located. The masks themselves are not particularly usable -- it's hard to reason about raw pixels. However, they give information about cell-ecosystem heterogeneity and composition, and this has scientific and clinical meaning. This script produces some intermediate outputs that move us from raw pixel information into these more ecosystem-related (and human consumable) formats,

Here are the packages we'll use. Of note are raster for working with spatial images, sf, spdep, and stars for manipulating polygon data, and are igraph for network processing,

knitr::opts_chunk$set(message = FALSE, warning = FALSE)
library("SingleCellExperiment")
library("dplyr")
library("forcats")
library("ggplot2")
library("igraph")
library("raster")
library("readr")
library("reshape2")
library("sf")
library("spdep")
library("stars")
library("stringr")
library("BIRSBIO2020.scProteomics.embeddings")
theme_set(theme_bw() + theme(panel.grid=element_blank()))

First, we'll download all the data, if they're not already present in the $HOME/Data folder. We'll also crop to $250 \times 250$ subsets of the raw samples, so that the vignette can run quickly.

data_dir <- file.path(Sys.getenv("HOME"), "Data")
data_paths <- download_data(data_dir)
loaded_ <- load_mibi(data_dir)
subsample <- spatial_subsample(loaded_$tiffs, loaded_$mibi, 250)
mibi_sce <- subsample$exper

sample_id <- params$sample_id
im <- subsample$ims[[as.character(sample_id)]]

cell_data <- colData(mibi_sce) %>%
  as.data.frame() %>%
  tidyr::unite(scell, SampleID, cellLabelInImage, remove = FALSE) %>%
  mutate(
    cell_type = cell_type(mibi_sce),
    cell_group = fct_lump(cell_type, prop = 0.05)
  ) %>%
  filter(SampleID == sample_id)

Polygonization

It can be helpful to work directly with polygon geometries, rather than the original raster image. You can then use any logic you'd have used for manipulating geographic shapefiles, for example. The block below converts the

polys <- polygonize(im) %>%
  inner_join(cell_data) %>%
  group_by(cellLabelInImage) %>% # some regions get split into two adjacent polys --> merge
  inner_join(cell_data) %>%
  group_by(cellLabelInImage) %>%
  summarise_all(function(x) { x[1] })

With these shapefiles, we can use geographic plotting tools. For example, this plots the immune group for each geometry.

ggplot(polys) +
  geom_sf(aes(fill=cell_group)) +
  scale_x_continuous(expand = c(0, 0)) +
  scale_y_continuous(expand = c(0, 0)) +
  theme(legend.position = "bottom")

Graph Construction

Sometimes we care less about the locations and sizes of individual cells than their connectivity and mixing structure. For example, in the plot below, nodes are located at the original cell positions, and contiguity and knn edges are blue and red, respectively -- for some analysis, it might be enough to work with this induced network alone.

nb_contig <- poly2nb(polys, snap=1) # contiguity graph
coords <- polys %>%
  .[["geometry"]] %>%
  st_centroid() %>%
  st_coordinates()

nb_knn <- knn2nb(knearneigh(coords, k = 5)) # knn graph
plot(polys$geometry[seq_len(nrow(polys))])
plot(nb_knn, coords, add=T, col="#476a79")
plot(nb_contig, coords, add=T, col="#97293f")

We can turn this into a more familiar igraph object, for access to the usual graph manipulation routines. The steps above have been wrapped in a small helper function extract_graph, so we can apply it on many patients easily.

G <- extract_graph(polys)
plot(G, vertex.size=3, edge.arrow.size=0, vertex.label=NA)

Raster Window Features

With these new data structures, we can compute some cell-level summaries. First, let's look at features defined on windows centered around individual cells.

The function below applys some cell-specific function to the buffered window around the cell with label cell_id. For example, we can use it to get the proportion of each type of immune group within a window around that cell. This can be used to define a localized cell heterogeneity for use in downstream analysis. We plot each of the image subwindows surrounding a single cell.

For example, below, we show the proportion of each cell's image neighborhood that belongs to background, effectively measuring the local density of cells in the tissue.

cell_ids <- unique(polys$cellLabelInImage)
loop_stats(cell_ids[1:10], "raster", im, polys, background_prop)

We can use the same wrapper to study the local cell mixing proportions.

loop_stats(cell_ids[1:10], "raster", im, polys, type_props)

Neighbor Features

Alternatively, we could ignore the spatial information entirely, and summarize cells according to the properties of their neighbors. Here are the same two statistics, computed by looking at the graph alone, and ignoring cell size entirely.

loop_stats(cell_ids[1:10], "graph", G, polys, background_prop)
loop_stats(cell_ids[1:10], "graph", G, polys, type_props)


krisrs1128/BIRSBIO2020.scProteomics.embeddings documentation built on July 29, 2020, 7:36 a.m.