Introduction

10X Visium captures spatially resolved transcriptomic profiles in spots containing multiple cells. In this vignette, we will use the gene expression information from Visium data to investigate spatial relationships between pathway-specific genes.

Load the necessary packages:

# MISTy 
library(mistyR) 
library(future) 

#Seurat 
library(Seurat)
library(SeuratObject)

# Data manipulation 
library(tidyverse) 

# Pathways
library(decoupleR)

Get and load data

For this showcase, we use a 10X Visium spatial slide from Kuppe et al., 2022, where they created a spatial multi-omic map of human myocardial infarction. The tissue example data comes from the human heart of patient 14, which is in a later state after myocardial infarction. The Seurat object contains, among other things, the normalized and raw gene counts. First, we have to download and extract the file:

download.file("https://zenodo.org/records/6580069/files/10X_Visium_ACH005.tar.gz?download=1",
    destfile = "10X_Visium_ACH005.tar.gz", method = "curl")
untar("10X_Visium_ACH005.tar.gz")

The next step is to load the data, extract the normalized gene counts, names of genes expressed in at least 5% of the spots, and pixel coordinates. It is recommended to use pixel coordinates instead of row and column numbers since the rows are shifted and therefore do not express the real distance between the spots.

seurat_vs <- readRDS("ACH005/ACH005.rds")

expression <- as.matrix(GetAssayData(seurat_vs, layer = "counts", assay = "SCT"))
gene_names <- rownames(expression[(rowSums(expression > 0) / ncol(expression)) >= 0.05,]) 
geometry <- GetTissueCoordinates(seurat_vs, scale = NULL)

Obtain pathway-specific genes

First, we get the top 15 pathway-responsive human genes for each of the 14 available pathways in PROGENy.We will focus on two pathways - the VEGF pathway, which is responsible for promoting the formation of new blood vessels, and the TGF-beta pathway, which plays a critical role in regulating various cellular processes to promote tissue repair. We only extracted genes from these pathways that were present in the count matrix.

progeny <- get_progeny(organism = "human", top = 15)

VEGF_footprints <- progeny %>%
  filter(source == "VEGF", weight != 0, target %in% gene_names) %>% 
  pull(target)

TGFb_footprints <- progeny %>%
  filter(source == "TGFb", weight != 0, target %in% gene_names) %>% 
  pull(target)

Visualize gene expression

Before continuing with creating the Misty view, we can look at the slide itself and the expression of some of the selected pathway-reactive genes.

#Slide
SpatialPlot(seurat_vs, alpha = 0)

# Gene expression examples
SpatialFeaturePlot(seurat_vs, feature = c("ID1", "NID2"), keep.scale = NULL)

Misty views

Now we need to create the Misty views of interest. We are interested in the relationship of TGF-beta responsive genes in the same spot (intraview) and the ten closest spots (paraview). Therefore we choose the family constant which will select the ten nearest neighbors. Depending on the goal of the analysis, different families can be applied.

We are also intrigued about the relationship of VEGF-responsive genes with TGF-beta responsive genes in the broader tissue. For this, we again create an intra- and paraview, this time for VEGF, but from this view, we only need the paraview. In the next step, we add it to the TGF-beta views to achieve our intended views.

TGFb_views <- create_initial_view(t(expression[TGFb_footprints,]) %>% as_tibble()) %>%
  add_paraview(geometry, l=10, family = "constant")


VEGF_views <- create_initial_view(t(expression[VEGF_footprints,]) %>% as_tibble()) %>%
  add_paraview(geometry, l=10, family = "constant")

misty_views <- TGFb_views %>% 
  add_views(create_view("paraview.VEGF_10", VEGF_views[["paraview.10"]]$data, "para.VEGFn.10"))

Then run MISTy and collect the results:

run_misty(misty_views, "result/vignette_functional_pipeline") 

misty_results <- collect_results("result/vignette_functional_pipeline")

Downstream analysis

With the collected results, we can now answer the following questions:

1. To what extent can the surrounding tissues' gene expression explain the gene expression of the spot compared to the intraview?

Here we can look at two different statistics: multi.R2 shows the total variance explained by the multiview model. gain.R2 shows the increase in explainable variance from the paraviews.

misty_results %>%
  plot_improvement_stats("gain.R2") %>%
  plot_improvement_stats("multi.R2")

The paraviews particularly increase the explained variance for COMP, ID1, and COL4A1. In general, the significant gain in R2 can be interpreted as the following:

"We can better explain the expression of marker X when we consider additional views other than the intrinsic view."

To see the individual contributions of the views we can use:

misty_results %>% plot_view_contributions()

2. What are the specific relations that can explain gene expression?

The importance of the markers from each viewpoint as predictors of the expression of the intrinsic markers of the TGF-beta pathway can be shown individually to explain the contributions.

First, for the intrinsic view:

misty_results %>%
  plot_interaction_heatmap("intra", clean = TRUE)

We can observe that COL4A1 and ID1 are a significant predictor for the expression of several genes when in the same spot. ID1 is an important predictor for SMAD7. Let's take a look at the spatial distribution of these genes in the tissue slide:

SpatialFeaturePlot(seurat_vs, features = c("ID1", "SMAD7"), image.alpha = 0)

Areas with high levels of ID1 mRNA expression also tend to show high SMAD7 expression.

Now we repeat this analysis with the TGF-beta paraview. With trim we display only targets with a value above 0.5% for gain.R2. To set an importance threshold we apply cutoff.

misty_results %>%
  plot_interaction_heatmap(view = "para.10", 
                           clean = TRUE, 
                           trim = 0.5,
                           trim.measure = "gain.R2",
                           cutoff = 1.25)

From the gain.R2 we know that the paraview contributes a lot to explaining the ID1 expression. Let's visualize ID1 and its most important predictor COL4A1:

SpatialFeaturePlot(seurat_vs, features = c("COL4A1", "ID1"), image.alpha = 0)

The plots show us that, in some places, the localization of the mRNA overlaps.

Now we will analyze the last view, the VEGF-paraview:

misty_results %>%
  plot_interaction_heatmap(view = "para.VEGFn.10", clean = TRUE)

EPHA3 is a predictor of both ID1 and COL4A1.

SpatialFeaturePlot(seurat_vs, keep.scale = NULL, features = c("EPHA3","ID1"), image.alpha = 0)

A similar distribution as for ID1 can be observed for the expression of EPHA3.

See also

browseVignettes("mistyR")

Session Info

Here is the output of sessionInfo() at the point when this document was compiled.

sessionInfo()


saezlab/misty documentation built on March 25, 2024, 4:11 p.m.