knitr::opts_chunk$set(
    collapse = TRUE,
    comment = "#>"
)

Introduction

SpatialDE by Svensson et al., 2018, is a method to identify spatially variable genes (SVGs) in spatially resolved transcriptomics data.

Installation

You can install r BiocStyle::Biocpkg("spatialDE") from Bioconductor with the following code:

if (!requireNamespace("BiocManager", quietly = TRUE)) {
    install.packages("BiocManager")
}
BiocManager::install("spatialDE")

Example: Mouse Olfactory Bulb

Reproducing the example analysis from the original r basilisk::PyPiLink("SpatialDE") Python package.

library(spatialDE)
library(ggplot2)

Load data

Files originally retrieved from SpatialDE GitHub repository from the following links: https://github.com/Teichlab/SpatialDE/blob/master/Analysis/MouseOB/data/Rep11_MOB_0.csv https://github.com/Teichlab/SpatialDE/blob/master/Analysis/MouseOB/MOB_sample_info.csv

# Expression file used in python SpatialDE. 
data("Rep11_MOB_0")

# Sample Info file used in python SpatialDE
data("MOB_sample_info")

The Rep11_MOB_0 object contains spatial expression data for r nrow(Rep11_MOB_0) genes on r ncol(Rep11_MOB_0) spots, with genes as rows and spots as columns.

Rep11_MOB_0[1:5, 1:5]
dim(Rep11_MOB_0)

The MOB_sample_info object contains a data.frame with coordinates for each spot.

head(MOB_sample_info)

Filter out pratically unobserved genes

Rep11_MOB_0 <- Rep11_MOB_0[rowSums(Rep11_MOB_0) >= 3, ]

Get total_counts for every spot

Rep11_MOB_0 <- Rep11_MOB_0[, row.names(MOB_sample_info)]
MOB_sample_info$total_counts <- colSums(Rep11_MOB_0)
head(MOB_sample_info)

Get coordinates from MOB_sample_info

X <- MOB_sample_info[, c("x", "y")]
head(X)

stabilize

The SpatialDE method assumes normally distributed data, so we stabilize the variance of the negative binomial distributed counts data using Anscombe's approximation. The stabilize() function takes as input a data.frame of expression values with samples in columns and genes in rows. Thus, in this case, we have to transpose the data.

norm_expr <- stabilize(Rep11_MOB_0)
norm_expr[1:5, 1:5]

regress_out

Next, we account for differences in library size between the samples by regressing out the effect of the total counts for each gene using linear regression.

resid_expr <- regress_out(norm_expr, sample_info = MOB_sample_info)
resid_expr[1:5, 1:5]

run

To reduce running time, the SpatialDE test is run on a subset of 1000 genes. Running it on the complete data set takes about 10 minutes.

# For this example, run spatialDE on the first 1000 genes
sample_resid_expr <- head(resid_expr, 1000)

results <- spatialDE::run(sample_resid_expr, coordinates = X)
head(results[order(results$qval), ])

model_search

Finally, we can classify the DE genes to interpetable DE classes using the model_search function. We apply the model search on filtered DE results, using a threshold of 0.05 for the Q-values.

de_results <- results[results$qval < 0.05, ]

ms_results <- model_search(
    sample_resid_expr,
    coordinates = X,
    de_results = de_results
)

# To show ms_results sorted on qvalue, uncomment the following line
# head(ms_results[order(ms_results$qval), ])

head(ms_results)

spatial_patterns

Furthermore, we can group spatially variable genes (SVGs) into spatial patterns using automatic expression histology (AEH).

sp <- spatial_patterns(
    sample_resid_expr,
    coordinates = X,
    de_results = de_results,
    n_patterns = 4L, length = 1.5
)
sp$pattern_results

Plots

Visualizing one of the most significant genes.

gene <- "Pcp4"

ggplot(data = MOB_sample_info, aes(x = x, y = y, color = norm_expr[gene, ])) +
    geom_point(size = 7) +
    ggtitle(gene) +
    scale_color_viridis_c() +
    labs(color = gene)

Plot Spatial Patterns of Multiple Genes

As an alternative to the previous figure, we can plot multiple genes using the normalized expression values.

ordered_de_results <- de_results[order(de_results$qval), ]

multiGenePlots(norm_expr,
    coordinates = X,
    ordered_de_results[1:6, ]$g,
    point_size = 4,
    viridis_option = "D",
    dark_theme = FALSE
)

Plot Fraction Spatial Variance vs Q-value

FSV_sig(results, ms_results)

SpatialExperiment integration

The SpatialDE workflow can also be executed with a r BiocStyle::Biocpkg("SpatialExperiment") object as input.

library(SpatialExperiment)

# For SpatialExperiment object, we neeed to transpose the counts matrix in order
# to have genes on rows and spot on columns. 
# For this example, run spatialDE on the first 1000 genes

partial_counts <- head(Rep11_MOB_0, 1000)

spe <- SpatialExperiment(
  assays = list(counts = partial_counts),
  spatialData = DataFrame(MOB_sample_info[, c("x", "y")]),
  spatialCoordsNames = c("x", "y")
)

out <- spatialDE(spe, assay_type = "counts", verbose = FALSE)
head(out[order(out$qval), ])

Plot Spatial Patterns of Multiple Genes (using SpatialExperiment object)

We can plot spatial patterns of multiples genes using the spe object.

spe_results <- out[out$qval < 0.05, ]

ordered_spe_results <- spe_results[order(spe_results$qval), ]

multiGenePlots(spe,
    assay_type = "counts",
    ordered_spe_results[1:6, ]$g,
    point_size = 4,
    viridis_option = "D",
    dark_theme = FALSE
)

Classify spatially variable genes with model_search and spatial_patterns

msearch <- modelSearch(spe,
    de_results = out, qval_thresh = 0.05,
    verbose = FALSE
)

head(msearch)
spatterns <- spatialPatterns(spe,
    de_results = de_results, qval_thresh = 0.05,
    n_patterns = 4L, length = 1.5, verbose = FALSE
)

spatterns$pattern_results

sessionInfo {-}

Session info

Sys.time()
sessionInfo()



sales-lab/spatialDE documentation built on Feb. 12, 2024, 2:47 p.m.