Do you like to live in the fast lane? Do you sigh every time you see a long vignette? Do you wish there was a TLDR on everything?
I understand.
Welcome to the Vesalius quick start!
In this vignette, you will find a very short guide on how to work with Vesalius. Vesalius provides an internal data set taken from real Spatial transcrtiptomic data.
We show in context Vesalius workflows in the articles section.
For all intents and purposes, this is a dummy data set and should only be used to get a feel for the Vesalius package. We strongly advise you to use real and complete data sets to fully understand the Vesalius package and its benefits.
Vesalius is a tool to perform high-resolution in silico anatomization and molecular characterization from Spatial omices data without requiring companion images. The nature of the vesalius algorithm makes it condusive to being using on various spatial omics modalities without requiuring substantial changes.
Vesalius achieves this by converting reducded dimensionality latent space into gray scale images (one of each dimension). These images are processed using various image analysis techniques such as smoothing and segmentation.
First, let's load the package and the data. Data originates from slide-seqV2 data (availble at Single Cell Portal).
We took a small subset of the mouse hippocampus centered around the Dentate Gyrus.
Here, we show an example using transcriptomic data but the same principles will apply for other modalities. For examples using other modalities, please refer to the other articles.
suppressMessages(library(vesalius)) library(patchwork) library(ggplot2) data(vesalius, package = "vesalius")
Once loaded we have two new variables: counts and cooridnates.
The counts are the gene counts for all genes. As per convention, columns r epresent the spatial indices (barcodes, spots, beads,...) and rows are the genes that were captured.
str(counts)
The coordinates represent the x and y coordinates for each spatial index.
str(coordinates)
In other modalities, count matrices are still required for vesalius to run.
For example, Spatial-ATAC-seq or Spatial-cut&tag can be converted into count matrices (more specifically tile matrices) using the ArchR package
We can build a vesalius assay by simply parsing both counts and cooridnates to the (vesalius assay contructor)[link to manual page].
vesalius <- build_vesalius_assay( coordinates = coordinates, # spatial coordinates counts = counts, # count matrix assay = "spatial_omics", # name you wish to give your assay verbose = FALSE # Do you want progress messages? )
A vesalius object is a container that requires at the very least spatial indices. Here, we parsed both counts and spatial indices
vesalius
Count matrices can be added sperately using the add_counts
function.
We can use this object top embed our latent space into grey scale images.
vesalius <- generate_embeddings(vesalius, dim_reduction = "PCA", normalization = "log_norm", nfeatures = 100, # Setting number of features low for low run time verbose = FALSE)
We see that we have added a set of embedding with one used one default embedding called the "active" embedding. See section below on understanding active embeddings.
vesalius
We can test out multiple embeddings and every trial will be stored into
the vesalius_assay
object.
vesalius <- generate_embeddings(vesalius, dim_reduction = "UMAP", nfeatures = 100, # Setting number of features low for low run time verbose = FALSE) vesalius <- generate_embeddings(vesalius, dim_reduction = "PCA", nfeatures = 200, # Setting number of features low for low run time verbose = FALSE) vesalius <- generate_embeddings(vesalius, dim_reduction = "PCA", verbose = FALSE)
Every time you run a new dimensionality reduction approach it will be appended to the previous ones. Vesalius does the same thing for different normalisation approaches. This lets you decided which combination you would want to use in later stages.
Note that since you might be want to use the same embedding more than once, you will end up with more than one embedding called "PCA" for example. In this case, each embedding or normalisation will have a unique name.
For every subsequent embedding called PCA, the trial name will be PCA.1, PCA.2 etc The same applies to any embedding or normalisation method you select.
vesalius
You can also add your own embeddings and they will be appended into vesalius
in a similar manner. You can also check the manual page for add_embeddings
To visualise embeddings, we can simply plot the vesalius object and specify which dimension we wish to view. See (imagePlot)[link to page] for more information. Since we are leaving the embedding argument as the default "last" value, we can only look at the 3 UMAP dimensions. We can look at them simulaneously by creating RGB images with one dimension per color channel.
p1 <- image_plot(vesalius, dimensions = 1) + labs(title = "Grey PCA dim 1") p2 <- image_plot(vesalius, dimensions = seq(1, 3)) + labs(title = "RGB PCA")
If we wanted to look at "UMAP" instead, we can simply specify the embedding. Since we used 30 dimensions for UMAP, we can use any dimensions to build RGB images.
p3 <- image_plot(vesalius, dimensions = 1, embedding = "UMAP") + labs(title = "Grey UMAP dim 1") p4 <- image_plot(vesalius, dimensions = c(1, 2, 3), embedding = "UMAP") + labs(title = "RGB UMAP dim 1, 2, and 3")
(p1 + p2) / (p3 + p4)
The active embedding slot in a vesalius_assay
object contains the embedding values
that will be used during image processing and image segmentation.
In the examples above, we tested multiple embeddings.
get_active_embedding_tag(vesalius)
We see that "UMAP.1" is used as the active embedding. Any time the embeddings are being used in subsequent functions they will be taken from here as this is the "last" embedding used.
In fact, in any subsequent function that contains the keyword "last" for one of its arguments, vesalius will use the last target created. This can mean the last embedding used, the last count matrix or the last segmentation / territory isolation / territory morphology used.
If you are unstaified with the results, you can always call the embedding or trial you want to use explictely and the active slot will be replace with a fresh version of your embeddings of choice. No need to re-run everything from scratch!
While this flexibility can be a boon, it can also be a curse! Anytime you explictely request an embedding or a trial it will use that fresh instance. This means that every time the active slot will be updated and any processing you may have done to it will be overwritten.
Once we have these images, we can apply image processing techniques to each gray scale image.
vesalius <- regularise_image(vesalius, lambda = 1) vesalius <- smooth_image(vesalius, sigma = 5, iter = 10) vesalius <- equalize_image(vesalius, sleft = 5, sright = 5)
You can apply any image processing method you wish and in any order you desire. Please
note that the default number of dimensions for these function is defined by seq(1,3)
giving the first three dimensions. If working with PCA, please set this argument
according to how many dimensions you wish to process.
This also alows for the selction of abitrary of PCs. For example, based on the grey scale images, you might decided that this would be a better choice of PCs
# Selecting a subset of PCs dims <- c(1, 3, 4, 5, 7:11) # running smoothing o vesalius <- regularise_image(vesalius, dimensions = dims, embedding = "PCA", verbose = FALSE) vesalius <- equalize_image(vesalius, dimensions = dims, verbose = FALSE) vesalius <- smooth_image(vesalius, dimensions = dims, iter = 10, sigma = 1, verbose = FALSE)
In this example, we explicetely requested the PCA embedding. This means that we will take a fresh instance of the PCA embedding and use this as our active embedding. Once specified, all subsequent functions will use this new active embedding unless specified otherwise.
However, note that dimensions are being parsed at every step. Only the images coressponding to these dimensions will be processed. Vesalius still retains the other ones just in case.
Vesalius will attempts to segment the images into color segments. Vesalius use a kmeans clustering approach to segment grey scale images. Note that the segmentation here is applied to the whole stack and not individually.
The goal is top obtain colour segments that we can subdivived into territories.
vesalius <- segment_image(vesalius, method = "kmeans", col_resolution = 2, verbose = FALSE)
We can see that the vesalius_assay object now contains some more information related to the segmentation.
vesalius
We can also have a look at the results of the image segmentation.
p5 <- image_plot(vesalius) + labs(title = "Segments only") print(p5)
Kmeans is the default setting for image segmentation but Vesalius also provides louvain and leiden based approaches.
The final step is to isolate color segments into seperate territories.
Similar color segments may be of the same colour but are seperated in 2D space.
We want to be able to isolate each patch. For this, we can use the isolate_territories
function.
vesalius <- isolate_territories(vesalius, capture_radius = 0.05)
Now we can have a look at the isolated territories. We can keep track of how many instances of image segmentation and image manipulation we have gone through.
vesalius
We can plot our territories using vesalius plotting functions. Note that this is
a ggplot
object as such it is customisable using ggplot2
functionalities.
p6 <- territory_plot(vesalius, cex_pt = 3.5) p6
Once we have our territories, we can compare the expression of genes between territories.
vesalius <- identify_markers(vesalius, seed = 1, query = 2) deg <- get_markers(vesalius)
And finally we can visualise the expression of our genes of interest. You can parse more than one gene. Here we show the overall expression profile of Malat1 but also its mean expression in each territory.
p7 <- view_gene_expression(vesalius, genes = "Malat1") p8 <- view_gene_expression(vesalius, genes = "Malat1", as_layer = TRUE) p7 + p8
Another key features of Vesalius is to be able to map cells across samples. Here, we will show a quick break down of how to achieve this. A more in depth view of cell mapping is available in the Vesalius analysis repository
First, we build 2 vesalius objects that we have process independantly. The jitter territory is the same as the original version but we flipped the coordinates, added noise to coordinates and counts, and remove a few randomly sampled locations.
# load vesalius data data(vesalius) # Create Vesalius object for processing vesalius <- build_vesalius_assay(coordinates, counts) cells <- sample(LETTERS[1:6], size = nrow(vesalius@tiles),replace =T) names(cells) <- vesalius@tiles$barcodes vesalius <- add_cells(vesalius, cells = cells, add_name = "Cells") jitter_ves <- build_vesalius_assay(jitter_coord, jitter_counts) cells <- sample(LETTERS[1:6], size = nrow(jitter_ves@tiles),replace =T) names(cells) <- jitter_ves@tiles$barcodes jitter_ves <- add_cells(jitter_ves, cells = cells, add_name = "Cells") vesalius <- generate_embeddings(vesalius, filter_threshold = 1, filter_grid = 1) vesalius <- smooth_image(vesalius, embedding = "PCA", sigma = 5, iter = 10) vesalius <- equalize_image(vesalius, sleft = 5, sright = 5) vesalius <- segment_image(vesalius, col_resolution = 2) vesalius <- isolate_territories(vesalius) jitter_ves <- generate_embeddings(jitter_ves, filter_threshold = 1, filter_grid = 1) jitter_ves <- smooth_image(jitter_ves, embedding = "PCA", sigma = 5, iter = 10) jitter_ves <- equalize_image(jitter_ves, sleft = 5, sright = 5) jitter_ves <- segment_image(jitter_ves, col_resolution = 2) jitter_ves <- isolate_territories(jitter_ves)
Next, we can proceed with the mapping of cells on to the other. We will run the mapping using cell (feature), niche, territory similarity and cell composition.
matched <- map_assays(vesalius, jitter_ves, threshold = 0, use_cost = c("feature","niche","territory","composition"), batch_size = 500, epoch = 5, jitter = 1)
The mapping produces a new vesalius object with mapped coordinates for the query data set.
In this case our query was jitter_ves
.
From here, we can treat this new object as a frech data sets with new coordinates. As such:
matched <- generate_embeddings(matched, dim_reduction = "PCA", nfeatures = 200, verbose = FALSE) matched <- smooth_image(matched, sigma = 5, iter = 10) matched <- equalize_image(matched, sleft = 5, sright = 5) matched <- segment_image(matched,col_resolution = 2)
We can then look at the resulting embeddings
m <- image_plot(vesalius) m1 <- image_plot(jitter_ves) m2 <- image_plot(matched) m + m1 + m2
We can also look at the new territories.
t <- territory_plot(vesalius) t1 <- territory_plot(jitter_ves) t2 <- territory_plot(matched) t + t1 + t2
Once we have our data sets and the new set of coordinates, we can integrate the data sets to obtain an integrate mapping object upon which we can perform differential gene expression analysis.
inter <- integrate_assays(matched, vesalius)
Since this data set now contains an integrated latent space, we can also call territories on this object. There is no need to create new embeddings. They are provided during integration.
inter <- smooth_image(inter, sigma = 5, iter = 10) inter <- equalize_image(inter, sleft = 5, sright = 5) inter <- segment_image(inter, col_resolution = 2)
We can view the resulting embeddings and territories.
i <- image_plot(inter) i1 <- territory_plot(inter) i + i1
Finally, we can look at DEG analysis between samples in specific territories.
inter <- identify_markers(inter, sample = TRUE) degs <- get_markers(inter) head(degs)
sessionInfo()
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.