knitr::opts_chunk$set(echo = TRUE)
The detection of tumor-host interaction is a major challange in cancer understanding. More recent advances in single cell next generation sequencing have provided new insights in unveiling tumor microenvironment (TME) cells comunications. The scTHI package provides some useful functions to identify and visualize the cell types making up the TME, and a rank-based approach to discover interesting ligand-receptor interactions establishing among malignant and non tumor cells
if(!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager") BiocManager::install("scTHI")
The scTHI package requires a matrix of count data (or normalized counts) from single cell RNA-seq experiments, where rows are genes presented with Hugo Symbols and columns are cells. For a practical demonstration, we will use the scRNA-seq data available on the Broad Institute Single-Cell Portal . The dataset provides 3,321 scRNA-seq profiles from six primary Glioma (H3K27M-glioma), and includes both malignant cells and several tumor microenvironment cell types, as immune cells and oligodendrocytes. Our goal is to identify the significant ligand-receptor interactions that are established between cancer and immune cells present in the tumor microenvironment. For this reason, after downloading H3K27M-glioma data we preprocessed them, selecting only one sample of interest (i.e. BCH836), removing not-expresse genes, transforming data in log TPM , and applying quantile normalization. Processed are included in the package and can be loaded as follows:
library(scTHI) library(scTHI.data) set.seed(1234) data("H3K27") data("H3K27.meta")
H3K27.meta includes the annotation of each cell, so as to identify the tumor cells and the immune cells in BCH836 patient:
We will use the cells annotated as Immune and a subset of Malignant.
Malignant <- rownames(H3K27.meta)[H3K27.meta$Type == "Malignant"][1:100] Immune <- rownames(H3K27.meta)[H3K27.meta$Type == "Immune cell"]
scTHI.score can be used to detect significant
ligand-receptor interactions occurring between cancer and immune cells.
The function uses a dataset of 1044 ligand-receptor pairs to identify
those which are significantly enriched between pairs clusters of your
dataset. A score is computed for each interaction pair. This score does
not simply reflect the average expression of the two partners in the
interaction pair, it is rather based on the percentage of cells that
express the interaction pair at the top of the ordered gene list. It
considers a gene expressed in a cell, only if it is in the top 10% of
the ranked gene list, as set in
Interactions are not symmetric, so partnerA expression is considered on the first cluster (in the example the Malignant cells), and partnerB expression is considered on the second cluster (i.e. Immune cells). A p-value for each pair is computed, based on a null model from random permutations of data, and and only significant interaction pairs are reported.
H3K27.result <- scTHI_score( expMat = H3K27, cellCusterA = Malignant, cellCusterB = Immune, cellCusterAName = "Malignant", cellCusterBName = "Immune", topRank = 10, PValue = TRUE, pvalueCutoff = 0.05, nPermu = 10, ncore = 1 )
For which partner A and partner B are expressed in at least 50%
of the cells of the respective clusters, as set in
argument. P-value computation is optional and is estimated by
permuting the mates of interaction pairs. Result also includes
two columns with the average expression values of each partner
in the respective clusters.
scTHI.plotResult with argument
"score" displays the score of each significant pair through
barplots. The number of asterisks indicates significance.
Otherwise, setting the argument
plotType as "pair", for each
pair are shown bar plots displaying the percentage of cells in
each cluster expressing partner A and partner B, respectively.
scTHI_plotResult(scTHIresult = H3K27.result, cexNames = 0.7, plotType = "score")
scTHI_plotResult(scTHIresult = H3K27.result, cexNames = 0.7, plotType = "pair")
Another useful way to explore results is the visualization of the entire data set through the t-SNE plot. scTHI allows the user to compute the non-linear dimensionality reduction of data using the scTHI.runTsne function based on the Rtsne R package .
H3K27.result <- scTHI_runTsne(scTHIresult = H3K27.result)
Users can decide to display on the tsne the two input clusters, i.e. Maglignant and Immune cells, or to observe the intensity of the expression of an intresting interaction pair in the whole dataset. The scTHI.plotCluster function differently labels cells on t-SNE if they belong to ClusterA or ClusterB.
scTHI_plotCluster(scTHIresult = H3K27.result, cexPoint = 0.8, legendPos = "bottomleft")
On the other hand the scTHI.plotPairs function colors the cells on the t-SNE based on the expression levels of a user-defined interaction pair. For example, let’s try plotting the expression of the interaction pair consisting of THY1, which should be expressed by the cancer cells and the ITGAX-ITGB2 complex, instead expressed by the immune cells.
scTHI_plotPairs( scTHIresult = H3K27.result, cexPoint = 0.8, interactionToplot = "THY1_ITGAX:ITGB2" )
The THY1 gene appears uniformly expressed in tumor cells, and in according to results table, it was expressed in 861% of malignant cells at the top 10% of the ranked gene list of each cell. On the other hand the ITGAX-ITGB2 complex is highly specific of the Immune cluster, where it is expressed by 92% of the cells at the top 10% of the ranked gene list of each cell.
Another important problem in study tumor-host interaction is
the classification of all cell types that make up the tumor
microenvironment. Generally, tools for single cell data analysis
are based on the use of marker genes to classify different cell
types identified by clustering approaches. This strategy is not
always the most accurate, our approach uses a different method
for the classification of TME based on gene signature enrichment
TME.classification implements the
Mann-Whitney-Wilcoxon Gene Set Test (MWW-GST) algorithm 
and tests for each cell the enrichment of a collection of
signatures of different cell types, with greater interest for
the Immune System and the Central Nervous System cells. Thus,
each cell is associated with a specific phenotype based on the
more significant enriched gene set.
Let's try to classify the TME cells of the BCH836 patient,
excluding malignant cells from the dataset.
Tme.cells <- rownames(H3K27.meta)[which(H3K27.meta$Type != "Malignant")] H3K27.tme <- H3K27[, Tme.cells]
TME.classification function can take a
long time if the number of cells to test is large.
Class <- TME_classification(expMat = H3K27.tme, minLenGeneSet = 10)
The H3K27M dataset annotation table suggests that non-tumor cells from BCH836 patient are mainly composed by Oligodendrocyte and Immune cells, respectively. The TME.classification function identifies a similar number of Oligodendrocytes and Immune cells, distinguishing the latter mainly in Microglia, Monocytes and Macrophages and fewer other Immune cells.
table(Class$Class[Tme.cells], H3K27.meta[Tme.cells, "Type"])
Finally, using the previously calculated t-SNE or a set of user-defined coordinates, you can view the new classification of the TME.
TME_plot(tsneData = H3K27.result$tsneData, Class, cexPoint = 0.8)
Filbin, M. G., Tirosh, I., Hovestadt, V., Shaw, M. L., Escalante, L. E., Mathewson, N. D., ... & Haberler, C. (2018). Developmental and oncogenic programs in H3K27M gliomas dissected by single-cell RNA-seq. Science, 360(6386), 331-335.
Maaten, L. V. D., & Hinton, G. (2008). Visualizing data using t-SNE. Journal of machine learning research, 9(Nov), 2579-2605.
Frattini, V., Pagnotta, S. M., Fan, J. J., Russo, M. V., Lee, S. B., Garofano, L., ... & Frederick, V. (2018). A metabolic function of FGFR3-TACC3 gene fusions in cancer. Nature, 553(7687), 222.
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.