knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
SPACEconnect is a package to calculate and visualize the correlations between experimental gene expression profiles and the signatures from databases that have focused on detailing gene expression profiles based on the exposure of different cell lines to various perturbagen, such as varying concentration of various compounds, or identifying signatures that correlate with drug responses, or even the alteration in gene expression profile caused by silencing or activating a gene or a set of genes.
SPACEconnect R package combines rank correlation and gene set enrichment analysis to identify comprehensive analysis at the individual perturbagen level and, in the case of drugs, at the mode of action level. SPACEconnect proposes a novel approach to correlate perturbation signatures, rather than providing outputs for each perturbation experiment individually, it performs additional GSEA steps on the output based on the ranked correlation scores for each compound. Finally it excutes another GSEA step to combine individual compounds into Mechanism of Actions (MoA). The method can be used for general perturbation enrichment analysis where repetition of experiments under similar but not identical conditions (e.g. different drug concentrations or different cell lines) is a concern. There are databases focus on detailing gene expression profiles based on the exposure of different cell lines to varying concentration of various compounds, following the alteration of the expression of a single gene, or identifying signatures that correlate with drug responses, such as the L1000 connectivity map (Duan et al, 2014), as well as the drug sensitivity databases, such as the Genomics for Drug Sensitivity in Cancer (GDSC) database (Yang et al, 2013) or the Cancer Therapeutics Response Portal (CTRP) (Basu et al, 2013).
The package is available on our GitHub repository: "https://github.com/bigomics".
library("devtools") #install_github("bigomics/SPACEconnect")
The method of computing the perturbation enrichment, based on two major calculation steps. First, calculate the weighted pearson correlation between the fold change values of the experimental matrix and the fold change of genes under various perturbagens(obtain from a database). Then calculating the drugs enrichment using GSEA method, where the correlation coefficients thus obtained are used to perform a GSEA across all the datasets for a given perturbagen.
# sourse code within computePerturbEnrichment function, do not run it, it is for method illustration not part of the workfrlow # fist calculation of rank correlation gg <- intersect(rownames(mDrugEnrich), rownames(mFC)) rnk1 <- apply(mDrugEnrich[gg, , drop = FALSE], 2, rank, na.last = "keep") rnk2 <- apply(mFC[gg, , drop = FALSE], 2, rank, na.last = "keep") R1 <- WGCNA::cor(rnk1, rnk2, use = "pairwise") # The calculate the perturbation enrichment. xdrugs <- gsub("[@_].*$", "", colnames(mDrugEnrich)) meta.gmt <- tapply(colnames(mDrugEnrich), xdrugs, list) res <- list() for (i in 1:ncol(R1)) { suppressWarnings(res[[i]] <- fgsea::fgseaSimple(meta.gmt, stats = R1[, i], nperm = 1000)) }
However the GSEA calculation in this case, the plot instead of showing ranked genes, it shows the sorted correlation between the experiment of interest and each individual dataset in the present of a given perturbagen (Fig. 1), and the Enrichment Score (ES) indicates the presence of a positive or negative correlation between the cumulative list of experiments for a given perturbagen and the test profile.
{width="800"}
User can use Pearson correlation instead of GSEA, by specifying calculation method in computePeturbEnrichment function
computePeturbEnrichment(mFC, mDrugEnrich, DrugsAnnot, methods = c("GSEA", "cor"))
SPACEconnect consists of a group of functions and a group of datasets provided with the package as .rda files
Fold change matrix of differential gene expresion from RNAseq data of Multiple myelom. The RNAseq data from (Logie et al 2021) study. They compared the therapeutic efficacy of the phytochemical kinase inhibitor withaferin A with the clinically approved BTK inhibitor ibrutinib to target hyperactivated tyrosine kinase signaling in glucocorticoid-resistant multiple myeloma cells. The results demonstrate that withaferin-A induced cell death of glucocorticoid-resistant MM1R cells involves covalent cysteine targeting of multiple Hinge-6 domain type tyrosine kinases of the kinase cysteinome classification, including BTK.
load(system.file("data","mFC.rda", package = "SPACEconnect"))
This dataset is a matrix represents the drugs annotation that contains drugs' names, targets, mechanism of action , clinical phase, disease area, and indication. The source of this drugs annotations is L1000 repurposing drugs . It is for research use only. and not suitable for the Repurposing Hub to make clinical treatment decisions.
load(system.file("data","DrugsAnnot.rda", package = "SPACEconnect"))
It is large matrix represents the gene expression fold change of 1001 genes in the presence of 20220 different drugs concentrations and working time. The LINCS L1000 project has collected gene expression profiles for thousands of perturbagens at a variety of time points, doses, and cell lines. A full list of the chemical and genetic perturbations used can be found on the CLUE website along with their descriptions.
load(system.file("data","mDrugEnrich.rda", package = "SPACEconnect"))
The SPACEconnect package provides three categories of important functions: computePeturbEnrichment(), computeComboEnrichment() and plotDrugConnectivity().
However the package contains seven functions calculating and visualizing the correlations between experimental gene expression profiles and the signatures from databases.
In the following the illustration of each function, in the package workflow.
Within the package there are three R data objects .rda, user can load them as is shown in the previous section.
print("the fold change matrix:.....") head(mFC) print(" the drugs annotation datafrme:.......") head(DrugsAnnot) print(" the drugs activity large matrix:.....") mDrugEnrich[1:5,1:5]
To calculate the ranked correlation between the tested gene expression profile and other datasets signatures in the presence of a given perturbagen, based on database and annotation which is provided to the function. It can calculate the correlation of drug activity, drug sensitivity or other perturbation as gene knocked in/out.
drugs <- computePeturbEnrichment(mFC, mDrugEnrich, DrugsAnnot, methods = c("GSEA", "cor")) View(drugs)
drugs
is object contains 5 elements, cor
and GSEA
, are the both method to calculate the ranked correlation, each has the correlation coefficient, the pvalue and the q-value. The third list clust
in the matrix of optimized coordinates of UMAP clustering. The stats
is a matrix of the first ranking correlation coefficients calculation in all contrast of the fold change matrix of the tested expresssion profile. The annot
is the dataframe with drugs annotation.
Create the drug set enrichment object that is used in the visualization and creating a table summarize the Enrichment statistical values (NES
: normalized enrichment score, pvalue
, padj
: adjusted pvalue with Benjamini-Hochberg correction, drug
: drug's name, moa
: the mechanism of action and target
: is the drug targets from the annotation data. )
dsea <- getActiveDSEA(mDrugEnrich, DrugsAnnot, mFC) tb <- dsea$table
getMOA() is used to retrieve the drug's class and the corresponding mechanism of action of each drug.
Moa <- getMOA(dsea) DMoa <- Moa$drugClass Dtarget <- Moa$geneTargets
plotDrugConnectivity() plots the drug connectivity map or let us call it the GSEA plot for the top 16 drugs.
{plotDrugConnectivity(dsea = dsea)}
{width="559"}
The function plotMOA() plots mechanism of action the top 16 enriched drugs.
{plotMOA(dsea)}
{width="607"}
This plot shows the most positively or negatively correlated perturbagens across all test profiles. The size of the circles correspond to their relative activation, and are colored according to their upregulation (red) or downregulation (blue) in the contrast profile.
dseaPlotActmap (dsea, drugs, method = "GSEA", contr=colnames(mFC)[1], nterms = 50, nfc=20)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.