library(BioPlex) library(BioNet) library(EnrichmentBrowser) library(graph)
Get the latest version of the 293T PPI network:
bp.293t <- BioPlex::getBioPlex(cell.line = "293T", version = "3.0")
and turn into a graph object:
bp.gr <- BioPlex::bioplex2graph(bp.293t)
Get RNA-seq data for 293T cells from GEO: GSE122425
gse122425 <- BioPlex::getGSE122425()
Differential expression analysis between NSUN2 knockout and wild type:
gse122425$GROUP <- rep(c(0, 1), each = 3) gse122425 <- EnrichmentBrowser::deAna(gse122425, de.method = "DESeq2")
Inspect results:
rowData(gse122425) not.na <- !is.na(rowData(gse122425)$PVAL) gse122425 <- gse122425[not.na,] EnrichmentBrowser::volcano(rowData(gse122425)$FC, rowData(gse122425)$ADJ.PVAL)
Change rownames to gene symbols for intersection with bioplex graph:
rowData(gse122425)$ENSEMBL <- rownames(gse122425) rownames(gse122425) <- rowData(gse122425)$SYMBOL
Intersect bioplex graph and expression dataset:
syms <- graph::nodeData(bp.gr, graph::nodes(bp.gr), "SYMBOL") syms <- unlist(syms) isect <- intersect(syms, rownames(gse122425)) ind <- match(isect, syms) bp.sgr <- graph::subGraph(names(syms)[ind], bp.gr) bp.sgr bp.sgr <- BioNet::rmSelfLoops(bp.sgr) bp.sgr
Get the DE p-values for each node:
syms <- syms[ind] ind <- match(isect, rownames(gse122425)) pvals <- rowData(gse122425)[ind, "PVAL"] names(pvals) <- names(syms) head(pvals)
To score each node of the network we fit a Beta-uniform mixture model (BUM) to the p-value distribution and subsequently use the parameters of the model for the scoring function:
fb <- BioNet::fitBumModel(pvals) scores <- BioNet::scoreNodes(bp.sgr, fb, fdr = 0.05) head(scores) summary(scores)
Here we use a fast heuristic approach to calculate an approximation
to theoptimal scoring subnetwork.
An optimal solution can be calculated using the heinz algorithm requiring a
commercial CPLEX license.
module <- BioNet::runFastHeinz(bp.sgr, scores)
module <- BioPlexAnalysis:::.getResourceFromCache("heinz.module", update.value = NA) if(is.null(module)) { module <- BioNet::runFastHeinz(bp.sgr, scores) BioPlexAnalysis:::.cacheResource(module, "heinz.module") }
module
For some reason BioNet::runFastHeinz
drops the edgeData
of the input graph,
so we need to re-annotate those to the module.
There doesn't seem to be a convenient way to bulk transfer edgeData
from one
graph to another, but rather only for one edge and one attribute at a time.
With a little abuse, we help ourselves via accessing slots directly.
en <- names(edgeData(module)) edgeDataDefaults(module) <- edgeDataDefaults(bp.sgr) module@edgeData@data[en] <- edgeData(bp.sgr)[en]
kegg.gs <- EnrichmentBrowser::getGenesets(org = "hsa", db = "kegg", gene.id.type = "UNIPROT") head(kegg.gs) go.gs <- EnrichmentBrowser::getGenesets(org = "hsa", db = "go", gene.id.type = "UNIPROT") head(go.gs)
Given the maximum scoring subnetwork and gene sets of interest, we can now apply different set-based enrichment analysis (sbea) methods or network-based enrichment analysis (nbea) methods to identify biological themes within a module.
EnrichmentBrowser::sbeaMethods() EnrichmentBrowser::nbeaMethods()
Those are conveniently wrapped in the EnrichmentBrowser package and provide for a prioritization of gene sets of interest within a maximum scoring subnetwork.
To illustrate the concept, we use information on gene set enrichment from the paper.
NSUN2 is a nucleolus RNA m5C methyltransferase, regulator of cell proliferation and cell differentiation, dysfunction associated with intellectual diseases and cancer; reported to be a downstream target gene of MYC oncogene, a master regulator of cell proliferation.
Among the pathways showing enrichment in the paper is Ribosome, so as a showcase we extract nodes annotated to the gene set "Ribosome" from the maximum scoring subnetwork.
ind <- grep("Ribosome$", names(kegg.gs)) ribo <- kegg.gs[[ind]] ribo <- intersect(ribo, graph::nodes(module)) ribo.gr <- graph::subGraph(ribo, module) ribo.gr
Such a gene set induced sub-graph can then be further explored in our BioPlex node data and edge data shiny viewer.
For that we might be interested in mapping additional data onto the nodes and the edges of the network.
For example, we might want to add the Bioplex proteome data onto the graph for inspection via the shiny graph viewer.
bp.prot <- BioPlex::getBioplexProteome() bp.prot ribo.gr <- BioPlex::mapSummarizedExperimentOntoGraph(ribo.gr, bp.prot, col.names = grep("HEK", colnames(bp.prot), value = TRUE), rowdata.cols = c("nr.peptides", "log2ratio", "adj.pvalue"), prefix = "bpprot_") head(graph::nodeData(ribo.gr), n = 2)
sessionInfo()
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.