In chemokine/OmicsSBGN: Overlay omics data onto SBGN pathway diagram

library(knitr)

About

SBGNview is a tool set for visualizing omics data on pathway maps and for pathway-related data extraction and analysis. Pathways are rendered in the community standard notation SBGN[@le2009systems]. Given omics data and a pathway file (SBGN-ML format with layout information), SBGNview displays the omics data as colors on glyphs and writes image files. For omics data, SBGNview supports automatic ID mapping of common gene/protein/compound ID types (e.g. Entrez Gene ID, UNIPROT, ChEBI etc.). For pathway files, SBGNview can automatically retrieve SBGN-ML files from common pathway databases (e.g. Reactome, MetaCyc, SMPDB, PANTHER, METACROP etc.). To support visualizing multiple types of data on the same glyph/arc, SBGNview provides extensive options to control glyph and edge features (e.g. color, line width, label color/size etc.). To facilitate pathway-based analysis, SBGNview provides functions to search for pathways by keyword, extract node information (e.g. gene sets, compound sets) and find the shortest path between two nodes.

Introduction

Molecular pathways have been widely used in omics data analysis. We previously developed an R/Bioconductor package called Pathview, which maps, integrates and visualizes omics data onto KEGG pathway graphs[@luo2013pathview]. Since its publication, Pathview has been widely used in numerous omics studies and analysis tools. Here we introduce the SBGNview package, which adopts the Systems Biology Graphical Notation (SBGN)[@le2009systems] and greatly extends the Pathview project by supporting multiple major pathway databases besides KEGG.

Key features:

Pathway diagram is drawn with SBGN notations

Inconsistent graph notations make pathway diagrams hard to interpret. For example, to represent a catalysis relationship between two nodes, scientists have been using different arc types. SBGN is a community-developed notation standard and has been used by major pathway databases (e.g. Reactome, SMPDB, PANTHER, METACROP etc.). SBGN defines a set of glyphs (nodes) to represent different types of molecules (proteins, simple chemicals etc.) and arcs (edges) to represent different types of relationships (e.g. stimulation, consumption, production, inhibition etc.). It also provides container glyphs to represent cellular location information (cellular compartments such as cytosol, nucleus etc.) and molecule complexes (e.g. protein complexes, dimers). Therefore, it provides more intuitive context information about pathways than simple networks (e.g. a Cytoscape network). For details about SBGN, please see http://sbgn.github.io/sbgn

Supports major pathway databases and user defined pathways.

As a community standard, SBGN is adopted by major pathway databases. They provide pathway maps in SBGN-ML format[@van2012software]. These include Reactome, Panther, PathwayCommons, MetaCrop etc. Therefore, users can use SBGNview to visualize and interpret their omics data on any pathway from these databases. In addition, molecular biologists often summarize new discoveries or literature knowledge in pathways and create their own pathway maps. This can be done in SBGN editing/drawing tools, and the pathways can be saved as SBGN-ML files. This makes SBGNview much more flexible than existing tools such as Pathview and PaintOmics, which only support KEGG pathways.

Extensive choices for graphical control.

Like Pathview, SBGNview supports multiple measurements for each gene/compound. In addition, it provides rich options to control glyph/arc attributes such as line color/width and text size/color/wrapping/positioning etc. This gives users maximal control over the pathway graphs.

Pathway related data extraction and analysis
- Search and automatically download pathway files from major databases. Keywords can be a pathway description or a molecule ID (gene symbol, compound name, UNIPROT, CHEBI, UNICHEM etc.).
- Pathway gene/compound set mapping and extraction. SBGNview can map input IDs to molecules in pathways. The mapping result can be used for gene set enrichment analysis.
- Extract nodes in pathways such as complex members, compartment members, node to class mapping.

Installation

Prerequisites

SBGNview depends on the following packages:

xml2: parse SBGN-ML files
rsvg: convert svg files to other formats (pdf, png, ps). librsvg2 is needed to install rsvg. See this page for more details: https://github.com/jeroen/rsvg
pathview: map between different ID types for gene and chemical compound.
igraph: find shortest paths
SummarizedExperiment: parse SummarizedExperiment objects

Install SBGNview

if (!requireNamespace("BiocManager", quietly = TRUE)){
     install.packages("BiocManager")
}
BiocManager::install( c("pathview", "xml2", "igraph", "rsvg", "SummarizedExperiment"))

Install SBGNview itself through Bioconductor

BiocManager::install(c("SBGNview"))

Install SBGNview through GitHub:

install.packages("devtools")
devtools::install_github("datapplab/SBGNview.data")
devtools::install_github("datapplab/SBGNview")

Clone the Git repository:

git clone  https://github.com/datapplab/SBGNview.git

Overview

To visualize omics data on SBGN pathway map, we need two inputs:

A SBGN-ML file containing the pathway information: nodes, edges and their layout (coordinates).
An omics data table in which rows are genes/compounds and columns are different measurements. The measurement can be any numeric values, such as fold change, abundance, mutation etc.

Given these two inputs, SBGNview can display omics data of each gene/compound on its corresponding node in the SBGN map. Each measured value will be displayed as a color corresponding to the value. When there are multiple samples/experiments, nodes are divided into multiple slices correspondingly. The output images can be in SVG, PDF, PNG or PS format.
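As a rough illustration of the second input, the sketch below builds a small numeric matrix in the expected shape (rows are molecules, columns are measurements) and converts each value to a color on a diverging scale. The IDs and values are invented, the helper `value.to.color` is hypothetical, and the palette and binning are assumptions; this is not SBGNview's own color-mapping code.

```r
# Toy omics table: rows are genes (made-up IDs), columns are samples.
toy.data <- matrix(
  c(-1.2, 0.4,  2.0,
     0.0, 1.5, -0.7),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("1001", "1002"), c("sample1", "sample2", "sample3"))
)

# Hypothetical helper: map values in [-limit, limit] to a green-grey-red scale.
value.to.color <- function(x, limit = 2, n.bins = 11) {
  palette <- colorRampPalette(c("green", "grey", "red"))(n.bins)
  clipped <- pmax(pmin(x, limit), -limit)   # clamp out-of-range values
  bin <- round((clipped + limit) / (2 * limit) * (n.bins - 1)) + 1
  palette[bin]
}

# One color per measurement: a node for gene "1001" would be drawn
# with three slices, one for each sample column.
toy.colors <- apply(toy.data, c(1, 2), value.to.color)
toy.colors
```

Each row of `toy.colors` corresponds to one node and each column to one slice of that node.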

A quick example {#quickStart}

This example visualizes a demo gene expression dataset on the pathway "Adrenaline and noradrenaline biosynthesis" and highlights several nodes, edges and a path of interest.

# load demo dataset and pathway information of built-in collection of SBGN-ML files
library(SBGNview)
data("gse16873.d","pathways.info","sbgn.xmls")
# search for pathways with user defined keywords
input.pathways <- find.pathways("Adrenaline and noradrenaline biosynthesis")
# render SBGN pathway graph and output image files
SBGNview.obj <- SBGNview(
          gene.data = gse16873.d[,1:3], 
          gene.id.type = "entrez",
          input.sbgn = input.pathways$pathway.id,
          output.file = "quick.start", 
          output.formats = c("png")
          ) 
print(SBGNview.obj)

Two image files (a .svg file and a .png file) will be created in the current working directory:

list.files( pattern = "quick.start", full.names = TRUE)

Figure: Quick start example: Adrenaline and noradrenaline biosynthesis pathway (quick.start_P00001.svg).

[Link to SBGN notation](https://cdn.rawgit.com/sbgn/process-descriptions/b2904462d11bd8d65e9c7a1318d95d468048cb50/templates/PD_L1V1.3.svg)
In this example, the [original pathway SBGN-ML file is from pathwayCommons (see below)](http://apps.pathwaycommons.org/pathways?uri=http%3A%2F%2Fidentifiers.org%2Fpanther.pathway%2FP00001) with [improved layout](#ourCollection)(node-edge overlaps are removed by routed edges). 


We can highlight nodes, edges and path:

```r
output.file(SBGNview.obj) <- "quick.start.highlight.elements"
SBGNview.obj + 
        highlight.arcs(class = "production",color = "red") +
        highlight.arcs(class = "consumption",color = "blue") +
        highlight.nodes(node.set = c("tyrosine", "(+-)-epinephrine"),
                       stroke.width = 4, stroke.color = "green") + 
        highlight.path(from.node = "tyrosine", to.node = "dopamine",
                      from.node.color = "green",
                      to.node.color = "blue",
                      shortest.paths.cols = "purple",
                      input.node.stroke.width = 6,
                      path.node.stroke.width = 5,
                      path.node.color = "purple",
                      path.stroke.width = 5,
                      tip.size = 10 )

```

Figure: Quick start example: Adrenaline and noradrenaline biosynthesis pathway. Highlight nodes and edges (quick.start.highlight.elements_P00001.svg).

The colors of consumption arcs and production arcs are set to blue and red, respectively. 

Tyrosine and epinephrine are highlighted by a thicker border (stroke width) and green color. Note that there are four nodes mapped to (+-)-epinephrine. 

A shortest path from tyrosine to dopamine is highlighted with purple arcs and nodes. The start (tyrosine) and end (dopamine) nodes have thicker borders and different colors. Since there are multiple dopamine nodes in the map, one of them is selected at random. If the user wants a specific node, the function *change.ids* can help find the node IDs corresponding to input IDs. The user can then run *highlight.nodes* and/or *highlight.path* again with node IDs instead of compound names. See [this example](#findNode) for details.
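To make the idea of a shortest path between two nodes concrete, here is a minimal base R sketch that runs a breadth-first search on a toy directed graph. The graph, the node names and the helper `shortest.path.bfs` are all invented for illustration. The prerequisites list igraph for finding shortest paths, so this sketch only shows the concept and is not how SBGNview computes it.

```r
# Toy directed graph as an adjacency list. Two routes lead from
# "tyrosine" to "dopamine": a short one and a longer detour.
toy.graph <- list(
  tyrosine    = c("detour1", "process1"),
  process1    = "L-dopa",
  "L-dopa"    = "process2",
  process2    = "dopamine",
  detour1     = "detourNode1",
  detourNode1 = "detour2",
  detour2     = "detourNode2",
  detourNode2 = "detour3",
  detour3     = "dopamine",
  dopamine    = character(0)
)

# Hypothetical helper: breadth-first search returning one shortest path.
shortest.path.bfs <- function(graph, from, to) {
  previous <- setNames(rep(NA_character_, length(graph)), names(graph))
  visited <- from
  queue <- from
  while (length(queue) > 0) {
    current <- queue[1]
    queue <- queue[-1]
    if (current == to) break
    for (neighbor in graph[[current]]) {
      if (!(neighbor %in% visited)) {
        visited <- c(visited, neighbor)
        previous[neighbor] <- current   # remember how we reached this node
        queue <- c(queue, neighbor)
      }
    }
  }
  if (!(to %in% visited)) return(character(0))   # no path exists
  path <- to
  while (path[1] != from) path <- c(previous[[path[1]]], path)
  path
}

toy.path <- shortest.path.bfs(toy.graph, "tyrosine", "dopamine")
toy.path
```

Because breadth-first search explores nodes in order of distance from the start, the first time it reaches the target it has found a path with the fewest edges.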



# Getting started

*SBGNview* is the main function to overlay omics data on SBGN pathway maps. It extracts node and edge data from a SBGN-ML file and creates a SBGN graph in SVG format. Then it maps omics data to the glyphs and renders the graph with the mapped data as colors. Currently it maps gene/protein omics data to "macromolecule" glyphs and compound omics data to "simple chemical" glyphs. Please see its documentation for more details. The *SBGNview* function returns a *SBGNview* object, which contains the information necessary to render the SBGN graph and can be further modified to change graph features. See [this section](#sbgnviewObj) for more details.

## SBGN pathway file (SBGN-ML)
SBGN pathway is defined in a special XML format (SBGN-ML file). It contains information of the pathway content (molecules) as well as graph layout information. There are two main types of data in SBGN-ML files:
1. node data (in tag "glyph"), such as node location, width, height and node class (macromolecule, simple chemical etc.). 
2. edge data (in tag "arc"), such as arc class, start node and end node.
For more details, see:
https://github.com/sbgn/sbgn/wiki/SBGN_ML
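The two kinds of data can be seen by parsing a very small SBGN-ML document with xml2, which is listed among the prerequisites. The XML snippet below is hand-written for illustration (IDs, coordinates and label are invented), and the table layout is an assumption; SBGNview's own parser may organize the data differently.

```r
library(xml2)

# A tiny hand-written SBGN-ML document: two glyphs and one arc.
sbgn.text <- '<sbgn xmlns="http://sbgn.org/libsbgn/0.2">
  <map language="process description">
    <glyph id="glyph1" class="simple chemical">
      <label text="tyrosine"/>
      <bbox x="10" y="20" w="60" h="60"/>
    </glyph>
    <glyph id="glyph2" class="process">
      <bbox x="120" y="40" w="20" h="20"/>
    </glyph>
    <arc id="arc1" class="consumption" source="glyph1" target="glyph2">
      <start x="70" y="50"/>
      <end x="120" y="50"/>
    </arc>
  </map>
</sbgn>'

doc <- read_xml(sbgn.text)
xml_ns_strip(doc)   # drop the default namespace so plain XPath names work

# Node data: one row per "glyph" tag, with class and bounding box size.
glyphs <- xml_find_all(doc, "//glyph")
bboxes <- xml_find_first(glyphs, "./bbox")
node.table <- data.frame(
  id    = xml_attr(glyphs, "id"),
  class = xml_attr(glyphs, "class"),
  w     = as.numeric(xml_attr(bboxes, "w")),
  h     = as.numeric(xml_attr(bboxes, "h")),
  stringsAsFactors = FALSE
)

# Edge data: one row per "arc" tag, with class, start node and end node.
arcs <- xml_find_all(doc, "//arc")
edge.table <- data.frame(
  class  = xml_attr(arcs, "class"),
  source = xml_attr(arcs, "source"),
  target = xml_attr(arcs, "target"),
  stringsAsFactors = FALSE
)

node.table
edge.table
```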

### SBGN-ML pathway file from online databases

Several online databases provide SBGN-ML files, such as pathwayCommons, Reactome and MetaCrop. They can be downloaded from their webpage or FTP site.

### SBGNview's SBGN-ML file collection{#ourCollection}

Many pathways from the above databases don't have a desirable layout and often have extensive node-node overlaps and node-edge crossings. Thus we refined the layout and removed node-node overlaps. To resolve node-edge crossings, we computed spline edges and added additional elements to the SBGN-ML file to encode them. The resulting collection of SBGN-ML files is available in a separate [GitHub repository](https://github.com/datapplab/SBGN-ML.files/tree/master/data/SBGN). **SBGNview** can automatically search this pathway collection and download the SBGN-ML files.  Users can further modify the SBGN-ML files using other tools (e.g. [newt editor](http://newteditor.org/)) to obtain the desired node layout. 
The package used to layout nodes and route spline edges is currently under development and will be released in the near future. 

#### Information about pre-generated SBGN-ML file collection
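We can check the information of all pre-generated SBGN-ML files:
```r
# load summary statistics of the pre-generated SBGN-ML file collection
data("pathways.stat")
# keep the first three samples of the demo dataset for later examples
gse16873 <- gse16873.d[,1:3]
# pathway IDs found in the quick start example
input.pathway.ids = input.pathways$pathway.id
head(pathways.info)
pathways.stat
```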

There are two common scenarios for using SBGNview:

* Using our pre-generated SBGN-ML files. In this scenario, SBGNview can automatically download pathway files and map between ID types.
* Using SBGN-ML files from other sources. In this scenario, more parameters and/or ID mapping files are needed. Please see the documentation of function SBGNview for more details.

Search for pathways by keywords {#searchPathways}

SBGNview has several functions to search for pathways by keyword and automatically download SBGN-ML files.

pathways <- find.pathways(c("bile acid","bile salt"))
head(pathways)
pathways.local.file <- download.sbgn.file(pathways$pathway.id[1:3])
pathways.local.file

By default, find.pathways searches for keywords in pathway names. It can also search by different ID types:

pathways <- find.pathways(c("tp53","Trp53"),keyword.type = "SYMBOL")
head(pathways)
pathways <- find.pathways(c("K04451","K10136"),keyword.type = "KO")
head(pathways)
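The idea of a keyword search over pathway names can be sketched in base R as follows. The table, its column names and the helper `search.by.keyword` are invented, and the matching rules (case-insensitive, any keyword may match, keywords treated as regular expressions) are assumptions rather than the documented behavior of find.pathways.

```r
# Toy pathway table with made-up IDs and names.
toy.pathways <- data.frame(
  pathway.id   = c("P1", "P2", "P3"),
  pathway.name = c("Bile acid biosynthesis",
                   "Bile salt transport",
                   "Glycolysis"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep rows whose name matches at least one keyword.
search.by.keyword <- function(keywords, pathway.table) {
  pattern <- paste(keywords, collapse = "|")
  hits <- grepl(pattern, pathway.table$pathway.name, ignore.case = TRUE)
  pathway.table[hits, , drop = FALSE]
}

toy.hits <- search.by.keyword(c("bile acid", "bile salt"), toy.pathways)
toy.hits
```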

Different layout for the same pathway

Researchers may have different tastes for a "good looking" layout. We have created different layouts for each pathway. Users can download them from the pre-generated SBGN-ML files and try them.

User customized SBGN-ML file

We can also create a SBGN-ML file from scratch. Several tools, such as Newt editor (http://newteditor.org/), let the user draw a pathway diagram and save it as a SBGN-ML file. These tools may also be able to generate a primitive pathway layout, but such layouts often have too many node-node overlaps and edge-node crossings. Therefore, we recommend using our SBGN-ML file collection mentioned above, which has been optimized to solve these problems.

Omics data

SBGNview can visualize a range of omics data, including both gene (or transcript, protein, enzyme) data and compound (or metabolite, chemical, small molecules) data.

Gene expression data

Gene/protein related data will be mapped to "macromolecule" nodes on a SBGN map.

Chemical compound data

Chemical compound data will be mapped to "simple chemical" nodes on a SBGN map.

Here we simulate a compound dataset.

cpd.data <- sim.mol.data(mol.type = "cpd", id.type = "KEGG COMPOUND accession", nmol = 50000, nexp = 2)
head(cpd.data)

Visualize gene data

Most SBGN-ML files from online resources use their own ID types, which differ from the ID type in the omics data. If the ID types are different, we need to map the omics IDs to the node IDs in the SBGN-ML file. In the quick start example ("Adrenaline and noradrenaline biosynthesis" pathway), the SBGN-ML file uses pathwayCommons IDs for gene/protein nodes, whereas the omics dataset uses Entrez gene IDs. The function SBGNview can automatically map common ID types such as ENTREZ, UniProt etc. to nodes in our pre-generated SBGN-ML files, as shown in the quick start example. We can also do it manually using the function change.data.id, which is called by SBGNview to do ID mapping. Supported ID type pairs can be found in data(mapped.ids). change.data.id uses pre-generated mapping tables or pathview to do the mapping. If the input-output ID type pair is not in data(mapped.ids) or can't be mapped by pathview, the user needs to provide the mapping table explicitly using the "id.mapping.table" argument.

Let's change the IDs in the gene expression omics data.

gene.data <- gse16873
head(gene.data[,1:2])

gene.data <- change.data.id(data.input.id = gene.data,
                           input.type = "entrez",
                           output.type = "pathwayCommons",
                           cpd.or.gene = "gene",
                           sum.method = "sum"
                           )

head(gene.data[,1:2])
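What re-keying a data table with a "sum" summary means can be shown on toy data: when several input IDs map to the same node, their rows are added, and when one input ID maps to several nodes, its row is copied to each. All IDs, values and the mapping table below are invented, and this base R sketch is only an assumption about the general idea, not the implementation of change.data.id.

```r
# Toy expression matrix keyed by input IDs.
toy.expr <- matrix(
  c(1, 2, 4,
    10, 20, 40),
  ncol = 2,
  dimnames = list(c("geneA", "geneB", "geneC"), c("s1", "s2"))
)

# Toy mapping table: geneA and geneB share node1; geneC maps to two nodes.
id.map <- data.frame(
  input.id  = c("geneA", "geneB", "geneC", "geneC"),
  output.id = c("node1", "node1", "node2", "node3"),
  stringsAsFactors = FALSE
)

# Attach an output ID to every data row (rows are duplicated if needed).
expr.table <- data.frame(input.id = rownames(toy.expr), toy.expr,
                         stringsAsFactors = FALSE)
merged <- merge(id.map, expr.table, by = "input.id")

# Summarize rows that share an output ID by summing them.
mapped.data <- rowsum(as.matrix(merged[, colnames(toy.expr)]),
                      group = merged$output.id)
mapped.data
```

Here node1 receives the sum of geneA and geneB, while node2 and node3 each receive a copy of geneC.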

Now we run SBGNview, the main function to overlay omics data on SBGN map.

SBGNview.obj <- SBGNview(
              gene.data = gene.data,
              input.sbgn = "P00001",
              output.file = "test_output",
              gene.id.type = "pathwayCommons",
              output.formats =  c("svg")
    )
SBGNview.obj

By default, SBGNview generates a .svg file. Other formats (pdf, ps, png) can be requested through the output.formats argument, in which case the additional files are created in the same folder. In this example only "svg" is requested.

Figure: Visualization of gene expression data (test_output_P00001.svg).

## Visualize both gene data and compound data


For demonstration purposes, we manually change the KEGG compound IDs to pathwayCommons compound IDs, although *SBGNview* can do this automatically (e.g. in the [quick start example](#quickStart)).


```r
cpd.data <- change.data.id(data.input.id = cpd.data,
                           input.type = "kegg.ligand",
                           output.type = "pathwayCommons",
                           cpd.or.gene = "compound",
                           sum.method = "sum"
                           )
head(cpd.data)
```

Now we can visualize both gene and compound data. In this example, we use the original gene expression data with "ENTREZ" IDs to show SBGNview's automatic ID mapping ability.

SBGNview.obj <- SBGNview(
                gene.data = gse16873,
                cpd.data = cpd.data,
                input.sbgn = "P00001",
                output.file = "test_output.gene.compound",
                gene.id.type = "entrez",
                cpd.id.type = "pathwayCommons",
                output.formats =  c("svg")
                )
SBGNview.obj

Figure: Visualization of both gene expression and compound abundance data (test_output.gene.compound_P00001.svg).

## About *SBGNview* object {#sbgnviewObj}
**SBGNview** operates in a way similar to **ggplot2**:
The main function *SBGNview* returns a *SBGNview* object (similar to the *ggplot* object "p" returned by function *ggplot* in **ggplot2**), which contains all information needed to render SBGN graph, including output file path. Printing this object will render the graph and write output image files. *SBGNview* object can be further modified by several built-in functions to highlight nodes/edges/paths (e.g. *SBGNview.obj*+*highlight.nodes()*, similar to *p*+*geom_boxplot()* in **ggplot2**).

* These operations will generate plot files:
    + *SBGNview(...) + highlight.nodes(...)* or *SBGNview.obj + highlight.nodes(...)*

          How it works: The functions return a *SBGNview* object to the R console. The returned object is evaluated as a top-level R expression and is therefore implicitly printed using the *print.SBGNview* function in the **SBGNview** package. For more details, please see the documentation of function *print.SBGNview*.
    + run *SBGNview.obj* in R console

          The mechanism is the same as above. The object run in R console is implicitly printed.
    + *print(SBGNview.obj)*

          In this case the "print.SBGNview" function is run explicitly.
    + *for (i in 1:2) {print(SBGNview.obj)}*   

          Same as above: the "print.SBGNview" function is run explicitly.
* These commands will NOT generate plot files:
    + *SBGNview.obj = SBGNview(...)+highlight.nodes(...)*

          In this case, the assign operation "=" made the returned object invisible thus not printed
    + *for (i in 1:2) {SBGNview.obj}*

          In this case SBGNview.obj is no longer a top-level R expression thus won't be implicitly printed.
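The printing rules above follow from R's notion of visible and invisible values, which can be checked with `withVisible()` on a toy S3 class. The class `toyView`, its constructor and the counter are invented stand-ins; the print method only counts calls where a real one would write image files.

```r
render.count <- 0

# Toy constructor and print method standing in for a renderable object.
new.toy.view <- function() structure(list(), class = "toyView")
print.toyView <- function(x, ...) {
  render.count <<- render.count + 1   # stands in for "write the image file"
  cat("rendering, call number", render.count, "\n")
  invisible(x)
}

# Assignment returns its value invisibly, so R would not auto-print it.
assigned.visible <- withVisible(toy.obj <- new.toy.view())$visible

# A bare object is a visible value; at top level R would auto-print it.
bare.visible <- withVisible(toy.obj)$visible

# A for loop returns NULL invisibly, so a bare object inside it is not printed.
loop.visible <- withVisible(for (i in 1:2) toy.obj)$visible

# Explicit print() always calls the method, even inside a loop.
for (i in 1:2) print(toy.obj)

c(assigned = assigned.visible, bare = bare.visible, loop = loop.visible)
render.count
```

Only visible top-level values and explicit `print()` calls reach the print method, which is why the assignment and the bare loop produce no output files.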

### Structure of *SBGNview* object
```r
result.one.sbgn <- SBGNview.obj$data[[1]]
names(result.one.sbgn)
glyphs <- result.one.sbgn$glyphs.list
arcs <- result.one.sbgn$arcs.list
str(glyphs[[1]])
str(arcs[[1]])
```

Change output file in a SBGNview object

We can change the output file using built-in function output.file:

output.file(SBGNview.obj)
output.file(SBGNview.obj) <- "test.change.output.file"
output.file(SBGNview.obj)
SBGNview.obj
output.file(SBGNview.obj) <- "test.print"
output.file(SBGNview.obj)
print(SBGNview.obj)

Try different layout for the same pathway.{#tryDifferentLayout}

If the default layout is not ideal, users have two options:

Modify the layout manually using tools like newt editor
Download pre-generated SBGN-ML files with different layout. https://github.com/datapplab/SBGN-ML.files/tree/master/data/SBGN

download.file("https://raw.githubusercontent.com/datapplab/SBGNhub/master/data/SBGN.with.stamp/pathwayCommons/http___identifiers.org_panther.pathway_P00001.1.sbgn",destfile = "P00001.new.layout.sbgn")
SBGNview(
          gene.data = gse16873, 
          gene.id.type = "entrez",
          input.sbgn = "P00001.new.layout.sbgn",
          sbgn.gene.id.type = "pathwayCommons",
          output.file = "test.different.layout", 
          output.formats =  c("svg")
          )

Figure: Graph with different layout (test.different.layout_P00001.new.layout.sbgn.svg).

# Modify graph elements
It is useful to highlight interesting nodes, edges or paths in a pathway map. This can be done by modifying the *SBGNview* object, which contains all information needed to render a SBGN map.


## Built-in functions
Like ggplot2, the *SBGNview* object can be further modified by concatenating it with modification functions using binary operator *+* (see [quick start](#quickStart) for example). 
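For readers unfamiliar with this style, the sketch below shows one way `+` chaining can be built in R with an S3 method for the `+` operator. The classes `toyMap` and `toyModifier` and their constructors are invented for illustration; this is an assumption about the general technique, not SBGNview's source code.

```r
# Toy object that collects modifications to apply at render time.
new.toy.map <- function() {
  structure(list(modifiers = list()), class = "toyMap")
}

# Toy modifier, playing the role of a highlight function's return value.
toy.highlight <- function(color) {
  structure(list(color = color), class = "toyModifier")
}

# S3 method for "+": append the right-hand modifier and return the object,
# so that further "+" calls can be chained.
"+.toyMap" <- function(e1, e2) {
  e1$modifiers <- c(e1$modifiers, list(e2))
  e1
}

toy.map <- new.toy.map() + toy.highlight("red") + toy.highlight("blue")
length(toy.map$modifiers)
```

Because each `+` returns the updated object, modifiers accumulate from left to right in the order they are written.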

## Highlight nodes

### Highlight all nodes
```r
highlight.all.nodes.sbgn.obj <-  SBGNview.obj +
        highlight.nodes( 
# Here we set argument "node.set" to select all nodes
            node.set = "all",
            stroke.width = 4, stroke.color = "green")
output.file(highlight.all.nodes.sbgn.obj) = "highlight.all.nodes"
print(highlight.all.nodes.sbgn.obj)

```

Figure: Highlight all nodes (highlight.all.nodes_P00001.svg).

### Highlight nodes by class
```r
highlight.macromolecule.sbgn.obj <-  SBGNview.obj +
        highlight.nodes(
# Here we set argument "select.glyph.class" to select macromolecule nodes
            select.glyph.class = "macromolecule",
            stroke.width = 4, stroke.color = "green")
output.file(highlight.macromolecule.sbgn.obj) = "highlight.macromolecule"
print(highlight.macromolecule.sbgn.obj)

```

Figure: Highlight macromolecule nodes (highlight.macromolecule_P00001.svg).

### Show node IDs instead of node labels
```r
highlight.all.nodes.sbgn.obj <-  SBGNview.obj +
        highlight.nodes( node.set = "(+-)-epinephrine", stroke.width = 4, stroke.color = "green",
# Here we set argument "show.glyph.id" to display node ID instead of the original label.
                         show.glyph.id = TRUE,
                        label.font.size = 10)
output.file(highlight.all.nodes.sbgn.obj) = "highlight.all.id.nodes"
print(highlight.all.nodes.sbgn.obj)

```

Figure: Highlight nodes using node IDs (highlight.all.id.nodes_P00001.svg).

## Adjust node labels.

### Label position, font size, color, change labels
The function *highlight.nodes* also can be used to adjust labels. In this example, we move the label horizontally and vertically, change their color and font size.
```r
my.labels <- c("Tyr","epinephrine")
names(my.labels) <- c("tyrosine", "(+-)-epinephrine")
SBGNview.obj.adjust.label <-  SBGNview.obj +
        highlight.nodes( node.set = c("tyrosine", "(+-)-epinephrine"), stroke.width = 4, stroke.color = "green",
                         label.x.shift = 0,
# Labels are moved up a little bit                        
                         label.y.shift = -20,
                         label.color = "red",
                         label.font.size = 30,
                         label.spliting.string = "", 
# Node labels can be customized by a named vector. The names of the vector are the IDs assigned to argument "node.set". The values of the vector are the new labels for display.
                         labels = my.labels)
output.file(SBGNview.obj.adjust.label) <- "adjust.label"
print(SBGNview.obj.adjust.label)

```

Figure: Modify node labels (adjust.label_P00001.svg).

### Label text wrapping into multiple lines
Some nodes may have long labels thus overlap with surrounding graph elements. In this case we can set the parameter *label.spliting.string* to "any" so the label will be wrapped in multiple lines that fit the width of the node.
```r
SBGNview.obj.change.label.wrapping <-  SBGNview.obj +
        highlight.nodes( node.set = c("tyrosine", "(+-)-epinephrine"), stroke.width = 4, stroke.color = "green",
                         show.glyph.id = TRUE,
                         label.x.shift = 10,label.y.shift = 20,label.color = "red",
                         label.font.size = 10,label.spliting.string = "any")
output.file(SBGNview.obj.change.label.wrapping) = "change.label.wrapping"
print(SBGNview.obj.change.label.wrapping)

```

Figure: Change how labels are wrapped (change.label.wrapping_P00001.svg).

## When one input ID maps to multiple nodes {#findNode}
In the example above, we saw that one input ID (e.g. "(+-)-epinephrine") can be mapped to multiple nodes in the graph. If we want to focus on particular ones, we can use the function *highlight.nodes* with *show.glyph.id = TRUE* to display the node IDs, which are unique to each node:
```r
test.show.glyph.id <- SBGNview.obj+
    highlight.nodes( node.set = c("tyrosine", "(+-)-epinephrine"), stroke.width = 4,
                     stroke.color = "green", show.glyph.id = TRUE,
                     label.x.shift = 10,label.y.shift = 20,label.color = "red",
                     label.font.size = 10,
# When "label.spliting.string" is set to a string that is not in the label (including an empty string ""), the label will not be wrapped into multiple lines.                     
                     label.spliting.string = "")
output.file(test.show.glyph.id) <- "test.show.glyph.id"
print(test.show.glyph.id)

```

Figure: Show node IDs of mapped nodes (test.show.glyph.id_P00001.svg).

We can find the mapping between input IDs and node IDs:
```r
mapping <- change.ids(input.ids = c("tyrosine", "(+-)-epinephrine"),
           input.type = "CompoundName",
           output.type = "pathwayCommons",
           cpd.or.gene = "compound",
           limit.to.pathways = input.pathway.ids[1] )

mapping
```

We can pick two nodes to highlight and find a shortest path between them.

output.file(SBGNview.obj) <- "highlight.by.node.id"

SBGNview.obj+  highlight.nodes(node.set = c("tyrosine", "(+-)-epinephrine"),
                       stroke.width = 4, stroke.color = "red") + 
    highlight.path(from.node =  "SmallMolecule_96737c854fd379b17cb3b7715570b733",
                   to.node =   "SmallMolecule_7753c3822ee83d806156d21648c931e6",
                   node.set.id.type = "pathwayCommons",
                      from.node.color = "green",
                      to.node.color = "blue",
                      shortest.paths.cols = c("purple"),
                      input.node.stroke.width = 6,
                      path.node.stroke.width = 3,
                      path.node.color = "purple",
                      path.stroke.width = 5,
                      tip.size = 10)

Figure: Highlight nodes and shortest path using node IDs (highlight.by.node.id_P00001.svg).

## Modify *SBGNview* object directly
More graph features can be controlled by directly modifying the *SBGNview* object.
```r
result.one.sbgn <- SBGNview.obj$data[[1]]
names(result.one.sbgn)
glyphs <- result.one.sbgn$glyphs.list
arcs <- result.one.sbgn$arcs.list
str(glyphs[[1]])
str(arcs[[1]])
```

Retrieve pathway related information{#extractInformation}

Extract node information

Node information can be extracted using function sbgn.nodes.

node.info <- sbgn.nodes(input.sbgn = c("P00001","P00002"),
                       output.gene.id.type = "SYMBOL",
                       output.cpd.id.type = "chebi",
                       species = "hsa"
                       )

The returned list contains information about all nodes in the SBGN-ML file.

head(node.info[[1]])

For example, the complex membership information can be retrieved by accessing the "complex" element. Macromolecules are represented by gene symbols. Simple chemicals are represented by ChEBI IDs (e.g. 33568). When multiple IDs of the output type match the same node in the SBGN-ML file, the IDs are concatenated by "; ". For instance, the complex with ID "Complex_4e65cdd554d14679587b7822e6426705" has two members: a protein (symbol Slc18A2 etc.) and a simple chemical (ChEBI 33568).
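IDs concatenated by "; " are easy to split back into one row per ID for downstream analysis. The toy table below, including its column names and values, is invented and does not reproduce the actual columns returned by sbgn.nodes.

```r
# Toy node table: one node carries two symbols joined by "; ".
toy.nodes <- data.frame(
  glyph.id   = c("node1", "node2"),
  mapped.ids = c("SLC18A1; SLC18A2", "33568"),
  stringsAsFactors = FALSE
)

# Split each entry on the literal separator "; ".
id.list <- strsplit(toy.nodes$mapped.ids, "; ", fixed = TRUE)
names(id.list) <- toy.nodes$glyph.id

# Long format: one row per (node, ID) pair.
long.table <- data.frame(
  glyph.id  = rep(names(id.list), lengths(id.list)),
  mapped.id = unlist(id.list, use.names = FALSE),
  stringsAsFactors = FALSE
)
long.table
```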

ID mapping

SBGNview can automatically map common ID types to SBGN-ML glyphs in our pre-generated SBGN-ML files. Supported ID types can be accessed as follows:

data("mapped.ids")

Map between two types of IDs

Besides change.data.id, which changes IDs for omics data, SBGNview provides functions to map between different types of IDs:

mapping <- change.ids(
  input.ids = c("tyrosine", "(+-)-epinephrine"),
  input.type = "CompoundName",
  output.type = "pathwayCommons",
  cpd.or.gene = "compound",
  limit.to.pathways = "P00001"
)

head(mapping)

mapping <- change.ids(
  input.ids = c("tyrosine", "(+-)-epinephrine"),
  input.type = "CompoundName",
  output.type = "chebi",
  cpd.or.gene = "compound"
)

head(mapping)

Re-use downloaded ID mapping tables

SBGNview has generated pairwise ID mapping tables (between various gene/compound ID types, pathway glyph IDs and pathway IDs) for the pre-collected SBGN-ML files. SBGNview automatically downloads a mapping table into the folder specified by the parameter "SBGNview.data.folder" if the file is not already in that folder. Therefore, the user can keep the downloaded files and specify "SBGNview.data.folder" to re-use them. The default SBGNview.data.folder is "SBGNview.tmp.data" in the working directory. In the following example, we set "SBGNview.data.folder" to "./SBGNview.tmp.data" so SBGNview doesn't need to download the ID mapping table again.

mapping <- change.ids(
  input.ids = c("tyrosine"),
  input.type = "CompoundName",
  output.type = "chebi",
  cpd.or.gene = "compound",
  SBGNview.data.folder = "./SBGNview.tmp.data"
)

head(mapping)
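The download-only-if-missing behavior described above is a common caching pattern, sketched here in base R. The helper `get.cached.file`, the file name and the fake fetch function are all hypothetical; a real fetch function would download the file, whereas this one just writes a placeholder so the sketch runs offline.

```r
# Hypothetical helper: return the local path, fetching the file only
# when it is not already present in the data folder.
get.cached.file <- function(file.name, data.folder, fetch) {
  if (!dir.exists(data.folder)) dir.create(data.folder, recursive = TRUE)
  local.path <- file.path(data.folder, file.name)
  if (!file.exists(local.path)) {
    fetch(local.path)   # only called on a cache miss
  }
  local.path
}

# Fake fetch function that counts how often it is called.
fetch.count <- 0
fake.fetch <- function(destfile) {
  fetch.count <<- fetch.count + 1
  writeLines("input.id\toutput.id", destfile)
}

# Start from an empty cache folder inside the session's temp directory.
cache.dir <- file.path(tempdir(), "toy.cache")
unlink(cache.dir, recursive = TRUE)

first.path  <- get.cached.file("toy.mapping.tsv", cache.dir, fake.fetch)
second.path <- get.cached.file("toy.mapping.tsv", cache.dir, fake.fetch)
fetch.count
```

The second call finds the file already in the folder and skips the fetch, which is the effect of pointing repeated calls at the same data folder.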

Extract molecule list from pathways {#extractList}

mol.list <- get.mol.list(
                        database = "metacrop"
                        ,mol.list.ID.type = "ENZYME"
                        ,org = "ath"
                        ,output.pathway.name = FALSE
                        ,truncate.name.length = 50
)

mol.list[[1]]

mol.list <- get.mol.list(
                 database = "pathwayCommons",
                 mol.list.ID.type = "ENTREZID",
                 org = "hsa"
)

mol.list[[1]]

mol.list <- get.mol.list(
                 database = "pathwayCommons",
                 mol.list.ID.type = "ENTREZID",
                 org = "mmu"
)

mol.list[[2]]

mol.list <- get.mol.list(
                 database = "MetaCyc",
                 mol.list.ID.type = "KO",
                 org = "eco"
)

mol.list[[2]]

mol.list <- get.mol.list(
                 database = "pathwayCommons",
                 mol.list.ID.type = "chebi",
                 cpd.or.gene = "compound"
)

mol.list[[2]]

Example using selected database

Use Reactome pathway database

is.reactome <- pathways.info[,"sub.database"]== "reactome"
reactome.ids <- pathways.info[is.reactome ,"pathway.id"]
SBGNview.obj <- SBGNview(
              gene.data = gse16873, 
              gene.id.type = "entrez",
              input.sbgn =  reactome.ids[1:2],
              output.file = "demo.reactome", 
              output.formats =  c("svg")
            )
SBGNview.obj

Use MetaCrop pathway database

is.metacrop <- pathways.info[,"sub.database"]== "MetaCrop"
metacrop.ids <- pathways.info[is.metacrop ,"pathway.id"]
SBGNview.obj <- SBGNview(
              gene.data = c(), 
              input.sbgn =  metacrop.ids[1:2],
              output.file = "demo.metacrop", 
              output.formats =  c("svg")
            )
SBGNview.obj

Test SBGN reference cards

download.sbgn.file(c("AF_Reference_Card.sbgn"
                 ,"PD_Reference_Card.sbgn"
                 ,"ER_Reference_Card.sbgn"
                 ))

SBGNview.obj <- SBGNview(
             gene.data = c()
             ,input.sbgn = c("AF_Reference_Card.sbgn"
                       ,"PD_Reference_Card.sbgn"
                       ,"ER_Reference_Card.sbgn"
                       )
             ,sbgn.gene.id.type ="glyph"

             ,output.file = "./test.refcards" 
             ,output.formats = c("pdf")
             ,font.size = 1
             ,logic.node.font.scale = 10
             ,status.node.font.scale = 10
           )
SBGNview.obj

FAQs

Color key

Turn off color key

# Not run!
SBGNview(
 key.pos = "none"
)

References

Session Info

sessionInfo()

chemokine/OmicsSBGN documentation built on June 27, 2019, 7:52 p.m.
