Biological pathway data integration has attracted growing interest in recent years. This interest stems from the continuously increasing volume of prior knowledge and from the many challenges scientists face when studying biological pathways. Multipath is a framework that helps retrace the use of specific pathway knowledge in specific publications and eases the integration of multiple pathway types with further influencing knowledge sources. Using Multipath, BioPAX-encoded pathways can be parsed and embedded into multilayered graphs. Modifications can be applied to these graphs to generate different views. The package is implemented as part of the Multipath Project directed by Dr. Frank Kramer.
Multipath depends on the following packages: UniProt.ws, dbparser, rBiopaxParser, mully, TCGAretriever, stringr, svMisc, uuid, dplyr and crayon. Please make sure to install the packages UniProt.ws, rBiopaxParser and mully before using the package.
To install the UniProt.ws package:
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("UniProt.ws")
To install the mully package:
require(devtools)
install_github("frankkramer-lab/mully")
library(mully)
To install the rBiopaxParser package:
require(devtools)
install_github("frankkramer-lab/rBiopaxParser")
library(rBiopaxParser)
To install the Multipath package:
require(devtools)
install_github("frankkramer-lab/Multipath")
library(Multipath)
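Before installing, it can help to see which of the dependencies named above are still missing. The following is a minimal sketch in base R; the variable names are illustrative and not part of Multipath.

```r
# Dependencies named in this document
deps <- c("UniProt.ws", "dbparser", "rBiopaxParser", "mully", "TCGAretriever",
          "stringr", "svMisc", "uuid", "dplyr", "crayon")

# Compare against the packages found in the current library paths
installed <- rownames(installed.packages())
missing_deps <- setdiff(deps, installed)

# Character vector of dependencies that still need to be installed
print(missing_deps)
```

Only the packages reported here need one of the installation steps above.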
Add a drug layer to a mully graph
This function adds a DrugBank layer to an existing mully model. Its arguments include the dataframe of parsed DrugBank information, obtained using loadDBXML(DrugBankFile).
The function returns a mully graph with the added drug layer.
Example
g=mully("DrugBank",direct=TRUE)
data=loadDBXML(DrugBankFile)
g=addDBLayer(g,data,c("DB00001","DB06605"))
Add OMIM Layer with its respective Nodes and Edges
The function first extracts the proteins available in a mully graph. The proteins are then queried in the UniProtKB (UPKB) database to check whether they have a cross-reference to OMIM. The proteins, along with the returned OMIM IDs, are collected in a data frame, which is passed to another function that checks whether each returned OMIM ID also references the same protein. Finally, a new disease layer is added to the mully graph, and an edge is created between an OMIM ID and a protein only if the connection is confirmed. The same process is repeated with KEGG genes to create edges between the OMIM IDs and the genes. Please note that romim::set_key('KEY') must be called before this function. The KEY can be requested via OMIM's official website. The function's arguments include the biopax object, obtained using readBiopax(filepath).
Example
biopax=readBiopax("wnt.owl")
pathwayID=listPathways(biopax)$id[1]
g=Multipath::pathway2Mully(biopax,pathwayID)
g=addDiseaseLayer(g,biopax)
Add KEGG Gene Layer with its respective Nodes and Edges
The function first extracts the proteins available in a mully graph. The proteins are then queried in the UniProtKB (UPKB) database to check whether they have a cross-reference to KEGG. The proteins, along with the returned genes, are collected in a data frame, which is passed to another function that checks whether each returned gene also references the same protein. Finally, a new genes layer is added to the mully graph, and an edge is added between a gene and a protein only if the connection is confirmed. The function's arguments include the biopax object, obtained using readBiopax(filepath).
Example
biopax=readBiopax("wnt.owl")
pathwayID=listPathways(biopax)$id[1]
g=Multipath::pathway2Mully(biopax,pathwayID)
g=addGenesLayer(g,biopax)
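The two-way check described for the gene and disease layers can be illustrated without any database access. The sketch below uses toy data frames with placeholder IDs and hypothetical column names; it is not the package's implementation, only an example of keeping a pair when both directions report it.

```r
# Toy forward mapping: protein -> gene, as reported by the protein database
protein_to_gene <- data.frame(
  protein = c("P1", "P2", "P3"),
  gene    = c("hsa:1", "hsa:2", "hsa:3"),
  stringsAsFactors = FALSE
)

# Toy reverse mapping: gene -> protein, as reported by the gene database
gene_to_protein <- data.frame(
  gene    = c("hsa:1", "hsa:2", "hsa:9"),
  protein = c("P1", "P7", "P3"),
  stringsAsFactors = FALSE
)

# Merging on both columns keeps only the pairs present in both directions
confirmed <- merge(protein_to_gene, gene_to_protein, by = c("protein", "gene"))
print(confirmed)
```

Only the confirmed pairs would become edges between the two layers.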
Track a modification of a graph
The function saves a modification applied to a mully graph. It applies the step to the graph and records it in the pathwayView Object. Not all arguments are mandatory; which ones are required depends on the step to be applied. The function needs the following arguments:
The function returns the view with the added step.
Example
g=mully:::demo()
view=pathwayView(g,"View1")
view=addStep(view,"remove","layer","disease")
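To make the idea of recording steps concrete, here is a minimal base R sketch. The functions new_view and record_step and the column names are hypothetical stand-ins, not the pathwayView implementation; they only show a step log kept as a dataframe next to the timestamps.

```r
# Hypothetical stand-in for a view: a name, two timestamps and a step log
new_view <- function(name) {
  list(
    name = name,
    created = Sys.time(),
    modified = Sys.time(),
    steps = data.frame(action = character(), element = character(),
                       name = character(), stringsAsFactors = FALSE)
  )
}

# Append one modification step and refresh the modification timestamp
record_step <- function(view, action, element, name) {
  step <- data.frame(action = action, element = element, name = name,
                     stringsAsFactors = FALSE)
  view$steps <- rbind(view$steps, step)
  view$modified <- Sys.time()
  view
}

view <- new_view("View1")
view <- record_step(view, "remove", "layer", "disease")
print(view$steps)
```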
Add a protein layer to a mully graph
This function is used to add a UniProt protein layer to an existing mully model. The function needs the following arguments:
- g - The mully graph
- up - The UniProt.ws Object
- proteinList - The list of UniProt Ids of the proteins to be added
- col - The list of attributes associated to the UniProtKB Entries to be retrieved
The function returns the mully graph with the added UniProt layer. The function should be preceded by UniProt.ws() to get the UniProt.ws Object.
Example
up=UniProt.ws()
g=mully("UniProt")
g=addUPKBLayer(g,up,proteinList=c("P02747","P00734","P07204"),col=c("UNIPROTKB","PROTEIN-NAMES"))
Download Reactome Pathways in BioPAX level 2 and 3
This function is used to download one pathway or a list of pathways, encoded in BioPAX level 2 or 3. The function needs the following arguments:
The function returns the path to the directory in which the files are downloaded.
Example
downloadPathway(c("R-HSA-195721","R-HSA-9609507"),biopaxLevel=3,overwrite=TRUE)
Get all proteins' entries from UniProt
The function is used to fetch all protein entries from UniProt. The function needs the following arguments:
- up - The UniProt.ws Object
The function returns a dataframe containing the protein entries with their ID and name.
The function should be preceded by UniProt.ws() to get the UniProt.ws Object.
Example
up=UniProt.ws()
allProteins=getAllUPKB(up)
Get the Carriers Protein Targets of given DrugBank drugs
DrugBank divides the proteins targeted by drugs into 4 types: Targets, Enzymes, Carriers and Transporters. This function is used to extract the carriers from the dataframe containing the information on the drugs parsed from the DrugBank XML file.
The function's arguments include this dataframe, obtained using loadDBXML(DrugBankFile).
The function returns a dataframe containing all information on the carriers targeted by the given drug list.
Example
data=loadDBXML(DrugBankFilePath)
getDBCarriers(data,"DB00001")
Get DrugBank drug entry
This function extracts information on one drug or a list of drugs from the dataframe parsed from the DrugBank XML file. The function's arguments include this dataframe, obtained using loadDBXML(DrugBankFile).
This function returns a dataframe containing the DrugBank entry with its information.
Example
data=loadDBXML(DrugBankFilePath)
getDBDrug(data,"DB00001")
Get DrugBank Drug to Drug Interactions
This function is used to extract drug interactions from the dataframe containing the information on the drugs in DrugBank, parsed from the downloaded XML file.
The function's arguments include this dataframe, obtained using loadDBXML(DrugBankFile).
The function returns a dataframe containing the DrugBank interactions in which the given drug is involved.
Example
data=loadDBXML(DrugBankFilePath)
getDBDrugInteractions(data,"DB06605")
Get the Enzyme Protein Targets of given DrugBank drugs
DrugBank divides the proteins targeted by drugs into 4 types: Targets, Enzymes, Carriers and Transporters. This function is used to extract the enzymes from the dataframe containing the information on the drugs parsed from the DrugBank XML file.
The function's arguments include this dataframe, obtained using loadDBXML(DrugBankFile).
The function returns a dataframe containing all information on the enzymes targeted by the given drug list.
Example
data=loadDBXML(DrugBankFilePath)
getDBEnzymes(data,"DB00001")
Get the Target Protein Targets of given DrugBank drugs
DrugBank divides the proteins targeted by drugs into 4 types: Targets, Enzymes, Carriers and Transporters. This function is used to extract the targets from the dataframe containing the information on the drugs parsed from the DrugBank XML file.
The function's arguments include this dataframe, obtained using loadDBXML(DrugBankFile).
The function returns a dataframe containing all information on the targets of the given drug list.
Example
data=loadDBXML(DrugBankFilePath)
getDBTargets(data,"DB00001")
Get the Transporters Protein Targets of given DrugBank drugs
DrugBank divides the proteins targeted by drugs into 4 types: Targets, Enzymes, Carriers and Transporters. This function is used to extract the transporters from the dataframe containing the information on the drugs parsed from the DrugBank XML file.
The function's arguments include this dataframe, obtained using loadDBXML(DrugBankFile).
The function returns a dataframe containing all information on the transporters targeted by the given drug list.
Example
data=loadDBXML(DrugBankFilePath)
getDBTransporters(data,"DB00001")
Get DrugBank Drugs to UniProt Proteins Relations from DrugBank
This function is used to extract drug targets from the dataframe containing the information on the drugs parsed from the DrugBank XML file. It merges the results of the 4 functions that return enzymes, targets, transporters and carriers. The function's arguments include this dataframe, obtained using loadDBXML(DrugBankFile).
The function returns a dataframe containing the connections between DrugBank drugs and UniProt proteins retrieved from DrugBank.
Example
data=loadDBXML(DrugBankFilePath)
getDBtoUPKB(data,c("DB00001","DB00002","DB00006"),c("P02747","P00734","P07204","P05164"))
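Merging four per-type tables and then restricting the result to given drugs and proteins can be sketched in base R as follows. The toy tables, placeholder IDs and column names are assumptions made for illustration; the real dataframes returned by the package may be structured differently.

```r
# Toy tables standing in for the four DrugBank relation types
targets      <- data.frame(drug = c("D1", "D2"), protein = c("PA", "PB"), stringsAsFactors = FALSE)
enzymes      <- data.frame(drug = "D1", protein = "PC", stringsAsFactors = FALSE)
carriers     <- data.frame(drug = "D3", protein = "PA", stringsAsFactors = FALSE)
transporters <- data.frame(drug = "D2", protein = "PD", stringsAsFactors = FALSE)

parts <- list(target = targets, enzyme = enzymes,
              carrier = carriers, transporter = transporters)

# Tag every table with its relation type, then stack them into one dataframe
tagged <- Map(function(df, type) { df$type <- type; df }, parts, names(parts))
relations <- do.call(rbind, tagged)
rownames(relations) <- NULL

# Keep only the relations between the requested drugs and proteins
drug_list <- c("D1", "D2")
protein_list <- c("PA", "PC", "PD")
selected <- relations[relations$drug %in% drug_list &
                      relations$protein %in% protein_list, ]
print(selected)
```

Keeping the relation type as a column preserves which of the four categories each row came from.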
Get A Gene using KEGGREST package
The function is used to query the KEGG Genes database and retrieve all available information on the given genes. A list of genes should be provided. Since the KEGG Genes API only allows 10 entries per query, this function splits the input into lists of 10 elements. The function needs the following arguments:
Example
geneList=c("hsa:122706","hsa:4221","hsa:8312")
genes = getKeggGene(geneList)
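Splitting a list of IDs into batches of at most 10 can be sketched in base R as below. The gene IDs are generated placeholders and the variable names are illustrative; this shows the batching idea only, not the package's own code.

```r
# Placeholder gene IDs; real input would be KEGG gene identifiers
gene_ids <- paste0("hsa:", 1:23)
batch_size <- 10

# Assign each position to a batch number, then split by that number
batch_index <- ceiling(seq_along(gene_ids) / batch_size)
batches <- split(gene_ids, batch_index)

# Number of IDs in each batch
print(lengths(batches))
```

Each batch can then be sent as a separate query and the results combined afterwards.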
Get KEGG Genes to a cross-reference from KEGG
The function is used to query gene entries from KEGG Genes that have a cross-reference to another database, which can be "UniProtKB", "OMIM" or "Ensemble". The function needs the following arguments:
Example
getKEGGtoDATABASE("UniProt",c("hsa:122706","hsa:4221","hsa:8312"))
Get KEGG Genes that have OMIM as a cross-reference from KEGG Genes
The function is used to query gene entries from KEGG Genes that have a cross-reference to "OMIM". The function needs the following arguments:
Example
getKEGGtoOMIM(c("hsa:122706","hsa:4221","hsa:8312"))
Get genes and proteins that are referenced to each other
The function relies on two other functions, getRelatedGenes(g,biopax) and getKEGGtoDATABASE(dbName,geneList), to find the genes and proteins that are related to each other and return them as a dataframe. The function needs the following arguments:
Example
up = UniProt.ws()
biopax=readBiopax("wnt.owl")
pathwayID=listPathways(biopax)$id[1]
g=Multipath::pathway2Mully(biopax,pathwayID)
g=addGenesLayer(g,biopax)
getUPKBtoKEGG(g, biopax)
Get the genes and diseases that are referenced to each other
This function finds the relations between the genes extracted from the mully graph and their respective "OMIM" cross-references. The function depends on two functions: getKEGGtoOMIM(geneList) and getOmimToKEGG(omimIds). It returns a data frame with all information about the genes and their "OMIM" cross-references. Please note that romim::set_key('KEY') must be called before this function. The KEY can be requested via OMIM's official website. The function needs the following arguments:
Example
up = UniProt.ws()
biopax=readBiopax("wnt.owl")
pathwayID=listPathways(biopax)$id[1]
g=Multipath::pathway2Mully(biopax,pathwayID)
g=addGenesLayer(g,biopax)
getKeggOmimRelation(g,biopax)
Get OMIM to UniProtKB relations from OMIM
This function is used to query OMIM entries that have a cross-reference to "UniProtKB" proteins. Please note that romim::set_key('KEY') must be called before this function. The KEY can be requested via OMIM's official website. The function needs the following arguments:
Example
getOmimToUPKB(c("611137", "613733", "603816"))
Get OMIM to KEGG Genes relations from OMIM
This function is used to query OMIM entries that have a cross-reference to "KEGG Genes". Please note that romim::set_key('KEY') must be called before this function. The KEY can be requested via OMIM's official website. The function needs the following arguments:
Example
getOmimToKEGG(c("611137", "613733", "603816"))
Get proteins that have a reference to KEGG Genes from a mully graph and a biopax object
The function first retrieves all the proteins available in the protein layer of a mully graph. The proteins are then queried using the function getUPKBInfo() to find the genes that are cross-referenced to the proteins. The function returns a data frame containing the UniProt ID, the KEGG ID, the graph's internal ID and the source. The function needs the following arguments:
Example
up = UniProt.ws()
biopax=readBiopax("wnt.owl")
pathwayID=listPathways(biopax)$id[1]
g=Multipath::pathway2Mully(biopax,pathwayID)
g=addGenesLayer(g,biopax)
getRelatedGenes(g, biopax)
Get internal pathway ID in a BioPAX file
This function is used to get the internal ID of a pathway in a parsed BioPAX object.
A BioPAX file can contain multiple pathways, indexed internally using IDs that start with "Pathway" followed by the number of the pathway.
Each pathway in the file has a Reactome ID and an internal ID.
The latter can be extracted using this function.
This should be preceded by readBiopax(filepath) to obtain the biopax object. The function needs the following arguments:
The function returns the internal ID of the pathway in the parsed BioPAX object.
Example
biopax=readBiopax("pi3k.owl")
id=getPathwayID(biopax,"R-HSA-167057")
pi3kmully=pathway2Mully(biopax,id)
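The relation between the two kinds of ID can be pictured as a simple lookup table. The sketch below is a base R illustration with placeholder IDs; the table layout and the helper lookup_internal_id are hypothetical and do not reflect how the parsed BioPAX object is stored.

```r
# Toy index of the pathways in one parsed file; all IDs are placeholders
pathways <- data.frame(
  id = c("Pathway1", "Pathway2", "Pathway3"),
  reactome_id = c("R-HSA-0000001", "R-HSA-0000002", "R-HSA-0000003"),
  stringsAsFactors = FALSE
)

# Return the internal ID for a Reactome ID, or NA when the file lacks it
lookup_internal_id <- function(pathways, reactome_id) {
  hit <- pathways$id[pathways$reactome_id == reactome_id]
  if (length(hit) == 0) NA_character_ else hit[1]
}

found <- lookup_internal_id(pathways, "R-HSA-0000002")
absent <- lookup_internal_id(pathways, "R-HSA-9999999")
print(found)
```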
Get Protein and Drugs relations from UniProt and DrugBank
The function is used to obtain drug targets from UniProt and DrugBank.
It combines the relations returned by the two functions getDBtoUPKB and getUPKBtoDB.
The function's arguments include:
- data - The dataframe of parsed DrugBank information, obtained using loadDBXML(DrugBankFile)
- proteinList - The list of UniProt Ids of the proteins
The function returns a dataframe containing the connections between DrugBank drugs and UniProt proteins retrieved from DrugBank and UniProt. The function should be preceded by:
- UniProt.ws() to get the UniProt.ws Object
- loadDBXML(DrugBankFile) to get the argument data
Example
up=UniProt.ws()
data=loadDBXML(DrugBankFilePath)
relations=getUPKBDBRelations(up,data,proteinList=c("P02747","P07204"),drugList=c("DB00001","DB00006"))
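Combining relations from two sources usually means stacking them and dropping the pairs that both sources report. A minimal base R sketch with placeholder IDs and assumed column names follows; it illustrates the idea and is not the package's implementation.

```r
# Toy relations as each source might report them
from_drugbank <- data.frame(drug = c("D1", "D2"), protein = c("PA", "PB"),
                            stringsAsFactors = FALSE)
from_uniprot  <- data.frame(drug = c("D1", "D3"), protein = c("PA", "PC"),
                            stringsAsFactors = FALSE)

# Remember where each row came from before stacking
from_drugbank$source <- "DrugBank"
from_uniprot$source  <- "UniProt"
combined <- rbind(from_drugbank, from_uniprot)

# Keep the first occurrence of every drug-protein pair
is_repeat <- duplicated(combined[, c("drug", "protein")])
relations <- combined[!is_repeat, ]
print(relations)
```

In this sketch a pair reported by both sources keeps the label of the source that was stacked first.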
Get Proteins from UniProtKB
The function is used to fetch information on a list of protein entries from UniProt. The function needs the following arguments:
- up - The UniProt.ws Object
- proteins - The list of UniProtKB protein IDs to be retrieved
- col - The list of attributes associated to the UniProtKB Entries to be retrieved
The function returns a dataframe containing the protein entries with the selected attributes.
To get the list of possible columns, you can call columns(UniProt.ws()).
The function should be preceded by UniProt.ws() to get the UniProt.ws Object.
Example
up <- UniProt.ws()
getUPKBInfo(up,c("Q6ZS62","P14384","P40259"),c("PROTEIN-NAMES","DRUGBANK","GO","REACTOME"))
Get the interactions of given proteins from UniProt
The function is used to fetch interactions between proteins from the UniProt Database. The function needs the following arguments:
The function returns a dataframe containing the interactions between the given proteins.
The function should be preceded by UniProt.ws() to get the UniProt.ws Object.
Example
up=UniProt.ws()
interactions=getUPKBInteractions(up,c("P02747","P07204","P00734"))
Get Proteins from UniProtKB that have KEGG as a cross reference
The function is used to fetch information on a list of protein entries from UniProt. It returns a dataframe showing each protein and its respective KEGG ID if one could be located as a cross-reference. The function needs the following arguments:
The function should be preceded by UniProt.ws() to get the UniProt.ws Object.
Example
up = UniProt.ws()
proteinList = c("P02747","P00734","P07204","A0A0S2Z4R0","O15169")
geneList=c("hsa:122706","hsa:4221","hsa:8312")
getUPKBtoKEGG(up,geneList,proteinList)
Get proteins that have a reference to OMIM from a mully graph and a biopax object
The function first retrieves all the proteins available in the protein layer of a mully graph. The proteins are then queried using the function getUPKBInfo() to find the OMIM IDs that are cross-referenced to the proteins. The function returns a data frame containing the UniProt ID, the OMIM ID, the protein's internal ID and the source. Please note that romim::set_key('KEY') must be called before this function. The KEY can be requested via OMIM's official website. The function needs the following arguments:
Example
biopax=readBiopax("wnt.owl")
pathwayID=listPathways(biopax)$id[1]
g=Multipath::pathway2Mully(biopax,pathwayID)
g=getUPKBRelatedDiseases(g,biopax)
Get UniProt Proteins to DrugBank Drugs relations from UniProt
This function is used to fetch relations between a list of proteins and a list of drugs from the UniProt Database. The function needs the following arguments:
The function returns a dataframe containing the connections between UniProt proteins and DrugBank drugs retrieved from UniProt.
The function should be preceded by UniProt.ws() to get the UniProt.ws Object.
Example
up=UniProt.ws()
getUPKBtoDB(up,c("P02747","P00734","P07204"),c("DB00001","DB00002"))
Interaction between KEGG genes and other databases
The function gets the interactions between a KEGG gene input and all the databases. It returns NA when there is no cross-reference to other databases. The function should be preceded by transformKeggData(list_keggGet, list_Get). The function needs the following arguments:
Example
geneList=c("hsa:122706","hsa:4221","hsa:8312")
genes = getKeggGene(geneList)
genesinteraction = interactionKegg(genes)
Load DrugBank XML file
This function is used to read and parse the file downloaded from the DrugBank Database containing the complete information on the drug entries. The function needs the following argument:
This function returns a dataframe containing the parsed information from DrugBank. This can be used to extract any additional information on the DrugBank entries.
This function should be called before using any function to query the DrugBank database. Since the parsing of DrugBank takes time, this function should only be called once.
Example
data=loadDBXML(DrugBankFilePath)
Generate Multipath Graph from General Data
This function is used to generate a mully graph from a list of drugs and proteins. The function creates a multilayered graph with a drug layer and a protein layer, and adds the interactions between and within these layers to it. Its arguments include the UniProt.ws Object, obtained using UniProt.ws(), and the dataframe of parsed DrugBank information, obtained using loadDBXML(DrugBankFile).
The function returns a mully graph with the added data. The function should be preceded by:
- UniProt.ws() to get the UniProt.ws Object
- loadDBXML(DrugBankFile) to get the argument data
Example
up=UniProt.ws()
data=loadDBXML(DrugBankFilePath)
g=multipath(up=up,proteinList=c("P02747","P05164"),data=data,drugList=c("DB00001","DB00006"))
Build a mully graph from a given pathway
This function builds a multilayered mully graph of a BioPAX-encoded pathway.
To run this function, the user first needs to parse the file by calling readBiopax(filepath) to obtain the biopax object.
The function's arguments include the biopax object, obtained using readBiopax(filepath), and the internal ID of the pathway. To find this ID, getPathwayID(biopax,reactomeID) can be called.
The function returns a mully graph built from the given pathway.
Example
biopax=readBiopax("pi3k.owl")
pi3kmully=pathway2Mully(biopax,"pathway1")
Create an empty view
The function is used to create a pathwayView in order to track the modifications applied to a mully graph. The pathwayView Object contains information on the view, including the timestamps of creation and last modification, the original and final versions of the graph, and the dataframe containing the modification steps. The function needs the following arguments:
The function returns an empty pathwayView Object.
Example
view=pathwayView(mully("myMully",TRUE),"View1")
Print Function
The function is used to print the pathwayView Object. The function needs the following arguments:
Simplify the dataframe's structure
The function simplifies the interactions between a gene input and other databases and shows the results in 4 columns:
- c1 - The entry number in KEGG
- c2 - The name in the other database
- c3 - The other database's name
- c4 - The organism name as attribute
The function needs the following arguments:
Example
geneList=c("hsa:122706","hsa:4221","hsa:8312")
genes = getKeggGene(geneList)
genesinteraction = interactionKegg(genes)
genesinteractionsimplified = simplifyInteractionKegg(genesinteraction)
Transform data retrieved from KEGG Genes
The function translates the output of getKeggGene() into a dataframe. The function needs the following arguments:
Example
geneList=c("hsa:122706","hsa:4221","hsa:8312")
genes = getKeggGene(geneList)
Undo a modification step in a view
The function reverses changes applied to a mully graph, saved in a pathwayView Object. The function needs the following arguments:
The function returns the view with the undone modifications.
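One common way to support undo is to keep the original state together with an ordered log of steps, drop the last step, and replay the rest. The base R sketch below uses that replay approach on a plain vector of layer names; it is an assumption about how undo could work, with hypothetical helper names, and it models only the removal of layers.

```r
# Hypothetical view: the original layers plus an ordered log of steps
view <- list(original = c("protein", "gene", "disease"), steps = list())

# Apply one "remove layer" step to a vector of layer names
remove_layer <- function(layers, step) setdiff(layers, step$name)

add_step <- function(view, name) {
  view$steps <- c(view$steps, list(list(action = "remove", name = name)))
  view
}

# Undo by dropping the most recent step from the log
undo_step <- function(view) {
  view$steps <- head(view$steps, -1)
  view
}

# Replay the remaining steps on the original state
current_layers <- function(view) Reduce(remove_layer, view$steps, view$original)

view <- add_step(view, "disease")
view <- add_step(view, "gene")
after_steps <- current_layers(view)

view <- undo_step(view)
after_undo <- current_layers(view)
print(after_undo)
```

Replaying from the original state avoids having to define an inverse for every kind of step, at the cost of recomputing the result after each undo.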
Demo function for Wnt Pathway Views
The function is a demo that creates a pathway mully graph from a BioPAX-encoded file of the Signaling by Wnt pathway. The function reads and parses the file, creates the mully graph, and generates 3 different views from the graph by deleting the RNA, Complex, and Physical Entity layers. The function needs the following arguments:
Example
wntpathway("wnt_reactome.owl")