parseKGML2DataFrame: Parse KGML file into a data frame
In Accio/KEGGgraph: KEGGgraph: A graph approach to KEGG PATHWAY in R and Bioconductor

parseKGML2DataFrame

R Documentation

Parse KGML file into a data frame

Description

This function extends the parseKGML2Graph function by converting the resulting graph into a four-column data frame: out-nodes (the from column), in-nodes (to), and the types and subtypes of the edges that connect them (type and subtype, respectively). It can be used, for example, to export KEGG pathway networks to plain text files.

Usage

parseKGML2DataFrame(file, reactions=FALSE,...)

Arguments

`file`	A KGML file
`reactions`	Logical, whether metabolic reactions should be parsed and returned as part of the data frame. Default: `FALSE`
`...`	Other parameters passed to `KEGGpathway2Graph`

Details

The out- and in-nodes are represented as KEGG identifiers. For human genes, the function translateKEGGID2GeneID can be used to convert them to Entrez Gene IDs.
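The conversion mentioned above might look like the following minimal sketch. The two identifiers are illustrative values, not taken from a parsed file. The fallback branch is an assumption for systems without KEGGgraph installed: it relies on human KEGG gene identifiers being the Entrez Gene ID with an `hsa:` prefix, and it is not the package's own implementation.

```r
# Illustrative human KEGG gene identifiers (assumed values, not parsed from KGML)
kegg_ids <- c("hsa:5594", "hsa:2002")

if (requireNamespace("KEGGgraph", quietly = TRUE)) {
  # Use the package function named in the Details section
  entrez_ids <- KEGGgraph::translateKEGGID2GeneID(kegg_ids)
} else {
  # Assumed fallback for illustration only: strip the "hsa:" organism prefix
  entrez_ids <- sub("^hsa:", "", kegg_ids)
}

print(entrez_ids)
```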

Multiple edges are supported: if more than one subtype of edge exists between two nodes, all of them are listed in the resulting data frame.
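To see what multiple edges look like in such a data frame, the sketch below builds a small toy table by hand and picks out the node pairs that appear in more than one row. The identifiers and edge annotations are made up for illustration, and the column names are assumed to match the ones described on this page.

```r
# Toy edge table with the four columns described on this page;
# the rows are illustrative and were not parsed from a KGML file.
edges <- data.frame(
  from    = c("hsa:5594", "hsa:5594", "hsa:5604"),
  to      = c("hsa:2002", "hsa:2002", "hsa:5594"),
  type    = c("PPrel", "PPrel", "PPrel"),
  subtype = c("activation", "phosphorylation", "activation"),
  stringsAsFactors = FALSE
)

# Build one key per (from, to) pair and count how often each key occurs
pair_key <- paste(edges$from, edges$to, sep = "->")
edge_counts <- table(pair_key)

# Pairs that occur more than once are connected by multiple edges
multi_pairs <- names(edge_counts)[as.vector(edge_counts) > 1]
multi_edges <- edges[pair_key %in% multi_pairs, ]

print(multi_edges)
```

In this toy table the first two rows share the same out- and in-node but differ in subtype, so both are kept in `multi_edges`.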

Value

A four-column data frame, representing the graph structure: out-nodes (the from column), in-nodes (to), edge type (type) and subtype (subtype).
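Because the result is an ordinary data frame, exporting it to a plain text file needs only base R. The sketch below uses a hand-built stand-in with the same column names rather than a parsed pathway; the rows, the tab separator and the temporary file are all assumptions chosen for illustration.

```r
# Stand-in for a parsed result: same column names, illustrative rows
edges <- data.frame(
  from    = c("hsa:5594", "hsa:5604"),
  to      = c("hsa:2002", "hsa:5594"),
  type    = c("PPrel", "PPrel"),
  subtype = c("activation", "activation"),
  stringsAsFactors = FALSE
)

# Write a tab-separated text file without quotes or row names
out_file <- tempfile(fileext = ".txt")
write.table(edges, file = out_file, sep = "\t",
            quote = FALSE, row.names = FALSE)

# Read the file back to check that the table survived the round trip
edges_back <- read.delim(out_file, stringsAsFactors = FALSE)
print(edges_back)
```

With real data, `edges` would be replaced by the data frame returned by parseKGML2DataFrame.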

Author(s)

Jitao David Zhang

Examples

sfile <- system.file("extdata/hsa04010.xml",package="KEGGgraph")
gdf <- parseKGML2DataFrame(sfile)
head(gdf)
dim(gdf)

## parse a metabolic pathway with and without reactions
rfile <- system.file("extdata/hsa00020.xml",package="KEGGgraph")
dfWr <- parseKGML2DataFrame(rfile, reactions=TRUE)
dfWOr <- parseKGML2DataFrame(rfile, reactions=FALSE)
dim(dfWr)
dim(dfWOr)
stopifnot(nrow(dfWr)>nrow(dfWOr))

## without expanding genes, only the KGML-specific identifiers are used
## (for expert use only)
## Not run: 
gdf.ne <- parseKGML2DataFrame(sfile, expandGenes=FALSE)
dim(gdf.ne)
head(gdf.ne)
## End(Not run)

Accio/KEGGgraph documentation built on Jan. 13, 2023, 1:03 p.m.