parseKGML2DataFrame: Parse KGML file into a data frame

parseKGML2DataFrame    R Documentation

Parse KGML file into a data frame

Description

This function extends parseKGML2Graph by converting the resulting graph into a four-column data frame: out-nodes (the from column), in-nodes (to), and the types and subtypes of the edges that connect them (type and subtype, respectively). It can be used, for example, to export KEGG pathway networks to plain text files.

Usage

parseKGML2DataFrame(file, reactions=FALSE, ...)

Arguments

file

A KGML file

reactions

Logical; whether metabolic reactions should be parsed and returned as part of the data frame. Default: FALSE

...

Other parameters passed to KEGGpathway2Graph

Details

The out- and in-nodes are represented as KEGG identifiers. To obtain Entrez Gene IDs (for human pathways, for example), use the function translateKEGGID2GeneID.
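As a rough illustration of the identifier translation, the sketch below parses the human MAPK pathway file shipped with the package and adds Entrez Gene ID columns. The column names fromGeneID and toGeneID are hypothetical names chosen for this example, and the as.character() calls are a precaution in case the columns are stored as factors.

```r
library(KEGGgraph)

# Example KGML file shipped with the package (also used in the Examples section)
sfile <- system.file("extdata/hsa04010.xml", package="KEGGgraph")
gdf <- parseKGML2DataFrame(sfile)

# Translate KEGG identifiers (e.g. "hsa:1432") into Entrez Gene IDs.
# fromGeneID/toGeneID are illustrative column names, not part of the return value.
gdf$fromGeneID <- translateKEGGID2GeneID(as.character(gdf$from))
gdf$toGeneID <- translateKEGGID2GeneID(as.character(gdf$to))

head(gdf)
```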

Multiple edges are supported: if more than one subtype of edge exists between two nodes, all of them are listed in the resulting data frame.
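One way to spot node pairs connected by several edges is to count rows per (from, to) pair. The sketch below uses a small hand-made data frame with the same four columns as the return value; its rows are invented for illustration and do not come from a real pathway, and the names pairCount, nEdges and multi are hypothetical.

```r
# Toy stand-in for the output of parseKGML2DataFrame (made-up rows)
gdf <- data.frame(
  from = c("hsa:1", "hsa:1", "hsa:2"),
  to = c("hsa:3", "hsa:3", "hsa:3"),
  type = c("PPrel", "PPrel", "PPrel"),
  subtype = c("activation", "phosphorylation", "inhibition"),
  stringsAsFactors = FALSE
)

# Count the number of rows (edges) for every from/to pair
pairCount <- aggregate(subtype ~ from + to, data = gdf, FUN = length)
names(pairCount)[3] <- "nEdges"

# Keep only the pairs that are linked by more than one edge
multi <- pairCount[pairCount$nEdges > 1, ]
multi
```

The same two steps should apply unchanged to a data frame returned by parseKGML2DataFrame, since only the four documented columns are used.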

Value

A four-column data frame, representing the graph structure: out-nodes (the from column), in-nodes (to), edge type (type) and subtype (subtype).
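Because the return value is an ordinary data frame, exporting it to a plain text file can be sketched with base R alone. The example below writes a tab-delimited file and reads it back; the toy data frame stands in for a real result (its rows are invented), and the temporary output path is an assumption of this sketch.

```r
# Toy stand-in for the output of parseKGML2DataFrame (made-up rows)
gdf <- data.frame(
  from = c("hsa:1", "hsa:2"),
  to = c("hsa:3", "hsa:3"),
  type = c("PPrel", "PPrel"),
  subtype = c("activation", "inhibition"),
  stringsAsFactors = FALSE
)

# Write a tab-delimited text file without quotes or row names
outFile <- tempfile(fileext = ".txt")
write.table(gdf, file = outFile, sep = "\t", quote = FALSE, row.names = FALSE)

# Read the file back to check that the four columns survive the round trip
back <- read.delim(outFile, stringsAsFactors = FALSE)
names(back)
```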

Author(s)

Jitao David Zhang

See Also

parseKGML2Graph, KEGGpathway2Graph and translateKEGGID2GeneID.

Examples

sfile <- system.file("extdata/hsa04010.xml",package="KEGGgraph")
gdf <- parseKGML2DataFrame(sfile)
head(gdf)
dim(gdf)

rfile <- system.file("extdata/hsa00020.xml",package="KEGGgraph")
dim(dfWr <- parseKGML2DataFrame(rfile, reactions=TRUE))
dim(dfWOr <- parseKGML2DataFrame(rfile, reactions=FALSE))
stopifnot(nrow(dfWr)>nrow(dfWOr))

## Not expanding genes: only the KGML-specific identifiers are used.
## For expert use only.
## Not run:
gdf.ne <- parseKGML2DataFrame(sfile, expandGenes=FALSE)
dim(gdf.ne)
head(gdf.ne)
## End(Not run)

Accio/KEGGgraph documentation built on Jan. 13, 2023, 1:03 p.m.