Definition of gene sets according to KEGG pathways for a specified organism

Description

To perform a gene set enrichment analysis on KEGG pathways, the gene set database must first be built in a format that the GSEA method can read. get.kegg.genesets performs the necessary steps, including the retrieval of the participating gene IDs for each pathway and the conversion to GMT format. parse.genesets.from.GMT parses a list of gene sets from a flat text file in GMT format.

Usage

    get.kegg.genesets( pwys, gmt.file = NULL )

    parse.genesets.from.GMT( gmt.file )

Arguments

pwys

Either a list of KEGGPathway objects or an absolute file path of a zip-compressed archive of pathway XML files in KGML format. Alternatively, an organism in KEGG three-letter code, e.g. ‘hsa’ for ‘Homo sapiens’.

gmt.file

Gene set file in GMT format. See details.

Details

The GMT (Gene Matrix Transposed) file format is a tab-delimited file format that describes gene sets. Each row represents one gene set and consists of a name, a description, and the genes in the gene set, in that order. See references.
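
The row layout described above can be illustrated with a minimal base R sketch. The helper read_gmt_sketch is hypothetical and is not part of this package, nor is it claimed to be how parse.genesets.from.GMT is implemented. The set names, descriptions, and gene IDs are made up for illustration.

```r
# Hypothetical helper (an assumption for illustration, not a package function):
# reads a GMT file into a named list of character vectors using base R only.
read_gmt_sketch <- function(path) {
    lines <- readLines(path)
    lines <- lines[nzchar(lines)]                 # skip empty lines
    fields <- strsplit(lines, "\t", fixed = TRUE) # GMT rows are tab-delimited
    # Field 1 is the set name, field 2 the description, the rest are gene IDs.
    gs <- lapply(fields, function(f) f[-(1:2)])
    names(gs) <- vapply(fields, function(f) f[1], character(1))
    gs
}

# Write a two-row toy GMT file (made-up names and IDs) and read it back.
tmp <- tempfile(fileext = ".gmt")
writeLines(c(
    paste("set_A", "toy description A", "1", "2", "3", sep = "\t"),
    paste("set_B", "toy description B", "2", "4", sep = "\t")
), tmp)
gs <- read_gmt_sketch(tmp)
```

In this sketch the description field is discarded, so the result has the same shape as the value documented below: a list with one vector of gene IDs per gene set.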

Value

A list of gene sets (vectors of gene IDs).
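
As a rough illustration of working with a return value of this shape, the following sketch builds a toy list by hand and applies a few common base R operations. The set names and gene IDs are made up, and the size cutoff of 3 is an arbitrary assumption.

```r
# Toy gene set list (made-up names and IDs) mimicking the documented shape:
# a list with one character vector of gene IDs per gene set.
gs <- list(
    set_A = c("1", "2", "3"),
    set_B = c("2", "4")
)

sizes <- lengths(gs)                                # number of genes per set
gs.filtered <- gs[sizes >= 3]                       # keep sets with at least 3 genes
all.genes <- unique(unlist(gs, use.names = FALSE))  # all distinct gene IDs
```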

Author(s)

Ludwig Geistlinger <Ludwig.Geistlinger@bio.ifi.lmu.de>

References

GMT file format http://www.broadinstitute.org/cancer/software/gsea/wiki/index.php/Data_formats

KEGG Organism code http://www.genome.jp/kegg/catalog/org_list.html

See Also

keggList, keggLink, KEGGPathway-class, parseKGML

Examples

    # WAYS TO DEFINE GENE SETS ACCORDING TO HUMAN KEGG PATHWAYS

    # (1) from scratch: via organism ID 
    
    gs <- get.kegg.genesets("hsa")
    

    # (2) extract from pathways
    # download human pathways via: 
    # pwys <- download.kegg.pathways("hsa")
    pwys <- system.file("extdata/hsa_kegg_pwys.zip", package="EnrichmentBrowser")
    gs <- get.kegg.genesets(pwys)

    # (3) parsing gene sets from GMT
    gmt.file <- system.file("extdata/hsa_kegg_gs.gmt", package="EnrichmentBrowser")
    gs <- parse.genesets.from.GMT(gmt.file)