To perform a gene set enrichment analysis on KEGG pathways, the gene set database must first be built in a format that the GSEA method can read. get.kegg.genesets performs the necessary steps, including the retrieval of the participating gene IDs for each pathway and the conversion to GMT format. parse.genesets.from.GMT parses a list of gene sets from a flat text file in GMT format.
Either a list of
Gene set file in GMT format. See details.
The GMT (Gene Matrix Transposed) file format is a tab-delimited text format that describes gene sets. Each row represents one gene set and consists of a name, a description, and the genes in the gene set, in that order. See references.
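The row layout described above can be illustrated with a minimal base R sketch. It is not the package's own implementation: the helper name read.gmt.sketch, the set names, and the gene IDs are all hypothetical and chosen only for illustration.

```r
# Hypothetical example data: two made-up gene sets in GMT layout
# (name <TAB> description <TAB> gene1 <TAB> gene2 ...).
gmt.lines <- c(
    "SET_A\tillustrative description\tg1\tg2\tg3",
    "SET_B\tillustrative description\tg2\tg4"
)
gmt.file <- tempfile(fileext = ".gmt")
writeLines(gmt.lines, gmt.file)

# Minimal reader (assumed helper, not part of EnrichmentBrowser):
# split each row on tabs, use field 1 as the set name,
# skip field 2 (the description), and keep the rest as gene IDs.
read.gmt.sketch <- function(file) {
    fields <- strsplit(readLines(file), "\t", fixed = TRUE)
    gs <- lapply(fields, function(x) x[-(1:2)])
    names(gs) <- vapply(fields, function(x) x[1], character(1))
    gs
}

gs <- read.gmt.sketch(gmt.file)
```

The result is a named list with one character vector of gene IDs per gene set, which matches the return value described for this help page.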
A list of gene sets (vectors of gene IDs).
Ludwig Geistlinger <[email protected]>
KEGG Organism code http://www.genome.jp/kegg/catalog/org_list.html
# WAYS TO DEFINE GENE SETS ACCORDING TO HUMAN KEGG PATHWAYS

# (1) from scratch: via organism ID
gs <- get.kegg.genesets("hsa")

# (2) extract from pathways
# download human pathways via:
# pwys <- download.kegg.pathways("hsa")
pwys <- system.file("extdata/hsa_kegg_pwys.zip", package="EnrichmentBrowser")
gs <- get.kegg.genesets(pwys)

# (3) parsing gene sets from GMT
gmt.file <- system.file("extdata/hsa_kegg_gs.gmt", package="EnrichmentBrowser")
gs <- parse.genesets.from.GMT(gmt.file)