pathEnrich: pathEnrich

View source: R/pathEnrich.R

pathEnrich R Documentation

pathEnrich

Description

This function takes the list generated in get_kegg as well as a vector of NCBI (ENTREZ) geneIDs, and identifies significantly enriched KEGG pathways using a Fisher's Exact Test. Unadjusted p-values as well as FDR corrected p-values are calculated.
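As a rough illustration of the test described above, the following base R sketch runs a Fisher's Exact Test for a single pathway. All counts are made up, the variable names are hypothetical, and the one-sided alternative is an assumption; this is not necessarily how pathEnrich builds its contingency table internally.

```r
# Toy counts for one pathway (all numbers are made up for illustration).
db_cnt       <- 8000  # genes in the KEGG database (background)
db_in_list   <- 200   # genes from the gene list found in the database
path_cnt     <- 100   # genes in the pathway
path_in_list <- 10    # genes from the gene list found in the pathway

# 2 x 2 contingency table, filled row by row:
# rows = gene in list / not in list, columns = gene in pathway / not in pathway.
tab <- matrix(
  c(path_in_list,            db_in_list - path_in_list,
    path_cnt - path_in_list, db_cnt - db_in_list - path_cnt + path_in_list),
  nrow = 2, byrow = TRUE,
  dimnames = list(gene_list = c("in", "out"), pathway = c("in", "out"))
)

# One-sided test for over-representation (assumed alternative).
enrich_p <- fisher.test(tab, alternative = "greater")$p.value
enrich_p
```

With a one-sided alternative, the p-value is the upper tail of the hypergeometric distribution, so it can be cross-checked against phyper.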

Usage

pathEnrich(gk_obj, gene_list, method = "BH", cutoff = 0.05, N = 2)

## S3 method for class 'pathEnrich'
print(x, ...)

## S3 method for class 'pathEnrich'
summary(object, ...)

Arguments

gk_obj

list. Object generated by get_kegg, or a list containing the output of a past get_kegg call. The names of the list must match those defined in get_kegg. To use an older version of the data generated by get_kegg, first load that data and put it in a named list whose names match those of the list returned by get_kegg.

gene_list

Vector. Vector of NCBI (ENTREZ) geneIDs.

method

Character. Character string telling diffEnrich which method to use for multiple testing correction. Available methods are those provided by p.adjust. The default is "BH" (Benjamini and Hochberg), which controls the False Discovery Rate (FDR).

cutoff

Numeric. The p-value threshold used as the cutoff when determining statistical significance, and used to filter the list of significant pathways.

N

Numeric. The number of genes from the gene list that must be present in a KEGG pathway in order for that pathway to be retained and tested.

x

object of class pathEnrich

...

Unused

object

object of class pathEnrich
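The effect of the N and method arguments can be sketched with base R alone. The table below is made up, its column names are hypothetical, and reading N as a "greater than or equal to" threshold is an assumption based on the argument description; only p.adjust and p.adjust.methods are taken from R itself.

```r
# Made-up per-pathway counts and unadjusted p-values.
paths <- data.frame(
  id      = c("A", "B", "C", "D", "E", "F"),
  in_list = c(5, 3, 2, 4, 2, 1),
  p_raw   = c(0.001, 0.012, 0.030, 0.200, 0.650, 0.900)
)

# Keep only pathways containing at least N genes from the gene list
# (assumed interpretation of N); pathway F is dropped before testing.
N <- 2
tested <- paths[paths$in_list >= N, ]

# Adjust the remaining p-values. "BH" is the documented default;
# "none" leaves the p-values unchanged.
tested$p_bh   <- p.adjust(tested$p_raw, method = "BH")
tested$p_none <- p.adjust(tested$p_raw, method = "none")

# All method names accepted by p.adjust.
p.adjust.methods
```

Because the adjustment depends on the number of pathways tested, a larger N (fewer tested pathways) can change the adjusted p-values even when the unadjusted ones stay the same.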

Details

This function may not always use the complete list of genes provided by the user. Specifically, it only uses the genes from the provided list that are also in the most current species list pulled from the KEGG REST API, or in the older KEGG data loaded by the user.

The 'cutoff' only filters the list of pathways provided in the 'sig_paths' list item. It is not used to filter the 'enrich_table' list item.

S3 methods for print and summary are provided. The print method prints the results table as a tibble. The summary method returns the number of pathways that reached statistical significance and their descriptions, the number of genes used from the KEGG database, the KEGG species, the method used for multiple testing correction, and the p-value cutoff required for reaching statistical significance.
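The two filtering behaviours described in these details can be sketched as follows. The gene IDs, pathway IDs and p-values are made up, the comparison against the cutoff (strictly less than) is an assumption, and the real 'sig_paths' item may be structured differently.

```r
# Hypothetical inputs: a user gene list and the genes known to KEGG.
gene_list  <- c("1001", "1002", "1003", "9999")
kegg_genes <- c("1001", "1002", "1003", "1004", "1005")

# Only genes that are also present in the KEGG data take part in the test;
# "9999" is silently left out.
genes_used <- intersect(gene_list, kegg_genes)

# Toy results table with made-up adjusted p-values.
enrich_table <- data.frame(
  KEGG_PATHWAY_ID = c("path:A", "path:B", "path:C"),
  p_adj           = c(0.01, 0.04, 0.30)
)

# The cutoff filters only the significant pathways;
# the full table keeps every tested pathway.
cutoff    <- 0.05
sig_paths <- enrich_table[enrich_table$p_adj < cutoff, ]

nrow(sig_paths)     # pathways passing the cutoff
nrow(enrich_table)  # all tested pathways
```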

Value

A list object of class pathEnrich that contains 6 items:

species

The species used in enrichment

padj

The method used to correct for multiple testing

sig_paths

The KEGG pathways that reached statistical significance after multiple testing correction.

cutoff

The p-value threshold used as the cutoff when determining statistical significance, and used to filter the pathways reported in sig_paths.

N

The number of genes from the gene list that must be present in a KEGG pathway in order for that pathway to be retained and tested.

enrich_table

A data frame that summarizes the results of the pathway analysis and contains the following variables:

KEGG_PATHWAY_ID

KEGG Pathway Identifier

KEGG_PATHWAY_description

Description of KEGG Pathway (provided by KEGG)

KEGG_PATHWAY_cnt

Number of Genes in KEGG Pathway

KEGG_PATHWAY_in_list

Number of Genes from gene list in KEGG Pathway

KEGG_DATABASE_cnt

Number of Genes in KEGG Database

KEGG_DATABASE_in_list

Number of Genes from gene list in KEGG Database

expected

Expected number of genes from list to be in KEGG pathway by chance (i.e., not enriched)

enrich_p

P-value for enrichment of list genes related to KEGG pathway

p_adj

P-value adjusted for multiple testing across KEGG pathways using the chosen method (by default the Benjamini and Hochberg False Discovery Rate)

fold_enrichment

KEGG_PATHWAY_in_list/expected
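The relationship between the count columns, expected and fold_enrichment can be illustrated with a toy table. The counts are made up, and the formula used for expected (pathway size times the fraction of database genes that are in the list) is the usual definition under no enrichment, assumed here rather than taken from the package source. The fold_enrichment formula is the one given above.

```r
# Toy table using the column names documented above (values are made up).
res <- data.frame(
  KEGG_PATHWAY_cnt      = c(100, 50),
  KEGG_PATHWAY_in_list  = c(10, 1),
  KEGG_DATABASE_cnt     = c(8000, 8000),
  KEGG_DATABASE_in_list = c(200, 200)
)

# Expected number of list genes in each pathway by chance
# (assumed formula: pathway size * proportion of database genes in the list).
res$expected <- res$KEGG_PATHWAY_cnt * res$KEGG_DATABASE_in_list /
  res$KEGG_DATABASE_cnt

# Fold enrichment as documented: observed / expected.
res$fold_enrichment <- res$KEGG_PATHWAY_in_list / res$expected

res[, c("expected", "fold_enrichment")]
```

A fold_enrichment above 1 means the pathway holds more genes from the list than expected by chance; a value below 1 means it holds fewer.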

Examples


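## Assumes `kegg` (output of get_kegg) and `geneLists` (a named list of
## ENTREZ gene ID vectors) are already available in the session.
## Run the enrichment with the defaults (method = "BH", cutoff = 0.05, N = 2).
list1_pe <- pathEnrich(gk_obj = kegg, gene_list = geneLists$list1)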
## Not run: 
list2_pe <- pathEnrich(gk_obj = kegg, gene_list = geneLists$list2, method = 'none', N = 4)

## End(Not run)


diffEnrich documentation built on June 28, 2022, 1:08 a.m.