PomaEnrichment: Enrichment Analysis

Description

PomaEnrichment performs enrichment analysis on a set of query gene symbols using a specified method and gene set collection. It supports over-representation analysis (ORA) and gene set enrichment analysis (GSEA) for several model organisms.

Usage

PomaEnrichment(
  genes,
  method = "ora",
  organism = "Homo sapiens",
  collection = "C5",
  universe = NULL,
  rank = NULL,
  pval_cutoff = 0.05,
  fdr_cutoff = 0.1,
  min_size = 2,
  max_size = if (method == "gsea") {
    length(genes) - 1
  } else {
    NULL
  },
  max_genes = 10
)

Arguments

genes

Character vector. Set of query gene symbols.

method

Character. Enrichment method. Options are: 'ora' (simple over-representation analysis based on the hypergeometric test) and 'gsea' (gene set enrichment analysis on a ranked list of genes).
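To make the hypergeometric test behind 'ora' concrete, here is a minimal base R sketch of the p-value for a single gene set. It is not POMA's internal code, and all counts are made-up illustrative numbers. It only shows the kind of calculation an over-representation test performs.

```r
# Illustrative counts only (not taken from any real gene set collection)
universe_size <- 20000  # genes in the background universe
set_size      <- 100    # genes annotated to the gene set
query_size    <- 50     # genes in the query
overlap       <- 5      # query genes that fall in the gene set

# Probability of seeing at least `overlap` hits by chance.
# phyper() returns P(X <= q), so pass q = overlap - 1 and take the upper tail.
p_value <- phyper(
  q = overlap - 1,
  m = set_size,
  n = universe_size - set_size,
  k = query_size,
  lower.tail = FALSE
)
p_value
```

With these numbers only about 0.25 overlapping genes would be expected by chance, so an overlap of 5 yields a small p-value.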

organism

Character. Indicates the model organism name. Default is 'Homo sapiens'. Other options are: 'Anolis carolinensis', 'Bos taurus', 'Caenorhabditis elegans', 'Canis lupus familiaris', 'Danio rerio', 'Drosophila melanogaster', 'Equus caballus', 'Felis catus', 'Gallus gallus', 'Macaca mulatta', 'Monodelphis domestica', 'Mus musculus', 'Ornithorhynchus anatinus', 'Pan troglodytes', 'Rattus norvegicus', 'Saccharomyces cerevisiae', 'Schizosaccharomyces pombe 972h-', 'Sus scrofa', 'Xenopus tropicalis'. See msigdbr::msigdbr_show_species().

collection

Character. Indicates the gene set collection. Default is 'C5' (Gene Ontology gene sets). Other options are: 'C1' (positional gene sets), 'C2' (curated gene sets), 'C3' (regulatory target gene sets), 'C4' (computational gene sets), 'C6' (oncogenic signature gene sets), 'C7' (immunologic signature gene sets), 'C8' (cell type signature gene sets), 'H' (Hallmark gene sets). See msigdbr::msigdbr_collections().

universe

Character vector. Background set of gene symbols (the universe) from which 'genes' were selected.

rank

Numeric vector. Ranking factor used to sort genes for GSEA (e.g., logFC or -log10(p-value)).
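As a rough illustration of how a ranking factor orders genes, the base R sketch below pairs each gene with its score and sorts from highest to lowest. It assumes one score per gene, supplied in the same order as 'genes', as in the Examples section. The name rank_values is hypothetical, and this is not necessarily how PomaEnrichment handles the ranking internally.

```r
genes <- c("Actb", "Gapdh", "Cdkn1a", "Cd44", "Pten")
rank_values <- c(2.5, -1.8, 3.1, -2.2, 1.7)  # e.g., logFC values

# Attach gene symbols as names, then sort by score in decreasing order
ranked <- setNames(rank_values, genes)
ranked <- sort(ranked, decreasing = TRUE)

# Gene symbols in ranked order
names(ranked)
```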

pval_cutoff

Numeric. Raw p-value cutoff on enrichment tests to report.

fdr_cutoff

Numeric. Adjusted p-value cutoff on enrichment tests to report.
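The two cutoffs can be read as successive filters on the result table. The sketch below is only an assumption-laden illustration: the column names (pval, adj_pval), the Benjamini-Hochberg adjustment, and the strict inequalities are all hypothetical choices, since this page does not state which adjustment method or comparison PomaEnrichment uses.

```r
# Hypothetical raw p-values for five gene sets
results <- data.frame(
  gene_set = c("set_a", "set_b", "set_c", "set_d", "set_e"),
  pval     = c(0.001, 0.01, 0.03, 0.04, 0.20)
)

# Adjust for multiple testing (Benjamini-Hochberg is an assumption here)
results$adj_pval <- p.adjust(results$pval, method = "BH")

pval_cutoff <- 0.05
fdr_cutoff  <- 0.1

# Keep gene sets that pass both the raw and the adjusted cutoff
kept <- results[results$pval < pval_cutoff & results$adj_pval < fdr_cutoff, ]
kept
```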

min_size

Numeric. Minimum size of a gene set to test. Gene sets smaller than this threshold are excluded.

max_size

Numeric. Maximum size of a gene set to test. Gene sets larger than this threshold are excluded.
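The effect of min_size and max_size can be sketched in base R as a filter on gene set sizes. The gene sets below are invented, and treating both bounds as inclusive is an assumption. This page also does not say whether sizes are counted before or after restricting gene sets to the universe.

```r
# Toy gene sets of different sizes (hypothetical names and members)
gene_sets <- list(
  tiny  = c("A"),
  small = c("A", "B", "C"),
  large = paste0("G", 1:600)
)

min_size <- 2
max_size <- 500

# Keep only gene sets whose size lies within [min_size, max_size]
sizes <- lengths(gene_sets)
kept_sets <- gene_sets[sizes >= min_size & sizes <= max_size]
names(kept_sets)
```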

max_genes

Numeric. The number of genes to retain from the overlap_genes or leading_edge columns. If max_genes is greater than the number of genes available, all genes are retained.
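The truncation described for max_genes behaves like head() on a vector of gene symbols. The sketch below is illustrative only. How the overlap_genes and leading_edge columns are actually stored in the returned tibble is not stated here, so a plain character vector is assumed.

```r
# Hypothetical overlap between the query and one gene set
overlap_genes <- c("BRCA1", "TP53", "EGFR", "MYC", "PTEN")

# Retain at most max_genes genes
head(overlap_genes, 3)

# When max_genes exceeds the number of genes available, all are retained
all_retained <- head(overlap_genes, 10)
length(all_retained)
```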

Value

A tibble with the enriched gene sets.

Author(s)

Pol Castellano-Escuder

Examples

# Load POMA, which provides PomaEnrichment()
library(POMA)

# Example genes
genes <- c("BRCA1", "TP53", "EGFR", "MYC", "PTEN")

# Perform ORA on Gene Ontology (C5) gene sets for Homo sapiens
PomaEnrichment(
  genes = genes,
  method = "ora",
  organism = "Homo sapiens",
  collection = "C5",
  pval_cutoff = 0.05,
  fdr_cutoff = 0.1,
  min_size = 10,
  max_size = 500)

# Example genes with ranking factors (e.g., logFC values)
genes <- c("Actb", "Gapdh", "Cdkn1a", "Cd44", "Pten")
rank <- c(2.5, -1.8, 3.1, -2.2, 1.7)

# Perform GSEA on Hallmark (H) gene sets for Mus musculus
PomaEnrichment(
  genes = genes,
  method = "gsea",
  organism = "Mus musculus",
  collection = "H",
  rank = rank,
  pval_cutoff = 0.05,
  fdr_cutoff = 0.25,
  min_size = 15,
  max_size = 500)

pcastellanoescuder/POMA documentation built on Nov. 28, 2024, 1:21 p.m.