PomaEnrichment: Enrichment Analysis
In pcastellanoescuder/POMA_package: Tools for Omics Data Analysis

Description

PomaEnrichment performs enrichment analysis on a set of query gene symbols using a chosen method and gene set collection. It supports over-representation analysis (ORA) and gene set enrichment analysis (GSEA) for a range of model organisms.
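
To make the ORA idea concrete, the following base R sketch computes a hypergeometric over-representation p-value for a single gene set. The gene names, set sizes, and universe are invented for illustration, and the code is not taken from the POMA implementation; it only shows the kind of test that 'ora' refers to.

```r
# Toy inputs (all hypothetical, for illustration only)
universe <- paste0("G", 1:100)                  # background of 100 genes
gene_set <- paste0("G", 1:10)                   # one gene set with 10 members
query    <- c("G1", "G2", "G3", "G50", "G60")   # 5 query genes, 3 of them in the set

# Number of query genes that fall in the gene set
overlap <- length(intersect(query, gene_set))

# Upper-tail hypergeometric probability P(X >= overlap) when drawing
# length(query) genes from the universe without replacement
p_value <- phyper(
  q = overlap - 1,
  m = length(gene_set),                     # genes in the set
  n = length(universe) - length(gene_set),  # genes outside the set
  k = length(query),                        # number of genes drawn
  lower.tail = FALSE
)
p_value
```

A small p-value here indicates that the query contains more members of the gene set than would be expected by chance given the background.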

Usage

PomaEnrichment(
  genes,
  method = "ora",
  organism = "Homo sapiens",
  collection = "C5",
  universe = NULL,
  rank = NULL,
  pval_cutoff = 0.05,
  fdr_cutoff = 0.1,
  min_size = 2,
  max_size = if (method == "gsea") length(genes) - 1 else NULL,
  max_genes = 10
)

Arguments

`genes`	Character vector. Set of query gene symbols.
`method`	Character. Enrichment method. Options are: 'ora' (simple over-representation analysis based on the hypergeometric test) and 'gsea' (gene set enrichment analysis on a ranked list of genes).
`organism`	Character. Indicates the model organism name. Default is 'Homo sapiens'. Other options are: 'Anolis carolinensis', 'Bos taurus', 'Caenorhabditis elegans', 'Canis lupus familiaris', 'Danio rerio', 'Drosophila melanogaster', 'Equus caballus', 'Felis catus', 'Gallus gallus', 'Macaca mulatta', 'Monodelphis domestica', 'Mus musculus', 'Ornithorhynchus anatinus', 'Pan troglodytes', 'Rattus norvegicus', 'Saccharomyces cerevisiae', 'Schizosaccharomyces pombe 972h-', 'Sus scrofa', 'Xenopus tropicalis'. See `msigdbr::msigdbr_show_species()`.
`collection`	Character. Indicates the gene set collection. Default is 'C5' (Gene Ontology gene sets). Other options are: 'C1' (positional gene sets), 'C2' (curated gene sets), 'C3' (regulatory target gene sets), 'C4' (computational gene sets), 'C6' (oncogenic signature gene sets), 'C7' (immunologic signature gene sets), 'C8' (cell type signature gene sets), 'H' (Hallmark gene sets). See `msigdbr::msigdbr_collections()`.
`universe`	Character vector. Background set of genes (the universe) from which `genes` were selected.
`rank`	Numeric vector. Ranking factor used to sort genes for GSEA (e.g., logFC or -log10(p-value)).
`pval_cutoff`	Numeric. Raw p-value cutoff. Only enrichment tests that pass this cutoff are reported.
`fdr_cutoff`	Numeric. Adjusted p-value cutoff. Only enrichment tests that pass this cutoff are reported.
`min_size`	Numeric. Minimum size of a gene set to test. Gene sets smaller than this threshold are excluded.
`max_size`	Numeric. Maximum size of a gene set to test. Gene sets larger than this threshold are excluded.
`max_genes`	Numeric. The number of genes to retain from the `overlap_genes` or `leading_edge` columns. If `max_genes` is greater than the number of genes available, all genes are retained.
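
The interplay of `pval_cutoff`, `fdr_cutoff`, and `max_genes` can be sketched in base R as follows. The results table, its column names (`gene_set`, `pval`, `adj_pval`), the Benjamini-Hochberg adjustment, and the strict `<` comparisons are all assumptions made for this illustration; the actual output columns and adjustment method of PomaEnrichment may differ.

```r
# Hypothetical enrichment results, for illustration only
results <- data.frame(
  gene_set = c("SET_A", "SET_B", "SET_C", "SET_D"),
  pval = c(0.001, 0.01, 0.04, 0.20),
  stringsAsFactors = FALSE
)

# Hypothetical overlapping genes for each gene set
overlap <- list(c("G1", "G2", "G3"), c("G4", "G5"), "G6", c("G7", "G8"))

pval_cutoff <- 0.05
fdr_cutoff <- 0.1
max_genes <- 2

# Adjust p-values for multiple testing (Benjamini-Hochberg is an assumption)
results$adj_pval <- p.adjust(results$pval, method = "BH")

# Retain at most `max_genes` genes per set; shorter lists are kept whole
results$overlap_genes <- vapply(
  overlap,
  function(g) paste(head(g, max_genes), collapse = ", "),
  character(1)
)

# Keep only gene sets that pass both the raw and the adjusted cutoff
kept <- results[results$pval < pval_cutoff & results$adj_pval < fdr_cutoff, ]
kept
```

In this toy table, a gene set must pass both thresholds to be kept, and the gene list shown for each retained set is truncated to its first two entries.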

Value

A tibble with the enriched gene sets.

Author(s)

Pol Castellano-Escuder

Examples

# Example genes
genes <- c("BRCA1", "TP53", "EGFR", "MYC", "PTEN")

# Perform ORA on Gene Ontology (C5) gene sets for Homo sapiens
PomaEnrichment(
  genes = genes,
  method = "ora",
  organism = "Homo sapiens",
  collection = "C5",
  pval_cutoff = 0.05,
  fdr_cutoff = 0.1,
  min_size = 10,
  max_size = 500)
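
The ORA call above leaves `universe` at its default. As a rough illustration of why the background matters, the base R sketch below compares the hypergeometric p-value for the same overlap under two background sizes. The helper `ora_pvalue()` and all numbers are hypothetical and are not part of POMA.

```r
# Hypothetical helper: upper-tail hypergeometric p-value for an overlap of k genes
ora_pvalue <- function(k, set_size, query_size, universe_size) {
  phyper(
    k - 1,
    set_size,                  # genes in the gene set
    universe_size - set_size,  # genes outside the gene set
    query_size,                # number of query genes
    lower.tail = FALSE
  )
}

# Same overlap (3 of 5 query genes in a 50-gene set), two different backgrounds
p_large <- ora_pvalue(k = 3, set_size = 50, query_size = 5, universe_size = 20000)
p_small <- ora_pvalue(k = 3, set_size = 50, query_size = 5, universe_size = 500)

c(large_universe = p_large, small_universe = p_small)
```

The same overlap looks far less surprising against the smaller background, which is why a universe restricted to the genes actually measured tends to give more conservative results.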

# Example genes with ranking factors (e.g., logFC values)
genes <- c("Actb", "Gapdh", "Cdkn1a", "Cd44", "Pten")
rank <- c(2.5, -1.8, 3.1, -2.2, 1.7)

# Perform GSEA on Hallmark (H) gene sets for Mus musculus
PomaEnrichment(
  genes = genes,
  method = "gsea",
  organism = "Mus musculus",
  collection = "H",
  rank = rank,
  pval_cutoff = 0.05,
  fdr_cutoff = 0.25,
  min_size = 15,
  max_size = 500)
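
In practice, the `genes` and `rank` vectors for GSEA usually come from a differential-expression table. The base R sketch below shows one way to derive them. The table `de_table`, its column names, and the rule for resolving duplicated symbols are assumptions made for illustration, and the explicit sort is for readability only.

```r
# Hypothetical differential-expression table, for illustration only
de_table <- data.frame(
  symbol = c("Actb", "Gapdh", "Cdkn1a", "Cd44", "Pten", "Cd44"),
  logFC = c(2.5, -1.8, 3.1, -2.2, 1.7, -0.4),
  stringsAsFactors = FALSE
)

# Drop rows with a missing symbol or ranking value
de_table <- de_table[!is.na(de_table$symbol) & !is.na(de_table$logFC), ]

# Keep one row per symbol: the one with the largest absolute logFC
de_table <- de_table[order(-abs(de_table$logFC)), ]
de_table <- de_table[!duplicated(de_table$symbol), ]

# Order genes from most up-regulated to most down-regulated
de_table <- de_table[order(de_table$logFC, decreasing = TRUE), ]

# Parallel vectors: one ranking value per gene symbol
genes <- de_table$symbol
rank <- de_table$logFC
data.frame(genes, rank)
```

The resulting `genes` and `rank` vectors have matching length and order, so they can be passed as the `genes` and `rank` arguments in a GSEA call like the one above.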

pcastellanoescuder/POMA_package documentation built on Nov. 28, 2024, 1:23 p.m.