run.topGO.meta: Run JO's topGO pipeline

View source: R/Utilities.R

run.topGO.meta    R Documentation

Run JO's topGO pipeline

Description

Tests for functional enrichment in gene-categories of interest.

Usage

run.topGO.meta(
  mydf = "mydf",
  geneID2GO = "Pfal_geneID2GO_curated",
  pval = 0.05
)

Arguments

mydf

Data frame with geneIDs in column 1 and interest-category classifications in column 2.

geneID2GO

A named list of character vectors of GO IDs: one vector of GO terms per geneID, with each list element named by its geneID.

pval

P-value threshold for significance. Defaults to 0.05.
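As a minimal sketch of the two input shapes described above, the following builds a toy mydf and geneID2GO in base R. The gene IDs, categories, and GO terms are placeholders chosen for illustration and are not taken from the package's data.

```r
# Hypothetical toy inputs; gene IDs, categories, and GO terms are placeholders.
mydf <- data.frame(
  geneID   = c("geneA", "geneB", "geneC", "geneD"),
  category = c("sensitive", "tolerant", "neutral", "sensitive"),
  stringsAsFactors = FALSE
)

# One character vector of GO IDs per gene; the list names are the geneIDs.
geneID2GO <- list(
  geneA = c("GO:0006412", "GO:0003735"),
  geneB = c("GO:0005515"),
  geneC = c("GO:0006412"),
  geneD = c("GO:0016020", "GO:0005515")
)

# Basic shape checks before running an enrichment.
stopifnot(is.data.frame(mydf), ncol(mydf) >= 2)
stopifnot(is.list(geneID2GO), !is.null(names(geneID2GO)))
stopifnot(all(vapply(geneID2GO, is.character, logical(1))))

# Count genes in mydf that have no entry in the GO mapping.
n_unmapped <- sum(!mydf$geneID %in% names(geneID2GO))
```

Objects shaped like these could then be passed as the first two arguments of run.topGO.meta.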

Details

The run.topGO.meta function:

  • defines which genes are "interesting" and which should be defined as background for each category specified in mydf,

  • makes the GOdata object for topGO,

  • tests each category of interest for enriched GO-terms against all the other genes included in mydf (the "gene universe"),

  • and then writes the results to tables (.tsv files that can be opened in Excel).

Enrichments are performed for each ontology (molecular function, biological process, cellular component) sequentially on all groups of interest. Results are combined in the final output table ("Routput/GO/all.combined.GO.results.tsv").

topGO automatically accounts for genes that cannot be mapped to GO terms (or that are mapped to terms with fewer than 3 genes in the analysis). The resulting "feasible genes" are reported in the topGO.log files in the "Routput/GO" folder.
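The first step listed above, splitting the gene universe into "interesting" and background genes for each category, can be sketched in base R. This illustrates the idea only and is not the package's internal code; the toy data and the names gene_universe and gene_lists are assumptions.

```r
# Hypothetical toy input: geneIDs in column 1, categories in column 2.
mydf <- data.frame(
  geneID   = c("geneA", "geneB", "geneC", "geneD", "geneE"),
  category = c("sensitive", "tolerant", "neutral", "sensitive", "neutral"),
  stringsAsFactors = FALSE
)

# The gene universe is every gene included in mydf.
gene_universe <- mydf$geneID

# For each category, flag its genes as 1 ("interesting") and all others as 0.
gene_lists <- lapply(
  split(mydf$geneID, mydf$category),
  function(interesting) {
    flags <- factor(as.integer(gene_universe %in% interesting), levels = c(0, 1))
    names(flags) <- gene_universe
    flags
  }
)

# Tabulate background (0) versus interesting (1) genes for one category.
table(gene_lists$sensitive)
```

A named two-level factor of this kind is the form that topGO accepts for the allGenes slot when a topGOdata object is built.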

Outputs

run.topGO.meta creates several output-files, including:

  • enrichment results,

  • significant genes per significant term,

  • plots of the GO-term hierarchy relevant to the analysis, and

  • thorough log-files for each gene-category of interest tested against the background of all other genes in the analysis.

Primary results from run.topGO.meta will be in "Routput/GO/all.combined.GO.results.tsv".
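To inspect the combined results table from R rather than Excel, something like the following may be enough. read_go_results is a hypothetical helper, not part of the package. The column names of the results file are not documented here, so the sketch only prints the structure instead of assuming them.

```r
# Hypothetical helper: read a tab-separated results table if it exists,
# otherwise return NULL with a message.
read_go_results <- function(path = "Routput/GO/all.combined.GO.results.tsv") {
  if (!file.exists(path)) {
    message("No results file found at: ", path)
    return(NULL)
  }
  read.delim(path, stringsAsFactors = FALSE, check.names = FALSE)
}

results <- read_go_results()
if (!is.null(results)) {
  str(results)  # inspect the actual column names before filtering
}
```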

Concepts for common use-cases

RNAseq: In an RNAseq analysis, common categories might be "upregulated", "downregulated", and "neutral". The gene universe would consist of all genes detected above your threshold cutoffs (not necessarily all genes in the genome).

piggyBac screens: In pooled piggyBac-mutant screening, common categories might be "sensitive", "tolerant", and "neutral". The gene universe would consist of all genes represented in your screened library of mutants (again, not all genes in the genome). See the included exampleMydf as an example.
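For the RNAseq case, a mydf could be derived from a differential-expression table along these lines. The column names (log2FC, padj), the cutoffs, and the values are assumptions made for illustration; substitute those from your own analysis.

```r
# Hypothetical differential-expression results; values are made up.
de <- data.frame(
  geneID = c("geneA", "geneB", "geneC", "geneD"),
  log2FC = c(2.1, -1.8, 0.2, 3.0),
  padj   = c(0.001, 0.01, 0.60, 0.20),
  stringsAsFactors = FALSE
)

# Assumed cutoffs, for illustration only.
lfc_cutoff  <- 1
padj_cutoff <- 0.05

# Every detected gene starts as "neutral"; significant genes are relabeled.
category <- rep("neutral", nrow(de))
category[de$padj < padj_cutoff & de$log2FC >=  lfc_cutoff] <- "upregulated"
category[de$padj < padj_cutoff & de$log2FC <= -lfc_cutoff] <- "downregulated"

# geneIDs in column 1, categories in column 2, as mydf expects.
mydf <- data.frame(geneID = de$geneID, category = category, stringsAsFactors = FALSE)
```

Because every detected gene stays in mydf, the "neutral" genes remain part of the gene universe that each category is tested against.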

Using your own custom GO database

A correctly formatted geneID2GO object is included for P. falciparum enrichment analyses (Pfal_geneID2GO). You may also provide your own, so long as it is a named list of character vectors of GO terms (each vector named by geneID, with one GO term per element).

You can use the included formatGOdb.curated() function to format a custom GO database from curated GeneDB/PlasmoDB annotations for several non-model organisms (or the formatGOdb() function to include all GO annotations, if you aren't picky about including automated electronic annotations). If you're studying a model organism, several annotations are already available and can be downloaded through the AnnotationDbi Bioconductor package that loads with topGO.
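If your annotations are in a simple two-column table (one row per gene and GO-term pair), the required list structure can be built by hand in base R. This sketch is not how formatGOdb() or formatGOdb.curated() work internally; the table and its column names are hypothetical.

```r
# Hypothetical long-format annotation table: one row per gene/GO-term pair.
annot <- data.frame(
  geneID = c("geneA", "geneA", "geneB", "geneC"),
  GO     = c("GO:0006412", "GO:0003735", "GO:0005515", "GO:0006412"),
  stringsAsFactors = FALSE
)

# split() groups the GO IDs by gene, giving a named list of character vectors.
geneID2GO <- split(annot$GO, annot$geneID)

# Drop any repeated GO IDs within a gene.
geneID2GO <- lapply(geneID2GO, unique)

str(geneID2GO)
```

topGO also provides readMappings() for reading gene-to-GO mapping files, which may be more convenient if your annotations are already in that file format.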

See Also

topGO::topGO()

Examples


run.topGO.meta(exampleMydf, Pfal_geneID2GO_curated)


oberstal/pfGO documentation built on April 22, 2024, 7:15 a.m.