title: 'Fluximplied: A novel approach integrates rate limiting steps and differential expression for pathway analysis'
tags:
  - immunometabolism
  - pathway analysis
  - metabolomics
  - metabolism
  - flux
authors:
  - name: Mike Sportiello MS
    orcid: 0000-0003-1690-8702
    equal-contrib: true
    affiliation: "1, 2" # (Multiple affiliations must be quoted)
  - name: Adam Geber
    orcid: 0000-0003-3022-0525
    equal-contrib: true # (This is how you can denote equal contributions between multiple authors)
    affiliation: "1, 2"
  - name: Rohith Palli MD/PhD
    orcid: 0000-0001-7252-4266
    affiliation: 2
  - name: A Karim Embong MS
    orcid: 0000-0002-7224-4640
    affiliation: 1
  - name: Nathan G. Laniewski
    orcid: #
    affiliation: 1
  - name: Emma C. Reilly PhD
    orcid: #
    affiliation: 3
  - name: Kris Lambert Emo
    orcid: #
    affiliation: 1
  - name: David J. Topham MS/PhD
    corresponding: true
    orcid: 0000-0002-9435-8673
    affiliation: 1
affiliations:
  - name: Center for Vaccine Biology and Immunology, University of Rochester Medical Center, Rochester, NY 14642, USA
    index: 1
  - name: Medical Scientist Training Program, University of Rochester Medical Center, Rochester, NY 14642, USA
    index: 2
  - name: Independent Researcher, Rochester, NY 14642, USA
    index: 3
date: 21 September 2022
Traditional gene set enrichment analysis remains among the most common methods for rigorous pathway analysis of large omics datasets. While useful for many applications, it is less well suited to the metabolic profiling of bulk omics datasets. Rate limiting steps (RLSs) in more linear pathways are the main determinant of flux, but pathway analysis does not usually weight differential expression of the enzymes that catalyze these steps any differently from that of non-RLS enzymes. fluximplied was built to perform pathway analysis with rate limiting steps in mind, assessing the implied flux through a number of well-validated metabolic pathways. To this end, a database of rate limiting steps and their associated pathways was constructed. Unlike traditional approaches to pathway analysis, fluximplied specifically queries this database of RLSs in order to infer flux through canonical metabolic pathways.
Pathway analysis is a method for characterizing biology at the systems level which generally presumes that the regulation of a single element (e.g. a transcript in RNA sequencing (RNAseq)) is less important than that of the pathway of interest as a whole. Traditional gene set enrichment analysis (GSEA) uses a list of differentially expressed transcripts (i.e. upregulated or downregulated) to assess whether transcripts of a particular pathway are overrepresented in that list. For example, if all elements but one of a pathway for T cell killing were upregulated, most would take this to be a biologically meaningful finding even though one element is missing. In this form of pathway analysis, the enrichment of a predefined functional set of genes (T cell killing genes, for example) within a set of user-supplied genes (such as a list of differentially expressed genes from an RNAseq experiment) is assessed using the hypergeometric test.
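The hypergeometric test mentioned above can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used by any particular enrichment tool; the function name, argument names, and the toy numbers are all assumptions made for the example.

```python
from math import comb


def hypergeom_enrichment_p(n_universe, n_pathway, n_selected, n_overlap):
    """Upper-tail hypergeometric p value.

    Probability of seeing at least `n_overlap` pathway genes when
    `n_selected` genes are drawn without replacement from a universe of
    `n_universe` genes, of which `n_pathway` belong to the pathway.
    """
    max_overlap = min(n_pathway, n_selected)
    total = comb(n_universe, n_selected)
    tail = sum(
        comb(n_pathway, k) * comb(n_universe - n_pathway, n_selected - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total


# Toy numbers (hypothetical): 20 measured genes, 5 in the pathway,
# 6 called differentially expressed, 4 of which are pathway members.
p_value = hypergeom_enrichment_p(20, 5, 6, 4)
```

Note that every pathway member contributes equally to the overlap count, which is the property the rate limiting step approach is meant to address.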
More advanced methods make use of the amount of dysregulation of each gene as well as the known regulatory topology of the underlying network [@RN137; @RN143; @RN141; @RN138; @RN139; @RN136; @RN140; @RN142]. One popular and easy-to-use functional enrichment approach, EnrichR, uses Fisher's exact test to quantify enrichment across a pathway. The recent Boolean Omics Network Invariant-Time Analysis (BONITA) uses a Boolean approximation of flow through a network to give extra weight to the elements with the highest impact [@RN81]. While these approaches are suitable for signaling cascades, no methods use knowledge of chemical equilibria or rate constants to improve analyses of metabolic pathways.
Here we present fluximplied: a novel, free, and open source software package for pathway analysis and hypothesis generation. fluximplied uses a manually curated database of rate limiting steps to predict increases or decreases in flux through known metabolic pathways using transcriptomic data provided by the user.
fluximplied can be used to generate hypotheses in the context of metabolic pathway analysis once a list of up- or downregulated genes has been derived from data generated using RNAseq, ATACseq, or other omics technologies. In GSEA, a list of upregulated genes is generated by applying an adjusted p value threshold together with a Log2(Fold Change) (LFC) cutoff, the latter to ensure that changes are biologically meaningful. In glycolysis, for example, this type of analysis treats the upregulation of enolase (an enzyme that has little to do with either the regulation or the rate of the pathway) the same as that of phosphofructokinase, which determines the rate and therefore the flux of the pathway as a whole. All else held equal, if the concentration of the RLS enzyme increases, the flux through the pathway will increase [@RN74]. Even when it is not known whether all else is held equal, an increase in the RLS implies increased flux through the pathway. It is important to note that a singular RLS is rare: more often, the rate is impacted to some degree by more than one enzyme in a pathway. However, more complicated methods like metabolic control analysis are quantitatively complex, tissue dependent, and difficult to integrate with other methods (e.g. RNAseq). Usually only a small number of enzymes impact the rate to a meaningful degree, so the model of a singular RLS is still a useful one in analyses and hypothesis generation [@RN70].
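The contrast drawn above between a cutoff-based gene list and an RLS-aware reading of the same data can be sketched as follows. This is a hedged illustration only: the differential expression values are invented, the one-entry database stands in for the curated RLS database, and the function names and default cutoffs are assumptions rather than the fluximplied API.

```python
# Hypothetical differential expression results:
# gene symbol -> (log2 fold change, adjusted p value).
de_results = {
    "PFKL": (1.8, 0.001),    # a phosphofructokinase gene (rate limiting in glycolysis)
    "ENO1": (2.1, 0.0005),   # an enolase gene (not rate limiting)
    "GAPDH": (0.2, 0.6),     # unchanged
}

# Illustrative stand-in for a curated RLS database: gene symbol -> pathway.
RLS_DATABASE = {"PFKL": "glycolysis"}


def upregulated(de, lfc_cutoff=1.0, padj_cutoff=0.05):
    """Cutoff-based list: every gene passing both thresholds counts equally."""
    return sorted(
        gene
        for gene, (lfc, padj) in de.items()
        if lfc >= lfc_cutoff and padj <= padj_cutoff
    )


def implied_flux(de, rls_db, lfc_cutoff=1.0, padj_cutoff=0.05):
    """RLS-aware reading: report only significant changes in RLS genes."""
    calls = []
    for gene, (lfc, padj) in sorted(de.items()):
        if gene in rls_db and padj <= padj_cutoff and abs(lfc) >= lfc_cutoff:
            direction = "increased" if lfc > 0 else "decreased"
            calls.append((gene, rls_db[gene], direction))
    return calls


gene_list = upregulated(de_results)
flux_calls = implied_flux(de_results, RLS_DATABASE)
```

In this toy data both ENO1 and PFKL pass the cutoffs, but only the PFKL change supports a hypothesis of increased glycolytic flux.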
Furthermore, traditional enrichment analysis is agnostic of the LFC between conditions (unless an LFC cutoff is used), and thus a gene with an LFC of 0.5 would be treated the same as a gene with an LFC of 5, even though the former corresponds to a fold change of about 1.4 and the latter to a fold change of 32. While this method is often useful and yields biologically meaningful results, it does not allow users to interrogate their data more fully.
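The fold changes quoted above follow directly from the definition of LFC as a base-2 logarithm. A short check, with a helper name chosen only for this example:

```python
def fold_change(lfc):
    """Convert a log2 fold change back to a linear fold change."""
    return 2.0 ** lfc


small_change = fold_change(0.5)  # square root of 2, roughly 1.4
large_change = fold_change(5)    # 2 to the fifth power, i.e. 32
```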
To that end, we developed fluximplied to assist the user in generating hypotheses in metabolic pathway analysis and in assessing implied flux, an objective that typically requires a time-consuming, large-scale flux balance analysis. Because the software was built with bioinformatics novices in mind, a web-hosted interactive graphical user interface (GUI) was designed, although the tool remains powerful for all users in its capacity to integrate into computational pipelines.
fluximplied returns a differential expression analysis table of the tested RLSs (with assigned adjusted p values and the corresponding pathways) and plots this table for visualization, all of which is downloadable.
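The text does not state which multiple-testing correction produces the adjusted p values in this table. As one plausible sketch, the Benjamini-Hochberg procedure (assumed here purely for illustration) can be applied to the p values of the tested RLS genes alone; the function name and the example p values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    m = len(pvalues)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p values for four tested RLS genes.
raw_p = [0.01, 0.04, 0.03, 0.005]
adj_p = benjamini_hochberg(raw_p)
```

Correcting over the tested RLS genes only, rather than over every gene in the dataset, keeps the number of hypotheses small. Whether the tool does this is an assumption of the sketch.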
If the user does not have access to their differential expression analyses to feed into the software, they may instead supply a set of genes to fluximplied. In that scenario, no P values or LFCs are available for generating a hypothesis, so fluximplied matches any RLS present in the list to the pathways that may be impacted and returns a list of pathways the user may wish to interrogate further.
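The gene-list mode described above amounts to a lookup against the RLS database. A minimal sketch, assuming a two-entry illustrative database and case-insensitive symbol matching (both are assumptions, as is the function name):

```python
# Illustrative stand-in for the curated database: RLS gene symbol -> pathway.
RLS_PATHWAYS = {
    "PFKL": "glycolysis",
    "CPT1A": "fatty acid oxidation",
}


def pathways_to_interrogate(genes, rls_db=RLS_PATHWAYS):
    """Map a plain gene list to the pathways whose RLS genes it contains."""
    hits = {}
    for gene in genes:
        key = gene.upper()  # assumption: match symbols case-insensitively
        if key in rls_db:
            hits.setdefault(rls_db[key], []).append(gene)
    return hits


# Hypothetical user-supplied list with one RLS gene among other genes.
candidate_pathways = pathways_to_interrogate(["Cpt1a", "Eno1", "Actb"])
```

With no p values or LFCs available, the result says only which pathways merit a closer look, not the direction of any change in flux.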
The Genotype-Tissue Expression (GTEx) Project was supported by the Common Fund of the Office of the Director of the National Institutes of Health, and by NCI, NHGRI, NHLBI, NIDA, NIMH, and NINDS.