
Metabolomics-Data-Portal

Metabolomics Data Portal is an R Shiny application for the visualization and analysis of untargeted metabolomics datasets.

Introduction:

Metabolomics is a fast-maturing field that is closely tied to the phenotypes observed in the clinic and is readily actionable for prospective treatment regimens.

Within metabolomics, there is a distinction between clinical research metabolomics, which follows a case-control cohort design, and clinical testing metabolomics, which compares a single patient to a reference population.
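To make the N-of-1 setting concrete, here is a minimal sketch in base R of comparing one patient to a reference population by z-scoring each metabolite. The metabolite names, the simulated reference values, the patient profile, and the cutoff of 2 standard deviations are all assumptions chosen for illustration; they are not taken from the portal's datasets or code.

```r
# Hypothetical reference population: 50 samples x 3 metabolites (simulated).
set.seed(1)
reference <- matrix(rnorm(150, mean = 10, sd = 2), nrow = 50,
                    dimnames = list(NULL, c("alanine", "citrate", "lactate")))

# Hypothetical single-patient profile measured on the same scale.
patient <- c(alanine = 10.5, citrate = 18.0, lactate = 9.0)

# z-score each patient value against the reference mean and standard deviation.
ref_mean <- colMeans(reference)
ref_sd <- apply(reference, 2, sd)
z <- (patient - ref_mean[names(patient)]) / ref_sd[names(patient)]

# Flag metabolites more than 2 reference SDs from the reference mean
# (the cutoff is an illustrative assumption).
perturbed <- names(z)[abs(z) > 2]
print(round(z, 2))
print(perturbed)
```

A case-control analysis would instead compare two groups of samples; here the only "case" is the single patient, so every statistic is defined relative to the reference distribution.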

Differences in data collection lead to differences in analysis needs. Currently, for N-of-1 clinical testing metabolomics, state-of-the-art analysis relies on pathway enrichment methods.

To quantify perturbations observed in pathway knowledgebases, popular set-based methods such as over-representation analysis (ORA) and metabolite-set enrichment analysis (MSEA) are employed. However, these methods have been criticized for using gene sampling in lieu of patient sampling to generate p-values, and for testing competitive null hypotheses in lieu of self-contained null hypotheses. Competitive null hypotheses are less restrictive and have been shown to yield less powerful tests by comparison (Goeman & Bühlmann, 2007).
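As a rough illustration of what ORA computes, the sketch below runs a hypergeometric test in base R for a single pathway. All counts are hypothetical. Note that the null model treats the perturbed metabolites as a random draw from the measured metabolites, which is the feature-sampling (rather than patient-sampling) scheme criticized above.

```r
# Hypothetical counts, for illustration only.
N <- 500  # metabolites measured (the background)
K <- 20   # measured metabolites annotated to the pathway
n <- 30   # metabolites flagged as perturbed in the patient
k <- 6    # perturbed metabolites that fall in the pathway

# P(X >= k) under the hypergeometric null: the n perturbed metabolites are
# a random draw without replacement from the N measured metabolites.
p_ora <- phyper(k - 1, m = K, n = N - K, k = n, lower.tail = FALSE)

# The same p-value from a one-sided Fisher's exact test on the 2x2 table.
tab <- matrix(c(k, K - k, n - k, N - K - n + k), nrow = 2)
p_fisher <- fisher.test(tab, alternative = "greater")$p.value

print(c(ora = p_ora, fisher = p_fisher))
```

Because the test only uses counts, it ignores how the pathway's metabolites are connected to one another.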

While various tools currently exist for metabolomics data analysis and pathway analysis (e.g., Metabolomics Workbench, PhenoMeNal, MetaboAnalyst, Metscape, Mummichog, MetaMapp, and MetDisease), these existing web tools have shortcomings. Some of these platforms apply popular machine learning models to metabolomics data: unsupervised dimensionality reduction methods to reveal outliers or batch effects, and clustering methods to look for differences between cases and controls. However, existing tools are not tailored for single-patient (N-of-1) analysis, as in clinical testing metabolomics, and are better suited to data collected under case-control cohort designs.

Enter topological enrichment methods!

Topological enrichment methods (for good reviews, see Braun & Shah, 2014 and Ihnatova, Popovici & Budinska, 2018) have been shown to be more sensitive than set-based enrichment analysis methods.
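The toy example below hints at why pathway topology can add information. It is not one of the reviewed methods and not the portal's implementation: the linear pathway, the node names, and the `connected_hits` helper are hypothetical, and the statistic (number of pathway edges joining two perturbed metabolites) is chosen only because it is easy to read.

```r
# Hypothetical 6-metabolite linear pathway: A - B - C - D - E - F.
nodes <- c("A", "B", "C", "D", "E", "F")
adj <- matrix(0, 6, 6, dimnames = list(nodes, nodes))
for (i in 1:5) {
  adj[i, i + 1] <- 1
  adj[i + 1, i] <- 1
}

# Count pathway edges whose two endpoints are both perturbed.
connected_hits <- function(adj, perturbed) {
  sub <- adj[perturbed, perturbed, drop = FALSE]
  sum(sub) / 2  # each undirected edge appears twice in a symmetric matrix
}

contiguous <- c("B", "C", "D")  # three adjacent metabolites
scattered  <- c("A", "C", "E")  # three metabolites, none adjacent

print(connected_hits(adj, contiguous))
print(connected_hits(adj, scattered))
```

A set-based method sees both patterns as three hits out of six pathway members and scores them identically, whereas a topology-aware statistic can distinguish a contiguous perturbed stretch from scattered hits.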

Problem

Modern-day topological enrichment methods are all narrowly implemented for the analysis and interpretation of differentially expressed gene sets, and do not extend to the analysis and interpretation of perturbed metabolite sets.

Our Solution

We have examined several R package implementations of existing topological enrichment methods and modified them to be useful for the analysis of metabolite sets and of N-of-1 metabolomics data.

Features:

1. Datasets from published papers, including clinical subjects with metabolic diseases (Miller et al., 2015; Wangler et al., 2017).
2. Pathway visualization software, importing pathway knowledge curated by Metabolon's Metabolync Cytoscape plugin.
3. New topology-based pathway enrichment analysis methods implemented for interpretation of clinical testing metabolomics data.
4. TO COME: Private data upload portal to use the above tools on private datasets and pathway knowledgebases.

Installation

remotes::install_github("NCBI-Hackathons/Metabolomics-Data-Portal")

Usage

Data formats

Example Shiny Site

docker-compose up -d
docker-compose down
docker build .

References



NCBI-Hackathons/Metabolomics-Data-Portal documentation built on May 31, 2019, 9:59 a.m.