Welcome to RawHummus

Robust and reproducible data are essential for high-quality analytical results, particularly in large-scale metabolomics studies, where detector sensitivity drift and shifts in retention time and mass accuracy frequently occur. Raw data therefore need to be inspected before data processing to detect measurement bias and verify system consistency.

RawHummus is an R Shiny app designed for automated raw data quality control (QC) in metabolomics studies. It produces a comprehensive QC report containing interactive plots and tables, summary statistics, and detailed explanations.

If you are using a Thermo Q-Exactive series instrument, RawHummus can also visualize the instrument log files (usually stored in the C:\Xcalibur\system\Exactive\log folder), which contain over 40 different instrument metrics, such as ambient temperature and ambient humidity.

Installation

You need R and, optionally, RStudio installed. You can download them from this link.

Next, run the following code to install RawHummus.

install.packages('RawHummus')

Usage

Once RawHummus is installed, run the following code to start the web app.

library(RawHummus)
run_app()

Workflow and Features

Figure 1. Overview of the main functions in RawHummus



Table 1. Overview of quality metrics used in RawHummus report

| Section | Metric | Explanation |
|---|---|---|
| Chromatogram | | |
| | TIC plot | Total ion current (TIC) at each scan, plotted as an intensity point for each raw file. The overlaid TIC plot is used for rapid inspection of RT and ion intensity fluctuations. |
| | Summed TIC bar plot | Summed TIC of all scans in a raw file. It is used to check global ion intensity variations among raw files. |
| | TIC correlation analysis | Pairwise Pearson correlation analysis of raw files. It is used to evaluate chromatogram similarity, i.e., peak shape similarity and RT shift. A Pearson correlation coefficient above 0.85 indicates that the two raw files are similar. |
| MS1 | | |
| | Max. mass difference (ppm) | Maximum m/z variation of each selected mass feature across all raw files. It is used to evaluate mass accuracy. If the max. mass difference is over 5 ppm, the value is highlighted in red. |
| | Max. RT difference (min) | Maximum retention time variation of each selected mass feature across all raw files. It is used to evaluate RT shifts. If the max. RT difference is over 1 min, the value is highlighted in red. |
| | Max. intensity fold change | Maximum intensity fold change of each selected mass feature across all raw files. It is used to evaluate ion intensity variation. If the max. intensity fold change is over 1.5, the value is highlighted in red. |
| | Intensity CV (%) | Intensity coefficient of variation (or relative standard deviation, RSD) of each selected mass feature across all raw files. It is used to evaluate intensity variation. If the intensity CV is over 30%, the value is highlighted in red. |
| MS2 | | |
| | No. of MS2 events | Number of triggered MS/MS spectra per file. It is used to evaluate MS2 events. |
| | Precursor ion distribution across mass plot | Density plot of the precursor ions across the mass range, based on the triggered MS/MS events. |
| | Precursor ion distribution across RT plot | Density plot of the precursor ions across the RT range, based on the triggered MS/MS events. |
| | Cosine similarity of precursor ion distribution across mass | Measures the similarity of the precursor ion distribution across mass. A cosine similarity score above 0.85 indicates that the precursor distributions across mass are similar between two files. |
| | Cosine similarity of precursor ion distribution across RT | Measures the similarity of the precursor ion distribution across RT. A cosine similarity score above 0.85 indicates that the precursor distributions across RT are similar between two files. |
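
Several of the metrics in Table 1 can be illustrated with a short base R sketch. All values below are made up for illustration, and the formulas (ppm difference relative to the mean m/z, fold change as maximum over minimum) are assumptions about one reasonable way to compute these metrics, not necessarily how RawHummus implements them.

```r
# Hypothetical values for one mass feature measured in three raw files
mz        <- c(195.0877, 195.0879, 195.0881)  # observed m/z
rt        <- c(5.02, 5.05, 5.10)              # retention time (min)
intensity <- c(1.0e6, 1.2e6, 0.9e6)           # peak intensity

# Max. mass difference (ppm), relative to the mean m/z (assumed reference)
max_ppm <- (max(mz) - min(mz)) / mean(mz) * 1e6

# Max. RT difference (min)
max_rt_diff <- max(rt) - min(rt)

# Max. intensity fold change, taken here as max / min (assumption)
max_fc <- max(intensity) / min(intensity)

# Intensity CV (%), i.e. relative standard deviation
cv_pct <- sd(intensity) / mean(intensity) * 100

# Flag values that exceed the thresholds listed in Table 1
flags <- c(
  mass        = max_ppm > 5,
  rt          = max_rt_diff > 1,
  fold_change = max_fc > 1.5,
  cv          = cv_pct > 30
)

# Pearson correlation of two hypothetical TIC traces sampled at the same scans
tic_a <- c(10, 50, 200, 80, 20)
tic_b <- c(12, 55, 190, 85, 18)
tic_cor <- cor(tic_a, tic_b, method = "pearson")

# Cosine similarity between two numeric vectors
cosine_similarity <- function(x, y) {
  sum(x * y) / (sqrt(sum(x^2)) * sqrt(sum(y^2)))
}

# Hypothetical precursor ion counts per m/z bin for two files
precursors_a <- c(5, 20, 40, 25, 10)
precursors_b <- c(6, 18, 42, 22, 12)
precursor_cos <- cosine_similarity(precursors_a, precursors_b)

# Both similarity scores are compared against the 0.85 threshold
similar <- c(tic = tic_cor > 0.85, precursor = precursor_cos > 0.85)
```

Printing `flags` and `similar` shows which checks would be highlighted under the thresholds from Table 1.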

Data Preparation

RawHummus supports the mzXML and mzML formats, so you need to convert your raw data into one of these formats before analysis.

Converting raw data to the required format is straightforward. ProteoWizard MSConvert is recommended for data conversion and can be downloaded from this link.

GNPS provides a detailed explanation of file conversion. You can read the instructions here.

Note: For Waters MSe data, the precursor ions and fragments are stored in two files, and RawHummus will only evaluate the precursor ions.
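
Before starting the app, it can help to confirm that every file in the data folder is already in a supported format. The following is a minimal base R sketch; the helper name `find_unsupported` and the example file names are hypothetical and not part of RawHummus.

```r
# Return the file names whose extension is neither mzML nor mzXML
# (hypothetical helper, not part of the RawHummus package)
find_unsupported <- function(files) {
  ext <- tolower(tools::file_ext(files))
  files[!ext %in% c("mzml", "mzxml")]
}

# Example file names (made up for illustration)
files <- c("QC_01.mzML", "QC_02.mzXML", "QC_03.raw", "QC_04.d")
unsupported <- find_unsupported(files)
```

In practice, `files` could come from `list.files()` on the folder that holds the converted data; any names returned by the helper still need conversion.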

Demo Data

A set of demo files has been provided, including log files and raw data files. Please use this link to download the demo data.

Citation

If you find this Shiny app useful, please consider citing it:

Dong, Y., Kazachkova, Y., Gou, M., Morgan, L., Wachsman, T., Gazit, E. and Birkler, R.I.D., 2022. RawHummus: an R Shiny App for Automated Raw Data Quality Control in Metabolomics. Bioinformatics.



RawHummus documentation built on April 20, 2023, 9:11 a.m.