---
title: "MotrpacBicQC: Metabolomics QC"
date: "2024-01-04"
output:
  rmdformats::downcute:
    code_folding: show
    self_contained: true
    thumbnails: false
    lightbox: true
pkgdown:
  as_is: true
vignette: >
  %\VignetteEngine{knitr::knitr}
  %\VignetteIndexEntry{MotrpacBicQC: Metabolomics QC}
  %\usepackage[UTF-8]{inputenc}
---
The folder/file structure of a required metabolomics submission is as follows:
Example:
PASS1A-06/
  T55/
    HILICPOS/
      metadata-phase.txt  ## Note: "new" required file
      file_manifest_YYYYMMDD.txt
      BATCH1_20190725/
        RAW/
          Manifest.txt
          file1.raw
          file2.raw
          etc
        PROCESSED_20190725/
          metadata_failedsamples_[cas_specific_labeling].txt
          NAMED/
            results_metabolites_named_[cas_specific_labeling].txt
            metadata_metabolites_named_[cas_specific_labeling].txt
            metadata_sample_named_[cas_specific_labeling].txt
            metadata_experimentalDetails_named_[cas_specific_labeling].txt
          UNNAMED/  ## Note: Only required for untargeted assays
            results_metabolites_unnamed_[cas_specific_labeling].txt
            metadata_metabolites_unnamed_[cas_specific_labeling].txt
            metadata_sample_unnamed_[cas_specific_labeling].txt
            metadata_experimentalDetails_unnamed_[cas_specific_labeling].txt
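Before running the QC, it can be useful to confirm that a PROCESSED folder contains the files listed above. The following is a minimal base R sketch and is not part of MotrpacBicQC: the helper name `missing_named_files` is hypothetical, and it only checks the four file-name prefixes shown for the NAMED folder.

```r
# Hypothetical helper (not part of MotrpacBicQC): report which of the four
# expected NAMED file prefixes have no matching .txt file in a
# PROCESSED_YYYYMMDD folder.
missing_named_files <- function(processed_dir) {
  expected_prefixes <- c(
    "results_metabolites_named_",
    "metadata_metabolites_named_",
    "metadata_sample_named_",
    "metadata_experimentalDetails_named_"
  )
  named_dir <- file.path(processed_dir, "NAMED")
  found <- list.files(named_dir, pattern = "\\.txt$")
  # A prefix counts as present if at least one .txt file starts with it
  present <- vapply(
    expected_prefixes,
    function(p) any(startsWith(found, p)),
    logical(1)
  )
  expected_prefixes[!present]
}

# Toy example in a temporary directory (only one of the four files exists)
demo_dir <- file.path(tempdir(), "PROCESSED_20190725")
dir.create(file.path(demo_dir, "NAMED"), recursive = TRUE, showWarnings = FALSE)
file.create(file.path(demo_dir, "NAMED", "results_metabolites_named_demo.txt"))
missing_named_files(demo_dir)
```

An empty result means every expected prefix was found. The same idea could be extended to the UNNAMED folder for untargeted assays.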
With the following file relations...
First, download and install R and RStudio.
Then, open RStudio and install the devtools package:
install.packages("devtools")
Finally, install the MotrpacBicQC package.
Important: reinstall it each time you run the QCs to ensure that the latest version is used.
library(devtools)
devtools::install_github("MoTrPAC/MotrpacBicQC", build_vignettes = FALSE)
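Since the advice is to always run the latest version, it may help to record which version is actually installed. This is a small base R sketch; the helper name `installed_version` is hypothetical, and it relies only on `requireNamespace()` and `utils::packageVersion()`.

```r
# Hypothetical helper: return the installed version of a package as a string,
# or NA if the package is not installed.
installed_version <- function(pkg) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    as.character(utils::packageVersion(pkg))
  } else {
    NA_character_
  }
}

installed_version("MotrpacBicQC")
```

The returned string can be noted alongside the QC results so that a later run can be compared against the same package version.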
Load the library:
library(MotrpacBicQC)
Then run any of the following tests to check that the package is correctly installed and working. For example:
# Just copy and paste into the RStudio console
check_metadata_metabolites(df = metadata_metabolites_named, name_id = "named")
check_metadata_samples(df = metadata_sample_named, cas = "umichigan")
check_results(r_m = results_named, m_s = metadata_sample_named, m_m = metadata_metabolites_named)
These calls should generate the following output:
check_metadata_metabolites(df = metadata_metabolites_named, name_id = "named")
## + (+) All required columns present
## + (+) `metabolite_name` OK
## + (+) `refmet_name` unique values: OK
## + Validating `refmet_name` (it might take some time)
## + (+) `refmet_name` ids found in refmet: OK
## + (+) {rt} all numeric: OK
## + (+) {mz} all numeric: OK
## + (+) {`neutral_mass`} all numeric values OK
## + (+) {formula} available: OK
check_metadata_samples(df = metadata_sample_named, cas = "umichigan")
## - (-) `metadata_sample`: Expected COLUMN NAMES are missed: FAIL
## The following required columns are not present: `extraction_date, acquisition_date, lc_column_id`
## + (+) `sample_id` seems OK
## + (+) `sample_type` seems OK
## + (+) `sample_order` is numeric
## + (+) `sample_order` unique values OK
## + (+) `raw_file` unique values: OK
## - (-) `extraction_date` column missed: FAIL
## - (-) `acquisition_date` column missed: FAIL
## - (-) `lc_column_id` column missed: FAIL
check_results(r_m = results_named, m_s = metadata_sample_named, m_m = metadata_metabolites_named)
## + (+) All samples from `results_metabolite` are available in `metadata_sample`
## + (+) `metabolite_name` is identical in both [results] and `metadata_metabolites` files: OK
## + (+) `sample_id` columns are numeric: OK
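The FAIL lines in the output above report required columns that are absent from the sample metadata. A minimal base R sketch of that kind of check is shown below. It is not the package's implementation: the helper `missing_columns` is hypothetical, the toy table is made up, and the column list only contains the names that appear in the output above.

```r
# Hypothetical helper (not the package's implementation): list the required
# columns that are absent from a data frame.
missing_columns <- function(df, required) {
  setdiff(required, colnames(df))
}

# Column names taken from the output above; the full list of columns that
# MotrpacBicQC requires may be longer.
required_cols <- c("sample_id", "sample_type", "sample_order", "raw_file",
                   "extraction_date", "acquisition_date", "lc_column_id")

# Toy metadata_sample table lacking the last three fields
toy_samples <- data.frame(
  sample_id = c("S1", "S2"),
  sample_type = c("Sample", "Sample"),
  sample_order = 1:2,
  raw_file = c("file1.raw", "file2.raw"),
  stringsAsFactors = FALSE
)

missing_columns(toy_samples, required_cols)
```

For the toy table, the helper returns the same three column names that the example output flags as missing.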
Two approaches are available:
Check the full PROCESSED_YYYYMMDD folder (recommended)
Run the tests on the full submission. For that, run the following command:
validate_metabolomics(input_results_folder = "/full/path/to/PROCESSED_YYYYMMDD",
cas = "your_site_code")
cas can be one of the following:
This function can also print out a number of QC plots, including:
For that, run it like this:
validate_metabolomics(input_results_folder = "/full/path/to/PROCESSED_YYYYMMDD",
cas = "your_site_code",
f_proof = TRUE,
out_qc_folder = "/path/to/the/folder/to/save/plots/",
printPDF = TRUE)
It is recommended to provide the path to the folder where the PDF files should be saved (argument: out_qc_folder). If it doesn't exist, it will be created.
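When a submission contains several batches, the PROCESSED_YYYYMMDD folders can be located programmatically before validating each one. The sketch below uses only base R; the helper name `find_processed_folders` is hypothetical and assumes the folders are named PROCESSED_ followed by an 8-digit date, as in the layout shown earlier.

```r
# Hypothetical helper (not part of MotrpacBicQC): find folders named
# PROCESSED_ followed by an 8-digit date under a root directory.
find_processed_folders <- function(root) {
  dirs <- list.dirs(root, recursive = TRUE, full.names = TRUE)
  dirs[grepl("^PROCESSED_[0-9]{8}$", basename(dirs))]
}

# Toy layout in a temporary directory
root <- file.path(tempdir(), "HILICPOS_demo")
dir.create(file.path(root, "BATCH1_20190725", "PROCESSED_20190725"),
           recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(root, "BATCH1_20190725", "RAW"),
           recursive = TRUE, showWarnings = FALSE)

processed <- find_processed_folders(root)
basename(processed)

# Each path could then be passed as input_results_folder, for example:
# validate_metabolomics(input_results_folder = processed[1],
#                       cas = "your_site_code")
```

The call to validate_metabolomics is left commented out because it requires the package and a real submission.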
In the rare case that you need to process individual files, that can also be done. The cases are:
Check metadata metabolites:
# Open the metadata_metabolites file(s)
metadata_metabolites_named <- read.delim(file = "/path/to/your/file", stringsAsFactors = FALSE)
metadata_metabolites_unnamed <- read.delim(file = "/path/to/your/file", stringsAsFactors = FALSE)
check_metadata_metabolites(df = metadata_metabolites_named, name_id = "named")
check_metadata_metabolites(df = metadata_metabolites_unnamed, name_id = "unnamed")
Check metadata samples:
# Open your files
metadata_sample_named <- read.delim(file = "/path/to/your/file", stringsAsFactors = FALSE)
metadata_sample_unnamed <- read.delim(file = "/path/to/your/file", stringsAsFactors = FALSE)
check_metadata_samples(df = metadata_sample_named, cas = "your_site_code")
check_metadata_samples(df = metadata_sample_unnamed, cas = "your_site_code")
Check results, which needs both the metadata metabolites and the metadata samples:
# Open your files
metadata_metabolites_named <- read.delim(file = "/path/to/your/file", stringsAsFactors = FALSE)
metadata_sample_named <- read.delim(file = "/path/to/your/file", stringsAsFactors = FALSE)
results_named <- read.delim(file = "/path/to/your/file", stringsAsFactors = FALSE)
check_results(r_m = results_named,
m_s = metadata_sample_named,
m_m = metadata_metabolites_named)
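The reason check_results needs the sample metadata is that it cross-checks the samples in the results against it. A minimal base R sketch of such a cross-check follows. It is not the package's implementation: the helper name is hypothetical, and it assumes the results table has a `metabolite_name` column with every other column named after a sample_id.

```r
# Hypothetical helper (not the package's implementation): sample columns in a
# results table that have no matching sample_id in the sample metadata.
samples_missing_from_metadata <- function(results, metadata_sample) {
  sample_cols <- setdiff(colnames(results), "metabolite_name")
  setdiff(sample_cols, as.character(metadata_sample$sample_id))
}

# Toy tables; check.names = FALSE keeps the numeric-looking column names as is
toy_results <- data.frame(
  metabolite_name = c("met_a", "met_b"),
  `1001` = c(10.5, 3.2),
  `1002` = c(11.1, 2.9),
  check.names = FALSE,
  stringsAsFactors = FALSE
)
toy_sample_ids <- data.frame(sample_id = c(1001), stringsAsFactors = FALSE)

samples_missing_from_metadata(toy_results, toy_sample_ids)
```

In the toy data, sample 1002 has measurements but no metadata row, so it is the one reported.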
Additional details for each function can be found by typing, for example:
?validate_metabolomics
Need extra help? Please contact the BIC at motrpac-helpdesk@lists.stanford.edu or submit an issue here, providing as many details as possible.