knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) set.seed(123)
This vignette illustrates Amanida
R package, which contains a collection of functions for computing a weighted meta-analysis in R using significance, relative change and study size. It explains how to conduct the meta-analysis, visualize the results and draw conclusions.
The widespread of metabolomics as a potential tool for clinical diagno-sis has increased the systematic reviews and meta-analysis on this topic. Meta-analysis is the statistical combination in a single estimate for results from primary studies answering the same question, which is a common practice in medical research. When raw data is not available to perform a meta-analysis, there are different approaches that can be applied but them require the standard deviation, that is used for effect size estimate calculation and weighted methods. Nowadays, there isn’t a tool to perform a weighted meta-analysis which could be applied to a dataset of overall results without the standard deviation. Meta-analysis tools base the study of effect size using the difference of means, but in case of metabolomics we found that the comparison between groups is disclosed using relative change. The purpose of amanida is to perform a weighted meta-analysis combining overall results based on statistical significance, relative change and study size. In Amanida, statistical significance and relative change are combined using weighted methods which do no require standard deviation, weighted Fisher for p-values and weighted average for fold-change, where the weight comes from study size.
Furthermore, Amanida also computes qualitative meta-analysis performing a vote-counting for compounds, including the option of only using identifier and trend labels.
We have included an automatized id harmonization step based in Villalba H, Llambrich M, Gumà J, Brezmes J, Cumeras R. A Metabolites Merging Strategy (MMS): Harmonization to Enable Studies’ Intercomparison. Metabolites. 2023; 13(12):1167. https://doi.org/10.3390/metabo13121167. Briefly, all ids are converted to a unique one, in this case the PubChem ID and then checked for duplicates. It also allows to retrieve multiple descriptors such as Molecular Formula, Molecular Weight or InChIKey.
Amanida
can be installed from CRAN
repository, by:
install.packages("amanida")
Then loaded by:
library(amanida)
Here we show how to conduct a meta-analysis using Amanida approach. We are using a dataset obtained in systematic review and meta-analysis: Comprehensive Volatilome and Metabolome Signatures of Colorectal Cancer in Urine: A Systematic Review and Meta-Analysis Mallafré et al. Cancers 2021, 13(11), 2534; https://doi.org/10.3390/cancers13112534.
Before using Amanida
is important the curation of the identifiers. The objective of a meta-analysis is to combine the different results of a same feature, therefore, it is important that all identifiers are in the same format.
Dataset to analyse must include the following columns: identifier, p-value, fold-change, study size (N) and reference.
Once we have the dataset in a supported format (xls/xlsx, csv or txt), we import as follows:
coln = c("Compound Name", "P-value", "Fold-change", "N total", "References") input_file <- getsampleDB() datafile <- amanida_read(input_file, mode = "quan", coln, separator=";")
It creates a tibble which contain the data needed to perform the amanida meta-analysis. Note: missing data is ignored and negative values of fold-change are transformed to positive (1/value).
Before the meta-analysis the IDs can be checked using public databases information. The IDs in format chemical name, InChI, InChIKey, and SMILES are searched in PubChem to transform all into a common nomenclature, PubChem ID. Then duplicates are searched.
datafile <- check_names(datafile)
In this step will be performed the combination of overall results weightening by study size. Methods for combine results are:
P-value: weighted p-values combination, which is a variant of Fisher’s method. A gamma distribution is used to assign non-integral weights proportional to study size to each p-value.
Fold-change: logarithmic transformation (base 2) and then averaged with weighting by study size.
Note: Only with the object created by amanida_read
function in mode = "quan"
are able to compute the amanida meta-analysis.
It is also included a qualitative analysis, the vote-counting, that is computed by the sum of votes assigned as follows: votes are +1 for up-regulation, -1 for down-regulation and 0 if no trend.
amanida_result <- compute_amanida(datafile, comp.inf = F)
If you select the option comp.inf = T
the package will retrieve the PubChem ID from the ID using webchem
. Results will return the following information: PubChem ID, Molecular Formula, Molecular Weight, SMILES, InChIKey, KEGG, ChEBI, HMDB, Drugbank.
Combination results are disclosed in two tables
amanida_result@stat
amanida_results@vote
Amanida meta-analysis gives the combination for p-value and fold-change for each feature.
Trend variable correspond to the direrction of the combination, 1 means that the feature is increased in case group and -1 the opposite. N_total is the sum of the study sizes for that feature, which are disclosed in reference variable.
head(amanida_result@stat)
To observe the results of meta-analysis graphically is done with a volcano plot, where the log10(p-values) are plotted against the log2(fold-change). The cut-offs can be selected by the user, in case of fold-change we recommend values higher than 2, where it is considered to have biological meaningfulness.
Note: cut-off have to be indicated without log scale.
volcano_plot(amanida_result, cutoff = c(0.05,3.5))
We will consider that a feature have statistical significance if it is over the cut-offs threshold (dashed lines).
Optionally we van observe the qualitative results. A bar plot shows the result of vote-counting.
Note: output is restricted to 30 compounds to facilitate the readability. To visualize less compounds increase counts
parameter.
vote_plot(amanida_result, counts = 1)
Note: The output include compound identifiers from public repositories such as PubChem or the Human Metabolite DataBase, if compounds are not used this could be avoided with comp.inf = F
.
With vote plot discrepancies in compounds behaviour are not detected at first glance, and we suggest to combine the results with the explore plot. Discrepancies in results are shown when type = "mix"
is used:
Note: output is restricted to 25 compounds to facilitate the readability. To visualize less compounds increase counts
parameter.
explore_plot(sample_data, type = "mix", counts = 1)
In sample data we observe some compounds over the cut-off threshold, then we check the consistency of result. In this case, compounds with more than one report and statistical significance are Hippuric acid and Phenol. Both of them have consistency as all reports results are in the same trend. We can conclude that for Hippuric acid and Phenol have evidences to reject the null hypothesis.
Note: compounds with both up-regulated and down-regulated tendencies points to low consistency.
All computations and functions discussed above can be obtained straightforward in an html document. From scratch amanida_report
computes the meta-analysis and display the results graphically.
It needs the following parameters:
column_id = c("Compound Name", "P-value", "Fold-change", "N total", "References") input_file <- getsampleDB() amanida_report(input_file, separator = ";", analysis_type = "quan-qual", column_id = column_id, pvalue_cutoff = 0.05, fc_cutoff = 4, votecount_lim = 2, comp_inf = F)
The report will be saved in the working directory.
sessionInfo()
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.