muffleError <- function(x, options) {}
knitr::knit_hooks$set(error = muffleError)
knitr::opts_chunk$set(collapse = TRUE, purl = FALSE)
This vignette is a basic walkthrough of the functionality of the isoSCAN package. The package automatically extracts the abundances of isotopologues for a targeted list of compounds. It works with both low- and high-resolution data, although the input requirements differ with the resolution.
The package requires the targeted compound list in a specific format, which is passed to the formulaTable argument of autoQ. The file can be created in Excel or similar software and then imported into R with read.csv. formulaTable must contain the following columns, in any order:
CompoundName: the name of the compound or metabolite to be quantified.
mz: the m/z of the monoisotopic ion.
RT: the retention time, in seconds.
Formula: the formula of the compound. NOTE for high-resolution data: this must be the formula of the derivatized compound, including any derivatization modifications.
NumAtoms: the number of atoms considered for labelling, which determines the number of isotopologues to be quantified.
Other columns will be ignored
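As a minimal sketch of the format described above, the following builds a small compound list, round-trips it through a CSV file with read.csv, and checks that the required columns are present. The compound names, m/z values, retention times and formulas are hypothetical placeholders, and the column check is an illustration in base R, not a function provided by isoSCAN.

```r
# Hypothetical compound list; all values are illustrative placeholders.
formulaTable <- data.frame(
  CompoundName = c("CompoundA", "CompoundB"),
  mz           = c(174.1, 246.1),                   # monoisotopic ion m/z
  RT           = c(420, 515),                       # retention time in seconds
  Formula      = c("C7H16NO2Si", "C9H20NO3Si2"),    # made-up formulas
  NumAtoms     = c(2, 3),
  Notes        = c("extra columns", "are allowed"), # not a required column
  stringsAsFactors = FALSE
)

# Round-trip through a CSV file, as one would after exporting from Excel.
csv_path <- tempfile(fileext = ".csv")
write.csv(formulaTable, csv_path, row.names = FALSE)
formulaTable <- read.csv(csv_path, stringsAsFactors = FALSE)

# Check the required columns before passing the table on.
required_cols <- c("CompoundName", "mz", "RT", "Formula", "NumAtoms")
missing_cols  <- setdiff(required_cols, colnames(formulaTable))
if (length(missing_cols) > 0) {
  stop("formulaTable is missing: ", paste(missing_cols, collapse = ", "))
}
```

Running a check like this right after import catches misspelled headers before any file processing starts.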
The rest of the functions are used for processing the raw data for either quantification or plotting. The package currently contains the following functions.
Quantification: autoQ
Data transformation: QTransform, simplifybyPpm, sumIsotopologues
Plotting: rawPlot, meanRawPlot, metBarPlot
Before processing any files, we need to load the targeted compounds as a formulaTable data frame. This can be done with either read.table or read.csv. Make sure that the file contains the columns listed in the section above.
The package includes example tables for both low- and high-resolution data:
library(isoSCAN)
data("formulaTables")
# Low-resolution (e.g. nominal mass accuracy)
formulaTable_lowres <- formulaTables[which(formulaTables$Instrument == "Quadrupole"), ]
formulaTable_lowres
# High-resolution (Orbitrap or qTOF)
formulaTable_orbi <- formulaTables[which(formulaTables$Instrument == "Orbitrap"), ]
formulaTable_orbi
The first step is file format conversion. isoSCAN uses the mzR package to read MS files, so the raw data must be converted from the vendor format into mz(X)ML format using ProteoWizard MSConvert (or a similar tool) before mzR can read it.
One MSConvert parameter is important and depends on the resolution of the data:
For low-resolution data, convert the files while maintaining profile format (e.g. peakPicking=False in MSConvert). This is essential for peak quantification.
For high-resolution data, use centroiding (e.g. peakPicking=True in MSConvert).
Then, we need to locate the folder in which these files are found and list them in a vector.
setwd("./mydatafolder")
SampleFiles <- list.files(pattern = "\\.mz(X)?ML")
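The pattern above matches the extension anywhere in the file name. As a hedged variation, the sketch below anchors it at the end of the name with `$` and uses full.names = TRUE so that changing the working directory is not needed. The folder and file names are hypothetical and are created in a temporary directory only for the demonstration.

```r
# Create a scratch folder with hypothetical file names to show the matching.
data_dir <- file.path(tempdir(), "mydatafolder_demo")
dir.create(data_dir, showWarnings = FALSE)
file.create(file.path(data_dir,
                      c("s1.mzML", "s2.mzXML", "s3.mzML.bak", "notes.txt")))

# "$" anchors the extension at the end of the name, so "s3.mzML.bak" is skipped;
# full.names = TRUE returns paths usable from any working directory.
SampleFiles <- list.files(data_dir, pattern = "\\.mz(X)?ML$", full.names = TRUE)
basename(SampleFiles)
```

Only the two files ending in .mzML or .mzXML are kept; the backup copy and the text file are ignored.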
This package also includes sample mzML data files to be used for testing:
# Low-resolution files
SampleFiles_lowres <- list.files(system.file("extdata", package = "isoSCAN"),
                                 full.names = TRUE, pattern = "lowres")
# High-resolution files
SampleFiles_orbi <- list.files(system.file("extdata", package = "isoSCAN"),
                               full.names = TRUE, pattern = "orbi")
Now we can call the autoQ function, which processes the files and looks for the isotopologues of each compound found in the formulaTable.
Other parameters must also be set, as described in help(autoQ). These parameters refer to the peak width and the number of scans recorded, together with the signal-to-noise ratio and the mass error.
Low-resolution data should be in profile format, as this eases peak finding.
# Low-resolution
integrations <- autoQ(SampleFiles = SampleFiles_lowres,
                      formulaTable = formulaTable_lowres,
                      resolution = 1, # low resolution
                      minscans = 6,
                      SNR = 3,
                      mzerror = 0.1,
                      RTwin = 5,
                      maxwidth = 4,
                      minwidth = 1,
                      massdiff = 1.003355)
head(integrations)
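The massdiff value used above, 1.003355, is the mass difference between 13C and 12C. As a rough illustration of what that spacing implies, and not a description of autoQ internals, the sketch below lists the m/z values expected for the isotopologues of a singly charged ion. The monoisotopic m/z and the number of atoms are hypothetical.

```r
mz_mono   <- 174.1     # hypothetical monoisotopic m/z (singly charged ion assumed)
num_atoms <- 3         # hypothetical NumAtoms value
massdiff  <- 1.003355  # 13C - 12C mass difference, as in the autoQ call

# One m/z per isotopologue, from M+0 (unlabelled) to M+num_atoms (fully labelled).
iso_mz <- mz_mono + (0:num_atoms) * massdiff
names(iso_mz) <- paste0("M+", 0:num_atoms)
round(iso_mz, 4)
```

With NumAtoms = 3 this gives four values, M+0 to M+3, spaced about 1.0034 m/z units apart.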
Higher resolution helps to distinguish targeted compounds from other ions, but it also increases the complexity of the data. To handle this, isoSCAN makes use of the enviPat package. enviPat can predict the isotopologue envelope of a formula at a given resolution, which provides accurate m/z estimates and indicates whether the isotopologues can be resolved by the mass analyzer.
It is essential that the formulas in the formulaTable match the derivatized compounds, and that the isotopes object contains the label isotope in the format shown in the following two rows:
data(isotopes, package = "enviPat")
isotopes[isotopes$isotope == "13C", ] # both rows required
# High-resolution
integrations <- autoQ(SampleFiles = SampleFiles_orbi,
                      formulaTable = formulaTable_orbi,
                      resolution = c(6e4, 200), # orbi resolution parameters
                      minscans = 6,
                      SNR = 5,
                      maxppm = 5,
                      RTwin = 5,
                      maxwidth = 4,
                      minwidth = 1,
                      isotopes = isotopes,
                      labelatom = "13C")
head(integrations)
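For a sense of scale, and assuming that maxppm is the maximum accepted mass error in parts per million, the sketch below converts a ppm tolerance into an absolute m/z window and computes the ppm error of an observed mass. The helper names ppm_window and ppm_error are hypothetical and are not part of isoSCAN.

```r
# Absolute m/z window for a given ppm tolerance (hypothetical helper).
ppm_window <- function(mz, ppm) {
  tol <- mz * ppm / 1e6
  c(lower = mz - tol, upper = mz + tol)
}

# Mass error in ppm of an observed m/z against its theoretical value.
ppm_error <- function(observed, theoretical) {
  (observed - theoretical) / theoretical * 1e6
}

ppm_window(200, 5)        # 5 ppm at m/z 200 is +/- 0.001
ppm_error(200.0008, 200)  # about 4 ppm, inside a 5 ppm limit
```

The same ppm value therefore corresponds to a wider absolute window at higher m/z.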
This processes each file independently, looking for "good-shape" peaks and obtaining both the area and the maximum-intensity scan (maxo) for each isotopologue. If the area cannot be calculated (because of noise or peak shape), only the maxo is returned.
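To make the difference between the two values concrete, here is a toy example that computes a peak area with the trapezoidal rule and the maximum intensity of the same peak. The Gaussian peak is simulated, and the trapezoidal rule is only an illustrative choice; this is not isoSCAN's integration routine.

```r
rt        <- seq(100, 110, by = 1)                       # hypothetical retention times (s)
intensity <- 1000 * exp(-((rt - 105)^2) / (2 * 1.5^2))   # simulated Gaussian peak

# maxo: intensity of the most intense scan across the peak.
maxo <- max(intensity)

# area: trapezoidal integration of intensity over retention time.
area <- sum(diff(rt) * (head(intensity, -1) + tail(intensity, -1)) / 2)

c(maxo = maxo, area = area)
```

The area depends on every scan across the peak, which is why it is more sensitive to noise and peak shape than the single-scan maxo.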
Once processing has finished, the results can be plotted or transformed for export.
The metBarPlot function plots values as a barplot with standard deviation error bars. Its main arguments are the following:
groups: Sample groups. This should be a vector with the experimental group of each sample, in the same order as list.files(pattern=".mzXML"). The vector can be created using gsub and similar functions, as in the example that follows this list.
val.to.plot: Values to use for plotting. Either "area" or "maxo".
ylabel: The text shown on the Y axis (e.g. Intensity, Area). If it is not given, the val.to.plot value is used.
mygroups <- isoSCAN:::rmfileExt(SampleFiles_orbi, "\\.mzML")
mygroups <- gsub(".*_", "", mygroups)
mygroups

metBarPlot(autoQres = integrations, groups = mygroups, val.to.plot = "area")

# Example of a single compound
integrations <- sumIsotopologues(integrations) # only for high-res
metBarPlot(autoQres = integrations[which(integrations$CompoundName == "Gly"), ],
           groups = mygroups, val.to.plot = "area")

# If you want to save the plots to a file
# (modify height, width and pointsize to fit your output size):
metBarPlot(autoQres = integrations, groups = mygroups, val.to.plot = "area",
           topdf = "./metBarPlot_results.pdf", height = 10, width = 18, pointsize = 16)
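The groups vector used in these calls is derived from the file names. The sketch below shows the same idea with base R only, on hypothetical file names in which the group is the text after the last underscore. It uses tools::file_path_sans_ext as a stand-in for the internal rmfileExt helper and assumes your own files follow a similar naming scheme.

```r
# Hypothetical file paths; the group label follows the last underscore.
files <- c("/data/run01_orbi_Control.mzML",
           "/data/run02_orbi_Control.mzML",
           "/data/run03_orbi_Treated.mzML")

# Drop the folder and the extension, then keep only the text after the last "_".
sample_names  <- tools::file_path_sans_ext(basename(files))
mygroups_demo <- gsub(".*_", "", sample_names)

mygroups_demo
table(mygroups_demo)  # number of samples per group
```

Checking the result with table() before plotting helps confirm that every sample was assigned to the intended group.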
metBarPlot can also digest the data frame produced by QTransform. Read the function help for further information.
trans_integ <- QTransform(integrations, val.to.use = "area", val.trans = "P")
metBarPlot(autoQres = trans_integ[which(integrations$CompoundName == "Gly"), ],
           groups = mygroups, val.to.plot = "area")
The rawPlot and meanRawPlot functions should be used for quality control purposes. They are useful to check for moving peaks, noisy spots or saturated peaks.
The first function prints a "spectra heatmap", with m/z on the x-axis and retention time on the y-axis, and with points coloured according to the scan intensity value.
# Example plot from first file and first compound
rawPlot(SampleFiles = SampleFiles_orbi[1], formulaTable = formulaTable_orbi[1, ], RTwin = 5)
While the rawPlot function runs through each file and plots the raw spectra for all compounds in a single PDF file, meanRawPlot calculates the average spectra across all files and generates a single plot for each compound.
# Example plot for the first two compounds; the spectra are averaged over
# the selected files (here files 1 and 3)
meanRawPlot(SampleFiles = SampleFiles_lowres[c(1, 3)],
            formulaTable = formulaTable_lowres[1:2, ], RTwin = 5)
All plotting functions have a topdf argument that saves the plots to PDF files instead of showing them in the default R plotting device. For meanRawPlot or metBarPlot, give the name of the output PDF file (topdf="/plot_folder/mean_raw_spectra.pdf"). For rawPlot, give a folder instead (topdf="C:/.../plot_folder"); all generated plots are saved into plot_folder, with the sample name used as the PDF file name.