
Introduction

This vignette is a basic walkthrough of the isoSCAN package. The package is designed to automatically extract the abundances of the isotopologues of a targeted list of compounds. It can do so in both low- and high-resolution data, though the input requirements differ with the resolution.

Compound List

Targeted compound list format

This package requires a specific format for the targeted compound list, which is passed to the formulaTable argument of autoQ. The file can be created in Excel or similar software and then imported into R via read.csv. formulaTable must contain the following column names, in no specific order:

Other isoSCAN functionalities

The rest of the functions are used for processing the raw data for either quantification or plotting. The package currently contains the following functions.

Creating the formulaTable

Before starting with file processing, we need to load the targeted compounds as a formulaTable data frame. This can be done with either the read.table or the read.csv function. Make sure that the file contains the columns listed in the section above.
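A quick check after import can catch a missing column before any file is processed. The sketch below is illustrative only: the CSV content is made up, and required_cols holds just the two column names that appear in this vignette's examples (CompoundName and Instrument), so it should be extended to the full list required by autoQ.

```r
# Hypothetical compound list written to a temporary CSV file.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("CompoundName,Instrument",
             "Gly,Orbitrap",
             "Ala,Quadrupole"),
           csv_path)

formulaTable <- read.csv(csv_path, stringsAsFactors = FALSE)

# Assumption: extend this vector with every column autoQ requires.
required_cols <- c("CompoundName", "Instrument")
missing_cols <- setdiff(required_cols, colnames(formulaTable))
if (length(missing_cols) > 0) {
  stop("formulaTable is missing columns: ",
       paste(missing_cols, collapse = ", "))
}

# Keep only the rows for one instrument type.
formulaTable_orbi <- formulaTable[formulaTable$Instrument == "Orbitrap", ]
formulaTable_orbi
```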

The package includes example tables for both low- and high-resolution data:

library(isoSCAN)
data("formulaTables")

# Low-Resolution (e.g. nominal mass accuracy)
formulaTable_lowres <- formulaTables[which(formulaTables$Instrument=="Quadrupole"),]

formulaTable_lowres

# High-Resolution (Orbitrap, or qTOF)
formulaTable_orbi <- formulaTables[which(formulaTables$Instrument=="Orbitrap"),]

formulaTable_orbi

Creating and loading mz(X)ML files

The first step is file format conversion. isoSCAN uses the mzR package to read MS files, so the raw data must be converted from the vendor format into mz(X)ML format using ProteoWizard MSConvert (or a similar tool). One MSConvert parameter must be set according to the resolution of the data:

For low-resolution data, keep the data in profile format (e.g. peakPicking=False in MSConvert). This is essential for peak quantification.

For high-resolution data, use centroiding (e.g. peakPicking=True in MSConvert).

Then, we need to locate the folder in which these files are found and list them in a vector.

setwd("./mydatafolder")
# The trailing "$" restricts matches to names that end in .mzML or .mzXML
SampleFiles <- list.files(pattern="\\.mz(X)?ML$")
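The file-name pattern can be tried on a few names before pointing it at a real folder. The names below are hypothetical; the sketch compares the pattern with and without an end-of-string anchor, which decides whether a name such as notes.mzML.txt is picked up.

```r
# Hypothetical file names, for illustration only.
candidates <- c("sample_A.mzML", "sample_B.mzXML", "sample_C.raw",
                "notes.mzML.txt")

unanchored <- "\\.mz(X)?ML"    # matches ".mzML" or ".mzXML" anywhere in the name
anchored   <- "\\.mz(X)?ML$"   # matches only at the end of the name

match_unanchored <- grepl(unanchored, candidates)
match_anchored   <- grepl(anchored, candidates)

match_unanchored  # TRUE TRUE FALSE TRUE
match_anchored    # TRUE TRUE FALSE FALSE
```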

This package also includes sample mzML data files to be used for testing:

# Low-resolution files
SampleFiles_lowres <- list.files(system.file("extdata", package = "isoSCAN"),
                                 full.names = TRUE, pattern = "lowres")

# High-resolution files
SampleFiles_orbi <- list.files(system.file("extdata", package = "isoSCAN"),
                               full.names = TRUE, pattern = "orbi")

Processing files

autoQ function

Now we can call the autoQ function, which processes the files and looks for the isotopologues of each compound in the formulaTable. Additional parameters must also be given, as described in help(autoQ). These parameters refer to peak width and the number of scans recorded, together with signal-to-noise ratio and mass error.
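To get a feel for the mass error settings, the sketch below converts a tolerance into an m/z window. It assumes that mzerror (used in the low-resolution call) is an absolute tolerance in m/z units and that maxppm (used in the high-resolution call) is a relative tolerance in parts per million, as their names and example values suggest; help(autoQ) is the authority on the exact definitions. Both helper functions are hypothetical and not part of isoSCAN.

```r
# Window for an absolute tolerance in m/z units.
mz_window_abs <- function(mz, mzerror) {
  c(lower = mz - mzerror, upper = mz + mzerror)
}

# Window for a relative tolerance in parts per million.
mz_window_ppm <- function(mz, ppm) {
  delta <- mz * ppm / 1e6
  c(lower = mz - delta, upper = mz + delta)
}

win_abs <- mz_window_abs(200, 0.1)  # 199.9 to 200.1
win_ppm <- mz_window_ppm(200, 5)    # 199.999 to 200.001
win_abs
win_ppm
```

At m/z 200, a 5 ppm tolerance is 100 times narrower than a 0.1 absolute tolerance, which is why the two kinds of data use different settings.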

Low-resolution data

For low-resolution data, remember to use files in profile format, as this eases peak finding.

# Low-Resolution
integrations <- autoQ(SampleFiles=SampleFiles_lowres,
                      formulaTable=formulaTable_lowres,
                      resolution = 1,  # low resolution
                      minscans = 6,
                      SNR = 3,
                      mzerror = 0.1,
                      RTwin = 5,
                      maxwidth = 4,
                      minwidth = 1,
                      massdiff = 1.003355)
head(integrations)
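The massdiff value in the call above, 1.003355, is the mass difference between 13C and 12C. As a rough sketch of what it implies, and assuming isotopologue m/z values are spaced by massdiff from the unlabelled ion (how autoQ uses the argument internally is not shown here), the expected series for a hypothetical singly charged compound can be computed as follows. base_mz and n_atoms are made-up values.

```r
massdiff <- 1.003355   # 13C - 12C mass difference, as in the call above
base_mz  <- 174        # hypothetical m/z of the unlabelled ion (M+0)
n_atoms  <- 3          # hypothetical number of carbons that can be labelled

# One m/z per isotopologue, from M+0 to M+n_atoms.
isotopologue_mz <- base_mz + (0:n_atoms) * massdiff
names(isotopologue_mz) <- paste0("M+", 0:n_atoms)
isotopologue_mz
```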

High-resolution data

Higher resolution helps to distinguish targeted compounds from other ions, though it also increases the complexity of the data. To handle this, isoSCAN makes use of the enviPat package. With enviPat it is possible to predict the isotopologue envelope of a formula at a given resolution, which yields accurate estimates of the isotopologue m/z values and indicates whether they can be resolved by the mass analyzer.

It is essential that the formulas in the formulaTable match the derivatized compounds and that the isotopes object contains all the isotopes in the format shown in the following two rows:

data(isotopes, package="enviPat")
isotopes[isotopes$isotope=="13C",] # both rows required

# High-Resolution
integrations <- autoQ(SampleFiles=SampleFiles_orbi,
                      formulaTable=formulaTable_orbi,
                      resolution = c(6e4,200),  # orbi resolution parameters
                      minscans = 6,
                      SNR = 5,
                      maxppm=5,
                      RTwin = 5,
                      maxwidth = 4,
                      minwidth = 1,
                      isotopes=isotopes, 
                      labelatom="13C")
head(integrations)

This processes each file independently, looking for "good-shape" peaks and obtaining both the area and the intensity of the most intense scan (maxo) for each isotopologue. If the area cannot be calculated (due to noise or peak shape), only the maxo is returned.
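The difference between the two values can be illustrated on a simulated peak. This is a generic sketch, not isoSCAN's own integration routine: the peak is an idealized Gaussian and the area is taken with the trapezoidal rule, which is an assumption made for illustration.

```r
# Simulated chromatographic peak: retention time and intensity per scan.
rt <- seq(0, 10, by = 0.5)
intensity <- 1000 * exp(-(rt - 5)^2 / 2)

# maxo: intensity of the most intense scan.
maxo <- max(intensity)

# Area: trapezoidal rule over consecutive scans.
area <- sum(diff(rt) * (head(intensity, -1) + tail(intensity, -1)) / 2)

maxo
area
```

The maxo depends on a single scan, so it is still available when the peak shape is too poor for a reliable area.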

Once finished, we can plot the results or transform the values for export.

Other utilities

Plotting results and value transformation

The metBarPlot function is designed to plot values in a barplot including standard deviation error bars. The arguments for value are the following:

mygroups <- isoSCAN:::rmfileExt(SampleFiles_orbi,"\\.mzML")
mygroups <- gsub(".*_","",mygroups)
mygroups
metBarPlot(autoQres=integrations, groups = mygroups, val.to.plot="area")
# Example of a single compound
integrations <- sumIsotopologues(integrations) #only for high-res
metBarPlot(autoQres=integrations[which(integrations$CompoundName=="Gly"),],
           groups = mygroups, val.to.plot="area")
# If you want to save the plots to a file:
metBarPlot(autoQres=integrations, groups = mygroups, val.to.plot="area",
           topdf="./metBarPlot_results.pdf",height=10,width=18,pointsize=16)
# Modify height, width and pointsize to fit your output size
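For readers who want to see what a bar plot with standard deviation error bars involves, here is a minimal base R sketch. The areas and group labels are made up, and the code is not the implementation of metBarPlot; it only shows the group mean and standard deviation behind each bar.

```r
# Hypothetical areas for one isotopologue, three replicates per group.
areas  <- c(100, 110, 90, 200, 220, 180)
groups <- c("ctrl", "ctrl", "ctrl", "treated", "treated", "treated")

group_mean <- tapply(areas, groups, mean)
group_sd   <- tapply(areas, groups, sd)

# barplot() returns the x positions of the bar midpoints.
bar_x <- barplot(group_mean,
                 ylim = c(0, max(group_mean + group_sd) * 1.1),
                 ylab = "area")

# Flat-headed arrows drawn in both directions act as error bars.
arrows(bar_x, group_mean - group_sd, bar_x, group_mean + group_sd,
       angle = 90, code = 3, length = 0.05)
```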

metBarPlot can also digest the data frame produced by QTransform. Read the function help for further information.

trans_integ <- QTransform(integrations,val.to.use = "area",val.trans = "P")
# Subset the transformed table by its own CompoundName column
metBarPlot(autoQres=trans_integ[which(trans_integ$CompoundName=="Gly"),],
           groups = mygroups, val.to.plot="area")
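A common transformation for isotopologue data is to express each isotopologue as a percentage of the compound's total signal. The sketch below shows that calculation on made-up areas. Whether val.trans = "P" performs exactly this calculation is an assumption here and should be checked in help(QTransform).

```r
# Hypothetical isotopologue areas for one compound in one sample.
areas <- c("M+0" = 600, "M+1" = 300, "M+2" = 100)

# Percentage of the total signal carried by each isotopologue.
percent <- 100 * areas / sum(areas)
percent  # 60, 30, 10
```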

Raw data plotting

The rawPlot and meanRawPlot functions should be used for quality control purposes. They are useful to check for shifting peaks, noisy regions or saturated peaks. The first function draws a "spectra heatmap", with m/z on the x-axis and retention time on the y-axis, and with points coloured according to the scan intensity value.

# Example plot from first file and first compound
rawPlot(SampleFiles=SampleFiles_orbi[1],
        formulaTable=formulaTable_orbi[1,],RTwin=5)
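The idea of a "spectra heatmap" can be reproduced with base graphics on simulated data. Everything in this sketch is hypothetical (the m/z values, the retention times in arbitrary units, the intensities and the colour ramp); it mimics the layout described above rather than the exact output of rawPlot.

```r
# Simulated scans: 5 m/z values at each of 30 retention times.
scans <- expand.grid(mz = 174 + (0:4) * 1.003355,
                     rt = seq(295, 305, length.out = 30))

# A Gaussian peak centred at rt = 300, weaker for heavier isotopologues.
scans$intensity <- 1000 * exp(-(scans$rt - 300)^2 / 8) /
  (1 + (scans$mz - 174))

# Map each intensity to one of 50 colours.
pal <- colorRampPalette(c("grey90", "blue", "red"))(50)
col_idx <- cut(scans$intensity, breaks = 50, labels = FALSE)

plot(scans$mz, scans$rt, col = pal[col_idx], pch = 15,
     xlab = "m/z", ylab = "Retention time")
```

In a plot like this, a peak that drifts in retention time between files or a saturated, flat-topped region is easy to spot by eye.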

While the rawPlot function will run through each file and plot the raw spectra for all compounds in a single pdf file, meanRawPlot will calculate the average spectra for all files and generate a single plot for each compound.

# Example plot for the first two compounds, averaged over files 1 and 3
meanRawPlot(SampleFiles=SampleFiles_lowres[c(1,3)],
            formulaTable=formulaTable_lowres[1:2,], 
            RTwin=5)
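Averaging spectra across files amounts to taking, for each m/z position, the mean intensity over all files. The sketch below assumes the files already share a common m/z grid, which is a simplification; how meanRawPlot aligns and bins real data is not covered here, and the intensities are made up.

```r
# Hypothetical intensities on a shared m/z grid, one vector per file.
spectra <- list(
  fileA = c(10, 50, 20),
  fileB = c(30, 70, 40)
)

# Element-wise sum over files, divided by the number of files.
mean_spectrum <- Reduce(`+`, spectra) / length(spectra)
mean_spectrum  # 20 60 30
```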

Saving plots into PDF documents

All plotting functions accept the topdf argument, which saves the plots to PDF files instead of showing them in the default R plotting device. For meanRawPlot or metBarPlot, give the name of the desired output PDF file (topdf="/plot_folder/mean_raw_spectra.pdf"). For rawPlot, give a folder instead (topdf="C:/.../plot_folder"); all generated plots will be saved into plot_folder, with each PDF file named after its sample.
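For comparison, this is how a plot is written to a PDF file with base R alone. It is a generic sketch using a temporary file and a made-up plot; whether the isoSCAN functions use the same device calls internally is an assumption.

```r
# Write to a temporary location so the example leaves no files behind.
out_file <- file.path(tempdir(), "example_plot.pdf")

# Open the PDF device; width and height are in inches.
pdf(out_file, width = 7, height = 5, pointsize = 12)
plot(1:10, (1:10)^2, type = "b", xlab = "x", ylab = "x squared")
dev.off()  # closing the device finalizes the file

file.exists(out_file)
```

The width, height and pointsize arguments here play the same role as the ones passed to metBarPlot in the earlier example.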



jcapelladesto/isoSCAN documentation built on Jan. 30, 2022, 8:36 p.m.