library(Risa) library(xcms)
Tomato
An ISAtab archive will contain the metadata description in
several tab-separated files. (One of) the assay files contains the column Raw Spectral Data File
with the paths to the mass spectral raw data files in one of the above formats.
ISAmtbls297 <- readISAtab(find.package("mtbls297")) assay <- ISAmtbls297@assay.tabs[[1]] msfiles <- paste(find.package("mtbls297"), "mzML", assay@assay.file$"Derived Spectral Data File", sep="/")
With the combination of Risa and xcms, we can process the MS raw data:
cwp <- CentWaveParam(ppm = 25, peakwidth = c(10, 20), snthresh = 10, prefilter = c(3, 100), mzCenterFun = "wMean", integrate = 1L, mzdiff = -0.001, fitgauss = FALSE, noise = 0, verboseColumns = FALSE, roiList = list(), firstBaselineCheck = TRUE, roiScales = numeric()) raw_data <- readMSData(msfiles, mode = "onDisk") ## Perform the peak detection using the settings defined above. mtbls297 <- findChromPeaks(raw_data, param = cwp, BPPARAM = MulticoreParam())
The result is the same type of xcmsSet object:
x2 <- findChromPeaksIsolationWindow(mtbls297, param = cwp, BPPARAM = MulticoreParam()) cpd <- chromPeakData(x2)
Several options exist to quantify the individual intensities. For each feature, additional attributes are available, such as the minimum/maximum and average retention time and m/z values.
In the following steps, we perform a grouping: because the UPLC system used here has very stable retention times, we just use the retention time correction step as quality control of the raw data. After that, 'fillPeaks()' will integrate the raw data for those features, which were not detected in some of the samples.
mtbls2Set <- group(mtbls2Set, minfrac=1, bw=3) retcor(mtbls2Set, plottype="mdevden")
A first QC step is the visual inspection of intensities across the samples. Alternatively to a boxplot, one could also create histograms/density plots.
boxplot(groupval(mtbls2Set, value="into")+1, col=as.numeric(sampclass(mtbls2Set))+1, log="y", las=2)
After grouping, peaks might be missing/not found in some samples.
fillPekas()
will impute them, using the consensus mz and RT
from the other samples.
mtbls2Set <- fillPeaks(mtbls2Set, nSlaves=nSlaves)
The final xcmsSet represents a rectangular matrix of mass spectral features, which were detected (or imputed) across the samples. The dimensionality is M * N, where M denotes the number of samples in the assay, and N the number of features grouped across the samples.
QC of mass accuracy and retention time consistency
plotQC(mtbls2Set)
In addition to the boxplot for QC, we can also check a hierarchical clustering and the PCA of the samples.
sdThresh <- 4.0 ## Filter low-standard deviation rows for plot data <- log(groupval(mtbls2Set, value="into")+1) pca.result <- pca(data, nPcs=3) plotPcs(pca.result, type="loadings", col=as.numeric(sampclass(mtbls2Set))+1)
an <- xsAnnotate(mtbls2Set, sample=seq(1,length(sampnames(mtbls2Set))), nSlaves=nSlaves) an <- groupFWHM(an) an <- findIsotopes(an) # optional but recommended. an <- groupCorr(an, graphMethod="lpc", calcIso = TRUE, calcCiS = TRUE, calcCaS = TRUE, cor_eic_th=0.5) ## Setup ruleSet rs <- new("ruleSet") rs@ionlistfile <- file.path(find.package("mtbls2"), "lists","ions.csv") rs@neutraladditionfile <- file.path(find.package("mtbls2"), "lists","neutraladdition.csv") rs@neutrallossfile <- file.path(find.package("mtbls2"), "lists","neutralloss.csv") rs <- readLists(rs) rs <- setDefaultParams(rs) rs <- generateRules(rs) an <- findAdducts(an, rules=rs@rules, polarity="positive")
dr <- diffreport(mtbls2Set, sortpval=FALSE, filebase="mtbls2diffreport", eicmax=20 ) cspl <- getPeaklist(an) annotatedDiffreport <- cbind(dr, cspl)
interestingPspec <- tapply(seq(1, nrow(annotatedDiffreport)), INDEX=annotatedDiffreport[,"pcgroup"], FUN=function(x, a) {m <- median(annotatedDiffreport[x, "pvalue"]); p <- max(annotatedDiffreport[x, "pcgroup"]); as.numeric(c(pvalue=m,pcgroup=p))}, annotatedDiffreport) interestingPspec <- do.call(rbind, interestingPspec) colnames(interestingPspec) <- c("pvalue", "pcgroup") o <- order(interestingPspec[,"pvalue"]) pdf("interestingPspec.pdf") dummy <- lapply(interestingPspec[o[1:40], "pcgroup"], function(x) {suppressWarnings(plotPsSpectrum(an, pspec=x, maxlabel=5))}) dev.off()
These attributes and the intensity matrix could already be exported to conform to the specification for the ``metabolite assignment file'' in the mzTab format used in MetaboLights. Currently, this functionality is not included in xcms. A prototype snippet is the following:
``` {r assembleMAF}
pl <- annotatedDiffreport
charge <- sapply(an@isotopes, function(x) { ifelse( length(x) > 0, x$charge, NA) }) abundance <- groupval(an@xcmsSet, value="into")
a.samples <- ISAmtbls2["samples.per.assay.filename"][[ a.filename ]]
maf.std.colnames <- c("identifier", "chemical_formula", "description", "mass_to_charge", "fragmentation", "charge", "retention_time", "taxid", "species", "database", "database_version", "reliability", "uri", "search_engine", "search_engine_score", "modifications", "smallmolecule_abundance_sub", "smallmolecule_abundance_stdev_sub", "smallmolecule_abundance_std_error_sub")
all.colnames <- c(maf.std.colnames, a.samples)
l <- nrow(pl)
maf <- data.frame(identifier = character(l), chemical_formula = character(l), description = character(l), mass_to_charge = pl$mz, fragmentation = character(l), charge = charge, retention_time = pl$rt, taxid = character(l), species = character(l), database = character(l), database_version = character(l), reliability = character(l), uri = character(l), search_engine = character(l), search_engine_score = character(l), modifications = character(l), smallmolecule_abundance_sub = character(l), smallmolecule_abundance_stdev_sub = character(l), smallmolecule_abundance_std_error_sub = character(l), abundance, stringsAsFactors=FALSE)
```r ## ## Make sure maf table is quoted properly, ## and add to the ISAmtbls2 assay file. ## maf_character <- apply(maf, 2, as.character) write.table(maf_character, file=paste(tempdir(), "/a_mtbl2_metabolite profiling_mass spectrometry_maf.csv", sep=""), row.names=FALSE, col.names=all.colnames, quote=TRUE, sep="\t", na="\"\"") ISAmtbls2 <- updateAssayMetadata(ISAmtbls2, a.filename, "Metabolite Assignment File", paste(tempdir(), "/a_mtbl2_metabolite profiling_mass spectrometry_maf.csv", sep="")) write.assay.file(ISAmtbls2, a.filename)
sessionInfo()
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.