suppressPackageStartupMessages(library("synapter"))
suppressPackageStartupMessages(library("synapterdata"))
suppressPackageStartupMessages(library("BiocStyle"))
synobj2RData()
Here we describe the new functionality implemented in synapter 2.0. This vignette covers the new 3D grid search, the fragment matching, the intensity modeling and the correction of detector saturation.
The synapter2 workflow is similar to the old one in synapter1.
First, PLGS must be used to create the csv (and xml) files.
For details we refer the reader to the default
r Biocpkg("synapter") vignette,
vignette("synapter", package = "synapter").
In contrast to the original workflow, the
final_fragment.csv file for the
identification run and a
Spectrum.xml file for the quantification run are
needed if fragment matching is to be applied.
Subsequently the original workflow is enhanced by the new
3D grid search and the intensity modeling.
Afterwards the fragment matching can be applied.
r Biocpkg("MSnbase") [@Gatto2012] is used for further analysis.
r Biocpkg("synapter") adds
synapter/PLGS consensus filtering
and the detector saturation correction for
To demonstrate a typical step-by-step workflow we use example data
that are available on http://proteome.sysbiol.cam.ac.uk/lgatto/synapter/data/.
There is also a
synobj2 object in
r Biocexptpkg("synapterdata") which
contains the same data.
The Synapter constructor uses a named
list of input files. Please note that
Spectrum.xml) because we want to apply the fragment matching
cat(readLines(system.file(file.path("scripts", "create_synobj2.R"), package="synapterdata"), n=13), sep="\n")
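As a rough illustration of such a named list, the sketch below builds one with placeholder file names. The element names (identpeptide, identfragments, quantpeptide, quantpep3d, quantspectra, fasta) are the ones we understand the Synapter constructor to expect; please check ?Synapter for your installed version. The file names are hypothetical, and the constructor call is left commented out because it needs real PLGS output.

```r
# Hypothetical file names; replace them with your own PLGS output.
inlist <- list(
  identpeptide   = "ident_IA_final_peptide.csv",
  identfragments = "ident_IA_final_fragment.csv",  # needed for fragment matching
  quantpeptide   = "quant_IA_final_peptide.csv",
  quantpep3d     = "quant_Pep3DAMRT.csv",
  quantspectra   = "quant_Pep3D_Spectrum.xml",     # needed for fragment matching
  fasta          = "database.fasta"
)

# Check which of the listed files are absent before building the object.
absent <- names(inlist)[!file.exists(unlist(inlist))]
if (length(absent) > 0) {
  message("Missing input files: ", paste(absent, collapse = ", "))
} else {
  # Assumed call; requires the synapter package and real input files.
  # synobj2 <- Synapter(inlist, master = FALSE)
}
```

Checking for absent files first gives a clearer message than a failure deep inside the import code.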
The first steps in each
r Biocpkg("synapter") analysis are filtering by
peptide sequence, peptide length, ppm error and false positive rate.
Here we use the default values for each method, but the accompanying plotting methods should be used to find the best thresholds:
filterUniqueDbPeptides(synobj2, missedCleavages=0, IisL=TRUE)
filterPeptideLength(synobj2, l=7)
plotFdr(synobj2)
filterQuantPepScore(synobj2, method="BH", fdr=0.05)
filterIdentPepScore(synobj2, method="BH", fdr=0.05)
par(mfcol=c(1, 2))
plotPpmError(synobj2, what="Ident")
plotPpmError(synobj2, what="Quant")
par(mfcol=c(1, 1))
filterQuantPpmError(synobj2, ppm=20)
filterIdentPpmError(synobj2, ppm=20)
plotPepScores(synobj2)
filterIdentProtFpr(synobj2, fpr=0.05)
filterQuantProtFpr(synobj2, fpr=0.05)
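To make the effect of the method="BH" and fdr=0.05 arguments more concrete, here is a minimal base R sketch of Benjamini-Hochberg filtering on simulated p-values. It is not synapter's internal code; the simulated p-values and the variable names are assumptions made for illustration only.

```r
set.seed(7)
# Simulated p-values: 900 from the null and 100 from true identifications.
pvals <- c(runif(900), rbeta(100, 1, 200))

# Benjamini-Hochberg adjustment, then keep entries at or below the threshold.
adj <- p.adjust(pvals, method = "BH")
keep <- adj <= 0.05

# Number of entries that pass the filter.
sum(keep)
```

Adjusted values are never smaller than the raw p-values, so this filter is always at least as strict as thresholding the raw p-values at the same level.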
Next we merge the identified peptides from the identification run and the quantification run and build a LOWESS-based retention time model to remove systematic shifts in the retention times. Here we use the default values, but as stated above the plotting methods should be used to find sensible thresholds.
mergePeptides(synobj2)
plotRt(synobj2, what="data")
setLowessSpan(synobj2, span=0.05)
modelRt(synobj2)
par(mfcol=c(1, 2))
plotRtDiffs(synobj2)
plotRt(synobj2, what="model", nsd=1)
par(mfcol=c(1, 1))
plotFeatures(synobj2, what="all", ionmobility=TRUE)
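The idea behind the retention time model can be sketched with base R alone. The example below simulates a smooth systematic shift between two runs and removes it with stats::lowess(). It is a stand-in for illustration and not the code that modelRt runs; the simulated data and all variable names are assumptions.

```r
set.seed(1)
# Simulated identification retention times and a smooth systematic
# shift (plus noise) in the quantification run.
rt_ident <- sort(runif(200, min = 10, max = 90))
rt_quant <- rt_ident + 0.5 * sin(rt_ident / 15) + rnorm(200, sd = 0.05)
delta <- rt_quant - rt_ident

# Fit a LOWESS curve to the retention time differences.
fit <- lowess(rt_ident, delta, f = 0.05)

# Predict the shift at each retention time and subtract it.
predicted <- approx(fit$x, fit$y, xout = rt_ident, rule = 2)$y
residual <- delta - predicted

# Compare the spread before and after the correction.
c(raw = sd(delta), corrected = sd(residual))
```

A small span follows local drift closely but can overfit noise, which is why the plots of the data and of the fitted model are worth inspecting before fixing the span.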
To find EMRTs (exact m/z-retention time pairs) we run a grid search to find the retention time tolerance and m/z tolerance that result in the most correct one-to-one matches in the merged (already identified) data. If both the identification and the quantitation run are HDMS$^E$ data we can use the new 3D grid search, which looks for the best matching in the retention time, m/z and ion mobility (drift time) domains to increase the accuracy. If one or both datasets are MS$^E$ data it falls back to the traditional 2D grid search.
searchGrid(synobj2,
           imdiffs=seq(from=0.6, to=1.6, by=0.2),
           ppms=seq(from=2, to=20, by=2),
           nsds=seq(from=0.5, to=5, by=0.5))
setBestGridParams(synobj2)
findEMRTs(synobj2)
plotEMRTtable(synobj2)
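The principle of the grid search can be illustrated with a small, self-contained 2D example in base R. For every combination of m/z tolerance (ppm) and retention time tolerance (number of standard deviations) it counts how many identified peptides match exactly one quantification feature, and whether that feature is the correct one. This is a simplified sketch on simulated data, not the implementation behind searchGrid; the data, rt_sd and count_unique are hypothetical.

```r
set.seed(42)
n <- 100
# Simulated identified peptides (m/z and retention time) ...
ident <- data.frame(mz = runif(n, 400, 1200), rt = runif(n, 10, 90))
# ... and quantification features: the same peptides with small errors
# (rows 1 to n), followed by unrelated background features.
quant <- rbind(
  data.frame(mz = ident$mz * (1 + rnorm(n, sd = 5e-6)),
             rt = ident$rt + rnorm(n, sd = 0.2)),
  data.frame(mz = runif(500, 400, 1200), rt = runif(500, 10, 90))
)
rt_sd <- 0.2  # assumed standard deviation of the retention time model

# Count identified peptides with exactly one feature inside the tolerance
# window, where that single feature is the true counterpart.
count_unique <- function(ppm, nsd) {
  hits <- vapply(seq_len(n), function(i) {
    ok <- abs(quant$mz - ident$mz[i]) / ident$mz[i] * 1e6 <= ppm &
      abs(quant$rt - ident$rt[i]) <= nsd * rt_sd
    sum(ok) == 1 && which(ok) == i
  }, logical(1))
  sum(hits)
}

# Evaluate every combination of tolerances and pick the best one.
grid <- expand.grid(ppm = seq(2, 20, by = 2), nsd = seq(0.5, 5, by = 0.5))
grid$correct <- mapply(count_unique, grid$ppm, grid$nsd)
grid[which.max(grid$correct), ]
```

The 3D grid search adds a third tolerance for the ion mobility difference, which in this sketch would amount to one more column in the data, one more condition in ok and one more dimension in expand.grid.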
For the details of the fragment matching procedure we refer to the
fragment matching vignette that is available
vignette("fragmentmatching", package = "synapter").
Briefly, we compare the fragments of the identification run with the spectra from
the quantification run and remove entries that share very few or no
peaks/fragments.
First we remove low-intensity fragments and peaks.
filterFragments(synobj2, what="fragments.ident", minIntensity=70)
filterFragments(synobj2, what="spectra.quant", minIntensity=70)
Next we look for common peaks via
We get tables for unique and non-unique matches:
knitr::kable(fragmentMatchingPerformance(synobj2, what="unique"))
knitr::kable(fragmentMatchingPerformance(synobj2, what="non-unique"))
Subsequently we can filter by the minimum number of accepted common peaks:
filterUniqueMatches(synobj2, minNumber=1)
filterNonUniqueMatches(synobj2, minDelta=2)
filterNonUniqueIdentMatches(synobj2)
Finally we rescue EMRTs that are filtered but were identified by PLGS:
In a similar manner to the correction of the retention time drift, we correct
systematic errors of the intensity via a
modelIntensity has to be applied after
findEMRTs. The model is
built on the merged peptides, as is done for the retention time model. But in
contrast to the retention time model the prediction is necessary for the matched
plotIntensity(synobj2, what="data")
setLowessSpan(synobj2, 0.05)
modelIntensity(synobj2)
plotIntensity(synobj2, what="model", nsd=1)
The whole step-by-step workflow described above is
wrapped in the
synergise2 function. As a side effect it generates an
HTML report. An example can be found at https://github.com/lgatto/synapter.
synobj2 <- synergise2(filenames = inlist, outputdir = ".")
For the next steps we need to convert the
Synapter object into an MSnSet:
msn <- as(synobj2, "MSnSet")
Subsequently we look for synapter/PLGS agreement (this is more
useful for a combined
MSnSet; see basic
vignette("synapter", package = "synapter")).
synapterPlgsAgreement adds an agreement column for each sample and counts the
agreement/disagreement in additional columns:
msn <- synapterPlgsAgreement(msn)
knitr::kable(head(fData(msn)[, grepl("[Aa]gree", fvarLabels(msn)), drop=FALSE]))
As described in [@Shliaha2013] Synapt G2 devices suffer from detector
saturation. This can be partly corrected by
requantify. For this a
saturationThreshold has to be given, above which intensity saturation
potentially happens. There are several methods available.
msncor <- requantify(msn, saturationThreshold=1e5, method="sum")
MSnSet object was requantified using the
requantification method TOP3 normalisation is not valid anymore
because the most abundant proteins are penalised by removing high intensity
isotopes (for details see
?rescaleForTop3). This could be
overcome by calling
msncor <- rescaleForTop3(before=msn, after=msncor, saturationThreshold=1e5)
r Biocpkg("synapter") 2.0
makeMaster supports fragment files as well.
It is possible to create a fragment library that can be used for
fragment matching. Because of the size of the data this cannot be
covered in this vignette. An introduction to creating a master can be
found in the basic
r Biocpkg("synapter") vignette, available
vignette("synapter", package = "synapter"). Please find details
about creating a fragment library in
All software and respective versions used to produce this document are listed below.