View source: R/synapter-interface.R
| Synapter | R Documentation |
A reference class to store, manage and process Synapt G2 data to combine identification and quantitation results.
The data, intermediate and final results are stored together in such a ad-how container called a class. In the frame of the analysis of a set of 3 or 5 data files, namely as identification peptide, a quantitation peptide and a quantitation Pep3D, and identification fragments and quantitation spectra, such a container is created and populated, updated according to the user's instructions and used to display and export results.
The functionality of the synapter package implemented in the
Synapter class in described in the Details section
below. Documentation for the individual methods is provided in the
Methods section. Finally, a complete example of an analysis is
provided in the Examples section, at the end of this document.
See also papers by Shliaha et al. for details about ion mobility
separation and the manuscript describing the synapter
methodology.
Synapter(filenames, master) ## creates an instance of class 'Synapter'
filenames |
A named |
master |
A |
A Synapter object logs every operation that is applied to
it. When displayed with show or when the name of the instance
is typed at the R console, the original input file names, all
operations and resulting the size of the respective data are
displayed. This allows the user to trace the effect of respective
operations.
The construction of the data and analysis container, technically
defined as an instance or object of class Synapter, is created
with the Synapter constructor.
This function requires 4 or 6 files as input, namely,
the identification final peptide csv file, the quantitation final peptide
csv file, the quantitation Pep3D csv file (as exported from the PLGS
software), the fasta file use for peptide identification, and optional
the identification fragments csv file and the quantitation
spectra xml file. The fasta file ('fasta') could be an RDS file generated by
link{createUniquePeptideDbRds}, too.
The file names need to be specified as a named list with names
'identpeptide', 'quantpeptide', 'quantpep3d', 'fasta', 'identfragments'
and 'quantspectra' respectively.
These files are read and the data is stored in the newly
created Synapter instance.
The final peptide files are filtered
to retain peptides with matchType corresponding to
PepFrag1 and PepFrag2, corresponding to unmodified
round 1 and 2 peptide identification. Other types, like
NeutralLoss_NH3, NeutralLoss_H20, InSource,
MissedCleavage or VarMod are not considered in the rest
of the analysis. The quantitation Pep3D data is filtered to retain
Function equal to 1 and unique quantitation spectrum ids,
i.e. unique entries for multiple charge states or isotopes of an EMRT
(exact mass-retention time features).
Then, p-values for Regular peptides are computed based on
the Regular and Random database types score
distributions, as described in Käll et al.,
2008a. Only unique peptide sequences are taken into account:
in case of duplicated peptides, only one entry is kept.
Empirical p-values are adjusted using Bonferroni
and Benjamini and Hochberg, 1995 (multtest package)
and q-values are computed using the qvalue package
(Storey JD and Tibshirani R., 2003 and Käll et
al., 2008b). Only Regular entries are stored in the
resulting data for subsequent analysis.
The data tables can be exported as csv spreadsheets with the
writeIdentPeptides and writeQuantPeptides methods.
The first step of the analysis aims to match reliable peptide.
The final peptide datasets are
filtered based on the FDR (BH is default) using the
filterQuantPepScore and filterIdentPepScore
methods. Several plots are provided to illustrate peptide score
densities (from which p-values are estimated, plotPepScores;
use getPepNumbers to see how many peptides were available) and
q-values (plotFdr).
Peptides matching to multiple proteins in the fasta file (non-unique
tryptic identification and quantitation peptides) can be
discarded with the filterUniqueDbPeptides method. One can
also filter on the peptide length using filterPeptideLength.
Another filtering criterion is mass accuracy. Error tolerance
quantiles (in ppm, parts per million) can be visualised with the
plotPpmError method. The values can be retrieved with
getPpmErrorQs. Filtering is then done separately for
identification and quantitation peptide data using
filterIdentPpmError and filterQuantPpmError
respectively. The previous plotting functions can be used again to
visualise the resulting distribution.
Filtering can also be performed at the level of protein false
positive rate, as computed by the PLGS application
(protein.falsePositiveRate column), which counts the
percentage of decoy proteins that have been identified prior to the
regular protein of interest. This can be done with the
filterIdentProtFpr and filterQuantProtFpr methods.
Note that this field is erroneously called a false positive rate in
the PLGS software and the associated manuscript; it is a false
discovery rate.
Common and reliable identification and quantitation peptides are
then matched based on their sequences and merged using the
mergePeptides method.
Systematic differences between identification features and
quantitation features retention times are modelled by
fitting a local regression (see the loess function for
details), using the modelRt method. The smoothing parameter,
or number of neighbour data points used the for local fit, is
controlled by the span parameter that can be set in the above
method.
The effect of this parameter can be observed with the plotRt
method, specifying what = "data" as parameters. The resulting
model can then be visualised with the above method specifying
what = "model", specifying up to 3 number of standard
deviations to plot. A histogram of retention time differences can
be produced with the plotRtDiffs method.
To visualise the feature space plotFeatures could be used. It
generates one or two (if ion mobility is available) plots of
retention time vs mass and mass vs ion mobility for each data source,
namely, Identification data, Quantitation data and Quantitation Pep3D data.
Systematic differences between intensities of identification features and
quantitation features depending on retention times are modelled by
fitting a local regression (see the loess function for
details), using the modelIntensity method. The smoothing parameter,
or number of neighbour data points used the for local fit, is
controlled by the span parameter that can be set in the above
method.
The effect of this parameter can be observed with the plotIntensity
method, specifying what = "data" as parameters. The resulting
model can then be visualised with the above method specifying
what = "model", specifying up to 3 number of standard
deviations to plot.
Matching of identification peptides and quantitation EMRTs is done within a mass tolerance in parts per million (ppm) and the modelled retention time +/- a certain number of standard deviations. To help in the choice of these two parameters, a grid search over a set of possible values is performed and performance metrics are recorded, to guide in the selection of a 'best' pair of parameters.
The following metrics are computed:
(1) the percentage of identification
peptides that matched a single quantitation EMRT (called prcntTotal),
(2) the percentage of identification peptides used in the retention time
model that matched the quantitation EMRT corresponding to the
correct quantitation peptide in ident/quant pair of the model
(called prcntModel)
and
(3) the detailed about the matching of the features used for
modelling (accessible with getGridDetails) and the
corresponding details grid that reports the percentage of
correct unique assignments.
The detailed grid results specify the number of non
matched identification peptides (0), the number of correctly (1) or
wrongly (-1) uniquely matched identification peptides, the number of
identification peptides that matched 2 or more peptides including
(2+) or excluding (2-) the correct quantitation equivalent are also
available.
See the next section for additional details about how matching.
The search is performed with the searchGrid method, possibly
on a subset of the data (see Methods and Examples sections for
further details).
The parameters used for matching can be set manually with
setPpmError, setRtNsd, setImDiff respectively,
or using setBestGridParams to apply best parameters as defined using
the grid search. See example and method documentation for details.
The identification peptide - quantitation EMRT matching, termed identification transfer, is performed using the best parameters, as defined above with a grid search, or using user-defined parameters.
Matching is considered successful when one and only one EMRT is found in the mass tolerance/retention time/ion mobility window defined by the error ppm, number of retention time standard deviations, and ion mobility difference parameters. The values of uniquely matched EMRTs are reported in the final matched dataframe that can be exported (see below). If however, none or more than one EMRTs are matched, 0 or the number of matches are reported.
As identification peptides are serially individually matched to 'close' EMRTs, it is possible for peptides to be matched the same EMRT independently. Such cases are reported as -1 in the results dataframes.
The results can be assess using the plotEMRTtable (or
getEMRTtable to retrieve the values) and performace
methods. The former shows the number of identification peptides assigned to
none (0), exactly 1 (1) or more (> 2) EMRTs.
The latter method reports matched identification peptides, the number of
(q-value and protein FPR filtered) identification and quantitation peptides.
Matched EMRT and quantitation peptide numbers are then compared
calculating the synapter enrichment (100 * ( synapter - quant ) / quant)
and Venn counts.
As an additional step it is possible to remove less intense peaks from the
spectra and fragment data. Use plotCumulativeNumberOfFragments to
plot the number of fragments vs the intensity and to find a good threshold.
The filterFragments method could remove peaks if the intensity is
below a specified threshold via the minIntensity argument. Set the
maxNumber argument to keep only the maxNumber highest
peaks/fragments. The what argument controls the data on which the
filter is applied. Use what = "fragments.ident" for the
identification fragments and what = "spectra.quant" for the
quantiation spectra data.
After importing fragment and spectra data it is possible to
match peaks between the identification fragments and the quantitation
spectra using the fragmentMatching method.
Use setFragmentMatchingPpmTolerance to set
the maximal allowed tolerance for considering a peak as identical.
There are two different methods to visualise the results of the fragment
matching procedure. plotFragmentMatching plots the fragments and
spectra for each considered pair.
plotFragmentMatchingPerformance draws two plots. On the left panel
you could see the performance of different thresholds for the number of
common peaks for unique matches. The right panel visualizes the performance
of different differences (delta) of common peaks between the best match
(highest number of common peaks) and the second best match in each non
unique match group. plotFragmentMatchingPerformance returns the
corresponding values invisible or use fragmentMatchingPerformance to
access these data.
Use filterUniqueMatches and filterNonUniqueMatches to remove
unique or non unique matches below the threshold of common peaks
respective the difference in common peaks from the MatchedEMRTs data.frame.
The merged identification and quantitation peptides can be exported
to csv using the writeMergedPeptides method. Similarly, the
matched identification peptides and quantitation EMRTs are exported
with writeMatchedEMRTs.
Complete Synapter instances can be serialised with
save, as any R object, and reloaded with load for
further analysis.
It is possible to get the fragment and spectra data from the identification
and quantitation run using getIdentificationFragments respectively
getQuantitationSpectra.
signature(object = "Synapter"): Merges
quantitation and identification final peptide data, used to
perform retention time modelling (see modelRt below).
signature(object = "Synapter",
span = "numeric"): Performs local polynomial regression
fitting (see loess) retention time alignment using
span parameter value to control the degree of smoothing.
signature(object = "Synapter",
span = "numeric"): Performs local polynomial regression
fitting (see loess) intensity values using
span parameter value to control the degree of smoothing.
signature(object = "Synapter", ppm =
"numeric", nsd = "numeric", imdiff = "numeric"):
Finds EMRTs matching identification peptides using ppm
mass tolerance, nsd number of retention time standard
deviations and imdiff difference in ion mobility.
The last three parameters are optional if previously
set with setPpmError, setRtNsd, setImDiff,
or, better, setBestGridParams (see below).
signature(object = "Synapter",
method = c("rescue", "copy")):
The method parameter defined the behaviour for those
high confidence features that where identified in both identification and
quantitation acquisitions and used for the retention time model
(see mergePeptides). Prior to version 1.1.1, these
features were transferred from the quantitation pep3d file if
unique matches were found, as any feature ("transfer").
As a result, those matching 0 or > 1 EMRTs were
quantified as NA. The default is now to "rescue"
the quantitation values of these by directly retrieving the data
from the quantification peptide data. Alternatively, the
quantitation values for these features can be directly taken
from the quantitation peptide data using "copy", thus
effectively bypassing identification transfer.
signature(object="Synapter",
ppms="numeric", nsds="numeric", imdiffs = "numeric",
subset="numeric", n = "numeric", verbose="logical"):
Performs a grid search. The
grid is defined by the ppm, nsd and imdiffs
numerical vectors, representing the sequence of values to be
tested. Default are seq(5, 20, 2), seq(0.5, 5,
0.5), seq(0.2, 2, 0.2) respectively. To ignore ion mobility set
imdiffs = Inf.
subset and n allow to use a
randomly chosen subset of the data for the grid search to
reduce search time. subset is a numeric, between 0 and 1,
describing the percentage of data to be used; n specifies
the absolute number of feature to use. The default is to use all
data. verbose controls whether textual output should be
printed to the console. (Note, the mergedEMRTs value used
in internal calls to findEMRTs is "transfer" - see
findEMRTs for details).
signature(object="Synapter",
ppm = "numeric", verbose = "logical":
Performs a fragment matching between spectra and fragment data.
The ppm argument controls the tolerance that is used to consider
two peaks (MZ values) as identical. If verbose is TRUE
(default) a progress bar is shown.
signature(object = "Synapter"): Display
object by printing a summary to the console.
signature(x="Synapter"): Returns a list
of dimensions for the identification peptide, quantitation
peptide, merged peptides and matched features data sets.
signature(object="Synapter"): Returns a
character of length 6 with the names of the input files
used as identpeptide, quantpeptide,
quantpep3d, fasta, identfragments and
quantspectra.
signature(object="Synapter"): Returns a
character of variable length with a summary of processing
undergone by object.
signature(object="Synapter", digits =
"numeric"): Returns a named list of length 3 with the
precent of total (prcntTotal), percent of model
(prcntModel) and detailed (details) grid search
results. The details grid search reports the proportion
of correctly assigned features (+1) to all unique assignments
(+1 and -1). Values are rounded to 3 digits by default.
signature(object="Synapter"): Returns
a list of number of ..., -2, -1, 0, +1, +2, ... results
found for each of the ppm/nsd pairs tested during
the grid search.
signature(object="Synapter"):
Returns a named numeric of length 3 with best grid values
for the 3 searches. Names are prcntTotal,
prcntModel and details.
signature(object="Synapter"):
Returns a named list of matrices (prcntTotal,
prcntModel and details). Each matrix gives the
ppm and nsd pairs that yielded the best grid
values (see getBestGridValue above).
signature(object="Synapter",
what="character"): This methods set the best parameter pair,
as determined by what. Possible values are auto
(default), model, total and details. The 3
last ones use the (first) best parameter values as reported by
getBestGridParams. auto uses the best model
parameters and, if several best pairs exists, the one that
maximises details is selected.
signature(object="Synapter", fdr =
"numeric"): Sets the peptide score false discovery rate
(default is 0.01) threshold used by filterQuantPepScore
and filterIdentPepScore.
signature(object="Synapter"): Returns
the peptide false discrovery rate threshold.
signature(object="Synapter", ppm =
"numeric"): Set the identification mass tolerance to
ppm (default 10).
signature(object="Synapter"):
Returns the identification mass tolerance.
signature(object="Synapter", ppm =
"numeric"): Set the quantitation mass tolerance to ppm
(default 10).
signature(object="Synapter"):
Returns the quantitation mass tolerance.
signature(object="Synapter", ppm =
"numeric"): Sets the identification and quantitation mass
tolerance ppm (default is 10).
signature(object="Synapter", span =
"numeric"): Sets the loess span parameter;
default is 0.05.
signature(object="Synapter"): Returns
the span parameter value.
signature(object="Synapter", nsd =
"numeric"): Sets the retention time tolerance nsd,
default is 2.
signature(object="Synapter"): Returns the
value of the retention time tolerance nsd.
signature(object="Synapter", imdiff =
"numeric"): Sets the ion mobility tolerance imdiff,
default is 0.5.
signature(object="Synapter"): Returns the
value of the ion mobility tolerance imdiff.
signature(object="Synapter", qs =
"numeric", digits = "numeric"): Returns the mass tolerance
qs quantiles (default is c(0.25, 0.5, 0.75, seq(0.9,
1, 0.01)) for the identification and quantitation
peptides. Default is 3 digits.
signature(object="Synapter", qs =
"numeric", digits = "numeric"): Returns the retention time
tolerance qs quantiles (default is c(0.25, 0.5,
0.75, seq(0.9, 1, 0.01)) for the identification and
quantitation peptides. Default is 3 digits.
signature(object="Synapter"): Returns
the number of regular and random quantitation and identification
peptide considered for p-value calculation and used to plot the
score densities (see plotPepScores). Especially the
difference between random and regular entries are informative in
respect with the confidence of the random scores distribution.
signature(object="Synapter",
ppm = "numeric"): Sets the fragment matching mass tolerance ppm
(default is 25).
signature(object="Synapter"):
Returns the fragment matching mass tolerance in ppm.
signature(object="Synapter", k =
"numeric"): Returns a named list of length
2 with the proportion of identification and quantitation
peptides that are considered significant with a threshold of
k (default is c(0.001, 0.01, 0.5, 0.1)) using raw
and adjusted p-values/q-values.
signature(object="Synapter"): Returns a
table with the number of 0, 1, 2, ... assigned EMRTs.
signatute(object="Synapter", verbose =
TRUE): Returns (and displays, if verbose) the
performance of the synapter analysis.
signatute(object="Synapter", verbose =
TRUE): Returns (and displays, if verbose) information
about number of missing values and identification source of
transfered EMRTs.
signature(object="Synapter",
what = c("unique", "non-unique"):
Returns the performance of the fragment matching for unqiue or
non-unique matches. The return valus is a matrix with
seven columns. The first column ncommon/deltacommon
contains the thresholds. Column 2 to 5 are the true positives tp,
false positives fp, true negatives tn, false negatives
fn for the merged peptide data. The sixth column all shows
the corresponding number of peptides for all peptides
(not just the merged ones) and the last column shows the FDR fdr
for the current threshold (in that row) for the merged data.
Please note that the FDR is overfitted/underestimated because the merged peptides are the peptides from the highest quality spectra were PLGS could easily identify the peptides. The peptides that are not present in the merged data are often of lower quality hence the FDR would be higher by trend.
See plotFragmentMatchingPerformance for a graphical
representation.
signature(object="Synapter",
missedCleavages = 0, IisL = TRUE, verbose = TRUE):
This method first digests the fasta database file and keeps
unique tryptic peptides. (NOTE: since version 1.5.3, the tryptic
digestion uses the cleaver package, replacing the more
simplistic inbuild function. The effect of this change is
documented in https://github.com/lgatto/synapter/pull/47).
The number of maximal missed cleavages can be set as missedCleavages
(default is 0).
If IisL = TRUE Isoleucin and Leucin are treated as the same aminoacid.
In this case sequences like "ABCI", "ABCL" are removed
because they are not unqiue anymore. If IisL = FALSE (default)
"ABCI" and "ABCL" are reported as unique.
The peptide sequences are then used as a
filter against the identification and quantitation peptides,
where only unique proteotyptic instances (no miscleavage allowed
by default) are eventually kept in the object instance.
This method also removes any additional duplicated peptides,
that would not match any peptides identified in the fasta
database.
signature(object="Synapter",
missedCleavages = 0, IisL = TRUE, verbose = TRUE): As
filterUniqueDbPeptides for quantitation peptides only.
signature(object="Synapter",
missedCleavages = 0, IisL = TRUE, verbose = TRUE): As
filterUniqueDbPeptides for identification peptides
only.
signature(object="Synapter", fdr
= "numeric", method = "character"): Filters the quantitation
peptides using fdr false discovery rate. fdr is
missing by default and is retrieved with getPepScoreFdr
automatically. If not set, default value of 0.01 is
used. method defines how to performe p-value adjustment;
one of BH, Bonferrone or qval. See details
section for more information.
signature(object="Synapter", fdr
= "numeric", method = "charactet"): As
filterQuantPepScore, but for identification peptides.
signature(object="Synapter", fpr
= "numeric"): Filters quantitation peptides using the protein
false positive rate (erroneously defined as a FPR, should be
FDR), as reported by PLGS, using threshold set by fpr
(missing by default) or retrieved by getProtFpr.
signature(object="Synapter", fpr =
"numeric"): as filterQuantProtFpr, but for
identification peptides.
signature(object="Synapter", ppm
= "numeric"): Filters the quantitation peptides based on the
mass tolerance ppm (default missing) provided or
retrieved automatically using getPpmError.
signature(object="Synapter"): as
filterQuantPpmError, but for identification peptides.
signature(object = "Synapter",
what = c("fragments.ident", "spectra.quant"),
minIntensity = "numeric", maxNumber = "numeric", verbose = "logical"):
Filters the spectra/fragment data using a minimal intensity threshold
(minIntensity) or a maximal number of peaks/fragments threshold
(maxNumber). Please note that the maximal number is transfered to
an intensity threshold and the result could contain less peaks than
specified by maxNumber.
If both arguments are given, the more aggressive one is chosen.
Use the what argument to specify the data that should be filtered.
Set what = "fragments.ident" for the identification fragment data
or what = "spectra.quant" for the quantiation spectra.
If verbose is TRUE (default) a progress bar is shown.
signature(object="Synapter", minNumber =
"numeric"):
Removes all unique matches that have less than minNumber of
peaks/fragments in common. Use
fragmentMatchingPerformance(..., what="unique")/
plotFragmentMatchingPerformance
(left panel) to find an ideal threshold.
signature(object="Synapter", minDelta =
"numeric"):
Removes all non unique matches that have a difference between the best
match (highest number of common peaks/fragments, treated as true match)
and the second best match (second highest number of common
peaks/fragments) less than minDelta. For the matches above the
threshold only the one with the highest number of common peaks/fragments
in each match group is kept.
Use fragmentMatchingPerformance(..., what="non-unique")/
plotFragmentMatchingPerformance (right panel) to find an ideal
threshold.
signature(object="Synapter"):
Removes all non unique identification matches. In rare circumstances (if
the grid search parameters are to wide/relaxed or a fragment library is
used) it could happen that the searchGrid methods matches a single
quantification EMRT to multiple identification EMRTs. This methods removes
all these non unique matches.
signature(object="Synapter", what =
"character"): Plots the proportion of data against the mass
error tolerance in ppms. Depending on what, the data
for identification (what = "Ident"), quantitation
(what = "Quant") or "both" is plotted.
signature(object="Synapter", ...): Plots
a histogram of retention time differences after
alignments. ... is passed to hist.
signature(object="Synapter", what =
"character", f = "numeric", nsd = "numeric"): Plots the
Identification - Quantitation retention time difference as a
function of the Identification retention time. If what is
"data", two plots are generated: one ranging the full range
of retention time differences and one focusing on the highest data
point density and showing models with various span
parameter values, as defined by f (default is 2/3, 1/2,
1/4, 1/10, 1/16, 1/25, 1/50, passed as a numed numeric). If
what is "model", a focused plot with the applied
span parameter is plotted and areas of nsd (default is
x(1, 3, 5) number of standard deviations are shaded around
the model.
signature(object="Synapter", what =
"character", f = "numeric", nsd = "numeric"): Plots the (log2) ratio
of Identification and Quantitation intensities as a
function of the Identification retention time. If what is
"data", two plots are generated: one ranging the full range
of ratios and one focusing on the highest data
point density and showing models with various span
parameter values, as defined by f (default is 2/3, 1/2,
1/4, 1/10, 1/16, 1/25, 1/50, passed as a numed numeric). If
what is "model", a focused plot with the applied
span parameter is plotted and areas of nsd (default is
x(1, 3, 5) number of standard deviations are shaded around
the model.
signature(object="Synapter"): Plots the
distribution of random and regular peptide scores for
identification and quantitation features. This reflects how
peptide p-values are computed. See also getPepNumbers.
signature(object="Synapter", method =
"character"): Displays 2 plots per identification and
quantitation peptides, showing the number of significant
peptides as a function of the FDR cut-off and the expected false
number of false positive as a number of significant
tests. PepFrag 1 and 2 peptides are illustrated on the same
figures. These figures are adapted from plot.qvalue.
method, one of "BH", "Bonferroni" or
"qval", defines what identification statistics to use.
signature(object="Synapter"): Plots
the barchart of number or 0, 1, 2, ... assigned EMRTs (see
getEMRTtable) .
signature(object="Synapter", what =
"character"), maindim = "character":
Plots a heatmap of the respective grid search
results. This grid to be plotted is controlled by what:
"total", "model" or "details" are
available. If ion mobility was used in the grid search you can use
maindim to decided which dimensions should be shown. maindim
could be one of "im" (default), "rt" and "mz". If
maindim = "im" a heatmap for each ion mobility threshold is drawn.
For maindim = "rt" and maindim you get a heatmap for each
retention time respective mass threshold.
signature(object="Synapter", what =
"character", xlim = "numeric", ylim = "numeric", ionmobiltiy = "logical"):
Plots the retention time against precursor mass space.
If what is "all", three (six if ion mobility is available
and ionmobility = TRUE (default is FALSE);
three additional plots with precursor mass against ion mobility)
such plots are created side by side: for the
identification peptides, the quantitation peptides and the
quantitation Pep3D data. If what is "some", a
subset of the rt/mass space can be defined with xlim
(default is c(40, 60)) and ylim (default is
c(1160, 1165)) and identification peptide, quantitation
peptides and EMRTs are presented on the same graph as grey
dots, blue dots and red crosses respectively. In addition,
rectangles based on the ppm and nsd defined tolerances (see
setPpmError and setNsdError) are drawn and
centered at the expected modelled retention time. This last
figure allows to visualise the EMRT matching.
signature(object = "Synapter",
key = "character", column = "character",
verbose = "logical", ...):
Plots two spectra and fragments against each other. Please see
plotFragmentMatching for details.
signature(object = "Synapter",
showAllPeptides = FALSE):
Creates two plots. The left panel shows the performance of filtering the
unique matches of the merged peptides using a different number of common
peaks. The right panel shows the performance of filtering the non unique
matches of the merged peptides using different differences (delta) in
common peaks/fragments. These differences (delta) are
calculated between the match with the highest number of common
peaks/fragments and the second highest one.
Use filterUniqueMatches and filterNonUniqueMatches to filter
the MatchedEMRT data.frame using one of these thresholds.
This function returns a list with two named elements (unqiue
and nonunqiue invisibly. These are the same data as return by
fragmentMatchingPerformance.
Use showAllPeptides=TRUE to add a line for all peptides (not just
the merged onces) to both plots.
signature(object = "Synapter",
what = c("fragments.ident", "spectra.quant")):
Plots the cumulative number of the fragments/peaks vs their intensity (log10
scaled). Use the what argument to create this plot for the
identification fragments (what = "fragments.quant") or the
the quantitation spectra (what = "spectra.quant").
signature(object="Synapter", file
= "character", what = "character", ...): Exports the merged
peptide data to a comma-separated file (default name is
"Res-MergedPeptides.csv").
signature(object="Synapter", file =
"character", ...): As above, saving the
matched EMRT table.
signature(object="Synapter", file
= "character", ...): As above, exporting the identification
peptide data.
signature(object="Synapter", file
= "character", ...): A above, exporting the quantitation
peptide data.
signature(object="Synapter"):
returns the identification fragments as MSnExp.
signature(object="Synapter"):
returns the quantitation spectra as MSnExp.
signature(x = "Synapter"): Coerce
object from Synapter to MSnSet class.
signature(object = "Synapter"):
Test whether a given Synapter object is valid.
signature(object = "Synapter"): Updates an old
Synapter object.
Laurent Gatto lg390@cam.ac.uk
Käll L, Storey JD, MacCoss MJ, Noble WS Posterior error probabilities and false discovery rates: two sides of the same coin. J Proteome Res. 2008a Jan; 7:(1)40-4
Bonferroni single-step adjusted p-values for strong control of the FWER.
Benjamini Y. and Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Statist. Soc. B., 1995, Vol. 57: 289-300.
Storey JD and Tibshirani R. Statistical significance for genome-wide experiments. Proceedings of the National Academy of Sciences, 2003, 100: 9440-9445.
Käll, Storey JD, MacCoss MJ, Noble WS Assigning significance to peptides identified by tandem mass spectrometry using decoy databases. J Proteome Res. 2008b Jan; 7:(1)29-34
Improving qualitative and quantitative performance for MSE-based label free proteomics, N.J. Bond, P.V. Shliaha, K.S. Lilley and L. Gatto, Journal of Proteome Research, 2013, in press.
The Effects of Travelling Wave Ion Mobility Separation on Data Independent Acquisition in Proteomics Studies, P.V. Shliaha, N.J. Bond, L. Gatto and K.S. Lilley, Journal of Proteome Research, 2013, in press.
Trypsin cleavage:
Glatter, Timo, et al. Large-scale quantitative assessment of different in-solution protein digestion protocols reveals superior cleavage efficiency of tandem Lys-C/trypsin proteolysis over trypsin digestion. Journal of proteome research 11.11 (2012): 5145-5156. http://dx.doi.org/10.1021/pr300273g
Rodriguez, Jesse, et al. Does trypsin cut before proline?. Journal of proteome research 7.01 (2007): 300-305. http://dx.doi.org/10.1021/pr0705035
Brownridge, Philip, and Robert J. Beynon. The importance of the digest: proteolysis and absolute quantification in proteomics. Methods 54.4 (2011): 351-360. http://dx.doi.org/10.1016/j.ymeth.2011.05.005
cleaver's rules are taken from: http://web.expasy.org/peptide_cutter/peptidecutter_enzymes.html#Tryps
library(synapter) ## always needed
## Not run:
## (1) Construction - to create your own data objects
synapterTiny <- Synapter()
## End(Not run)
## let's use synapterTiny, shipped with the package
synapterTinyData() ## loads/prepares the data
synapterTiny ## show object
## (2) Filtering
## (2.1) Peptide scores and FDR
## visualise/explore peptide id scores
plotPepScores(synapterTiny)
getPepNumbers(synapterTiny)
## filter data
filterUniqueDbPeptides(synapterTiny) ## keeps unique proteotypic peptides
filterPeptideLength(synapterTiny, l = 7) ## default length is 7
## visualise before FDR filtering
plotFdr(synapterTiny)
setPepScoreFdr(synapterTiny, fdr = 0.01) ## optional
filterQuantPepScore(synapterTiny, fdr = 0.01) ## specifying FDR
filterIdentPepScore(synapterTiny) ## FDR not specified, using previously set value
## (2.2) Mass tolerance
getPpmErrorQs(synapterTiny)
plotPpmError(synapterTiny, what="Ident")
plotPpmError(synapterTiny, what="Quant")
setIdentPpmError(synapterTiny, ppm = 20) ## optional
filterQuantPpmError(synapterTiny, ppm = 20)
## setQuantPpmError(synapterTiny, ppm = 20) ## set quant ppm threshold below
filterIdentPpmError(synapterTiny, ppm=20)
filterIdentProtFpr(synapterTiny, fpr = 0.01)
filterQuantProtFpr(synapterTiny, fpr = 0.01)
getPpmErrorQs(synapterTiny) ## to be compared with previous output
## (3) Merge peptide sequences
mergePeptides(synapterTiny)
## (4) Retention time modelling
plotRt(synapterTiny, what="data")
setLowessSpan(synapterTiny, 0.05)
modelRt(synapterTiny) ## the actual modelling
getRtQs(synapterTiny)
plotRtDiffs(synapterTiny)
## plotRtDiffs(synapterTiny, xlim=c(-1, 1), breaks=500) ## pass parameters to hist()
plotRt(synapterTiny, what="model") ## using default nsd 1, 3, 5
plotRt(synapterTiny, what="model", nsd=0.5) ## better focus on model
plotFeatures(synapterTiny, what="all")
setRtNsd(synapterTiny, 3) ## RtNsd and PpmError are used for detailed plot
setPpmError(synapterTiny, 10) ## if not set manually, default values are set automatically
plotFeatures(synapterTiny, what="some", xlim=c(36,44), ylim=c(1161.4, 1161.7))
## best plotting to svg for zooming
set.seed(1) ## only for reproducibility of this example
## (5) Grid search to optimise EMRT matching parameters
searchGrid(synapterTiny,
ppms = 7:10, ## default values are 5, 7, ..., 20
nsds = 1:3, ## default values are 0.5, 1, ..., 5
subset = 0.2) ## default is 1
## alternatively, use 'n = 1000' to use exactly
## 1000 randomly selected features for the grid search
getGrid(synapterTiny) ## print the grid
getGridDetails(synapterTiny) ## grid details
plotGrid(synapterTiny, what = "total") ## plot the grid for total matching
plotGrid(synapterTiny, what = "model") ## plot the grid for matched modelled feature
plotGrid(synapterTiny, what = "details") ## plot the detail grid
getBestGridValue(synapterTiny) ## return best grid values
getBestGridParams(synapterTiny) ## return parameters corresponding to best values
setBestGridParams(synapterTiny, what = "auto") ## sets RtNsd and PpmError according the grid results
## 'what' could also be "model", "total" or "details"
## setPpmError(synapterTiny, 12) ## to manually set values
## setRtNsd(synapterTiny, 2.5)
## (6) Matching ident peptides and quant EMRTs
findEMRTs(synapterTiny)
plotEMRTtable(synapterTiny)
getEMRTtable(synapterTiny)
performance(synapterTiny)
performance2(synapterTiny)
## (7) Exporting data to csv spreadsheets
writeMergedPeptides(synapterTiny)
writeMergedPeptides(synapterTiny, file = "myresults.csv")
writeMatchedEMRTs(synapterTiny)
writeMatchedEMRTs(synapterTiny, file = "myresults2.csv")
## These will export the filter peptide data
writeIdentPeptides(synapterTiny, file = "myIdentPeptides.csv")
writeQuantPeptides(synapterTiny, file = "myQuantPeptides.csv")
## If used right after loading, the non-filted data will be exported
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.