synergise2: Synergise identification and quantitation results

synergise2    R Documentation

Synergise identification and quantitation results

Description

Performs a complete default analysis on the files defined in filenames, creates a complete html report and saves/exports all results as csv and rds files. See details for a description of the pipeline and Synapter for manual execution of individual steps.

Usage

synergise2(filenames, master = FALSE, object, outputdir,
  outputfile = paste0("synapter_report_", strftime(Sys.time(),
  "%Y%m%d-%H%M%S"), ".html"), fdr = 0.01, fdrMethod = c("BH",
  "Bonferroni", "qval"), fpr = 0.01, peplen = 7, missedCleavages = 2,
  IisL = FALSE, identppm = 20, quantppm = 20, uniquepep = TRUE,
  span.rt = 0.05, span.int = 0.05, grid.ppm.from = 2,
  grid.ppm.to = 20, grid.ppm.by = 2, grid.nsd.from = 0.5,
  grid.nsd.to = 5, grid.nsd.by = 0.5, grid.imdiffs.from = 0.6,
  grid.imdiffs.to = 1.6, grid.imdiffs.by = 0.2, grid.subset = 1,
  grid.n = 0, grid.param.sel = c("auto", "model", "total", "details"),
  fm.ppm = 25, fm.ident.minIntensity = 70,
  fm.quant.minIntensity = 70, fm.minCommon = 1, fm.minDelta = 1,
  fm.fdr.unique = 0.05, fm.fdr.nonunique = 0.05,
  mergedEMRTs = c("rescue", "copy", "transfer"),
  template = system.file("reports", "synergise2.Rmd", package =
  "synapter"), verbose = interactive())

Arguments

filenames

A named list of file names to be loaded. The names must be identpeptide, quantpeptide, quantpep3d and fasta (the latter could also be an RDS file created by createUniquePeptideDbRds). If fragment matching should be used, identfragments (which can be skipped if a master RDS file is used for identpeptide) and quantspectra have to be given as well. identpeptide can be a csv final peptide file (from PLGS) or a saved "MasterPeptides" data object as created by makeMaster if working with master peptide data. To serialise the "MasterPeptides" instance, use the saveRDS function and the file extension rds.
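
As a minimal sketch, the filenames list could be assembled as follows. All paths are hypothetical placeholders; only the list names are taken from the description above.

```r
## Hypothetical file paths; only the list names are prescribed by synergise2().
inputfiles <- list(
  identpeptide = "ident_final_peptide.csv",
  quantpeptide = "quant_final_peptide.csv",
  quantpep3d   = "quant_Pep3DAMRT.csv",
  fasta        = "proteome.fasta"
)

## Optional entries, needed only if fragment matching should be used
## (again hypothetical file names).
inputfiles$identfragments <- "ident_final_fragment.csv"
inputfiles$quantspectra   <- "quant_Spectrum.xml"

## Check that all mandatory names are present before calling synergise2().
required <- c("identpeptide", "quantpeptide", "quantpep3d", "fasta")
all(required %in% names(inputfiles))
```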

master

A logical indicating if the identification final peptide files are master (see makeMaster) or regular files. Default is FALSE.

object

An instance of class Synapter that will be copied, processed and returned. If filenames are also provided, the latter and object's inputFiles will be checked for equality.

outputdir

A character with the full path to an existing directory.

outputfile

A character with the file name for the report.
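
The default value shown in the Usage section builds a time-stamped file name. The snippet below simply evaluates that expression on its own, so the resulting name can be inspected before a run.

```r
## Default report name as given in the Usage section:
## "synapter_report_" + date and time down to the second + ".html"
outputfile <- paste0("synapter_report_",
                     strftime(Sys.time(), "%Y%m%d-%H%M%S"), ".html")
outputfile
```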

fdr

Peptide false discovery rate. Default is 0.01.

fdrMethod

P-value adjustment method. One of "BH" (default) for Benjamini and Hochberg (1995), "Bonferroni" for Bonferroni's single-step adjusted p-values for strong control of the FWER and "qval" from the qvalue package. See Synapter for references.
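
To illustrate how the first two adjustment methods differ, the sketch below uses p.adjust from the stats package as a stand-in. This is an assumption made for illustration: it is not necessarily the function synapter calls internally, and the p-values are made up.

```r
## Made-up p-values, for illustration only.
p <- c(0.001, 0.008, 0.012, 0.03, 0.2)

## stats::p.adjust is used here as a stand-in for the adjustment step.
bh   <- p.adjust(p, method = "BH")
bonf <- p.adjust(p, method = "bonferroni")

## Entries passing the default threshold fdr = 0.01.
keepBH   <- bh <= 0.01
keepBonf <- bonf <= 0.01

data.frame(p = p, BH = bh, Bonferroni = bonf)
```

Bonferroni-adjusted values are never smaller than the corresponding BH-adjusted values, so the Bonferroni option is the more conservative of the two.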

fpr

Protein false positive rate. Default is 0.01.

peplen

Minimum peptide length. Default is 7.

missedCleavages

Number of allowed missed cleavages. Default is 2.

IisL

If TRUE, isoleucine and leucine are treated as equal. In this case sequences like "ABCI" and "ABCL" are removed because they are not unique. If FALSE (default), "ABCI" and "ABCL" are reported as unique.
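
The effect of IisL can be sketched in base R. This is an illustration of the idea only, not synapter's implementation; the placeholder letter "J" is an arbitrary choice.

```r
peptides <- c("ABCI", "ABCL", "DEFK")

## IisL = TRUE: map both residues to one placeholder letter before comparing.
collapsed <- gsub("[IL]", "J", peptides)
isDup <- duplicated(collapsed) | duplicated(collapsed, fromLast = TRUE)
uniqueIisL <- peptides[!isDup]

## IisL = FALSE: compare the sequences as they are.
isDupRaw <- duplicated(peptides) | duplicated(peptides, fromLast = TRUE)
uniqueDefault <- peptides[!isDupRaw]

uniqueIisL     # only "DEFK" remains
uniqueDefault  # all three sequences remain
```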

identppm

Identification mass tolerance (in ppm). Default is 20.

quantppm

Quantitation mass tolerance (in ppm). Default is 20.

uniquepep

A logical of length 1 indicating if only unique peptides in the identification and quantitation peptides, as well as unique tryptic peptides as defined in the fasta file, should be retained. Default is TRUE.

span.rt

The loess span parameter for retention time correction. Default is 0.05.

span.int

The loess span parameter for intensity correction. Default is 0.05.
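
The role of a loess span can be seen on simulated data. The sketch below uses made-up retention times and is not part of the synapter pipeline; it only shows that a small span such as 0.05 follows local structure more closely than a large one.

```r
set.seed(1)

## Simulated data: a smooth, non-linear retention time deviation plus noise.
rtIdent <- sort(runif(500, 10, 100))
rtDiff  <- sin(rtIdent / 15) + rnorm(500, sd = 0.1)

## span is the proportion of points used for each local fit.
fitNarrow <- loess(rtDiff ~ rtIdent, span = 0.05)
fitWide   <- loess(rtDiff ~ rtIdent, span = 0.75)

## Residual standard errors of both fits.
c(narrow = fitNarrow$s, wide = fitWide$s)
```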

grid.ppm.from

Mass tolerance (ppm) grid starting value. Default is 2.

grid.ppm.to

Mass tolerance (ppm) grid ending value. Default is 20.

grid.ppm.by

Mass tolerance (ppm) grid step value. Default is 2.

grid.nsd.from

Number of retention time stdev grid starting value. Default is 0.5.

grid.nsd.to

Number of retention time stdev grid ending value. Default is 5.

grid.nsd.by

Number of retention time stdev grid step value. Default is 0.5.

grid.imdiffs.from

Ion mobility difference grid starting value. Default is 0.6.

grid.imdiffs.to

Ion mobility difference grid ending value. Default is 1.6.

grid.imdiffs.by

Ion mobility difference grid step value. Default is 0.2.
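
The default search space implied by the from/to/by arguments can be enumerated in base R. How synapter builds and evaluates its grid internally is not shown here; the variable names are chosen for illustration.

```r
## Default grid values, taken from the Usage section.
ppm    <- seq(from = 2,   to = 20,  by = 2)
nsd    <- seq(from = 0.5, to = 5,   by = 0.5)
imdiff <- seq(from = 0.6, to = 1.6, by = 0.2)

## All parameter combinations without and with ion mobility (HDMSE).
grid2d <- expand.grid(ppm = ppm, nsd = nsd)
grid3d <- expand.grid(ppm = ppm, nsd = nsd, imdiff = imdiff)

c(grid2d = nrow(grid2d), grid3d = nrow(grid3d))
```

With the defaults, the two-dimensional grid has 100 combinations and the three-dimensional one has 600, which is why subsetting the features (grid.subset, grid.n) can be worthwhile.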

grid.subset

Proportion of features (between 0 and 1) to be used for the grid search. Default is 1, i.e. all features.

grid.n

Absolute number of features to be used for the grid search. Default is 0, i.e. ignored.
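
One way to picture feature subsetting is random sampling of feature indices. The precedence rule below (an absolute number, if positive, overrides the proportion) is an assumption made for this sketch, and the feature count is made up.

```r
set.seed(42)

nFeatures   <- 1000   # made-up number of features
grid.subset <- 0.25   # use a quarter of the features
grid.n      <- 0      # 0 means: ignore the absolute number

## Assumed rule: a positive grid.n takes precedence over grid.subset.
size <- if (grid.n > 0) {
  min(grid.n, nFeatures)
} else {
  ceiling(grid.subset * nFeatures)
}

idx <- sample(nFeatures, size)
length(idx)
```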

grid.param.sel

Grid parameter selection method. One of auto (default), details, model or total. See Synapter for details on these selection methods.

fm.ppm

Fragment Matching tolerance: Peaks in a range of fm.ppm are considered as identical. Default is 25.
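
A ppm tolerance check can be written in one line. The helper withinPpm is hypothetical and computes the difference relative to the reference value; the exact definition used inside synapter may differ.

```r
## TRUE if `mz` lies within `ppm` parts per million of `ref`.
## (Hypothetical helper, not a synapter function.)
withinPpm <- function(mz, ref, ppm = 25) {
  abs(mz - ref) / ref * 1e6 <= ppm
}

withinPpm(500.010, 500.000)  # about 20 ppm apart
withinPpm(500.020, 500.000)  # about 40 ppm apart
```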

fm.ident.minIntensity

Minimal intensity an identification fragment must have to be retained prior to fragment matching. Default is 70.

fm.quant.minIntensity

Minimal intensity a peak in a quantitation spectrum must have to be retained prior to fragment matching. Default is 70.

fm.minCommon

Minimal number of peaks that unique matches need to have in common. Default is 1.

fm.minDelta

Minimal difference in number of peaks that non-unique matches need to have to be considered as true match. Default is 1.
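
The two thresholds can be illustrated with made-up peak counts. This is one reading of the descriptions above, not synapter's code: a unique match is kept if it shares at least fm.minCommon peaks, and among non-unique matches the best candidate is accepted only if it leads the runner-up by at least fm.minDelta peaks.

```r
fm.minCommon <- 1
fm.minDelta  <- 1

## Made-up numbers of common peaks.
uniqueMatch <- c(common = 3)           # a single candidate
nonUnique   <- c(a = 5, b = 4, c = 1)  # several candidates for one EMRT

## Unique match: keep if enough peaks are shared.
keepUnique <- uniqueMatch >= fm.minCommon

## Non-unique matches: compare the two best candidates.
sorted <- sort(nonUnique, decreasing = TRUE)
delta  <- sorted[1] - sorted[2]
best   <- if (delta >= fm.minDelta) names(sorted)[1] else NA_character_

best
```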

fm.fdr.unique

Minimal FDR to select fm.minCommon automatically. If both values are given, the more restrictive one (the one that filters more fragments) is used. Default is 0.05.

fm.fdr.nonunique

Minimal FDR to select fm.minDelta automatically. If both values are given, the more restrictive one (the one that filters more fragments) is used. Default is 0.05.

mergedEMRTs

One of "rescue" (default), "copy" or "transfer". See the documentation for the findEMRTs function in Synapter for details.

template

A character with the full path to the Rmd template.

verbose

A logical indicating if progress output should be printed to the console. Default is interactive(), i.e. TRUE in interactive sessions and FALSE otherwise.

Details

In contrast to synergise1, synergise2 extends the default analysis and offers the following unique features:

  • Performing 3D grid search (M/Z, Retention Time, Ion Mobility) for HDMSE data.

  • Applying intensity correction.

  • Filtering results by fragment matching.

Data can be input as a Synapter object if available or as a list of files (see filenames) that will be used to read the data in. The html report and result files will be created in the outputdir folder. All other input parameters have default values.

The data processing and analysis pipeline is as follows:

  1. If uniquepep is set to TRUE (default), only unique proteotypic identification and quantitation peptides are retained.

  2. Peptides are filtered for a FDR <= fdr (default is 0.01) using the "BH" method (see fdr and fdrMethod parameters for details).

  3. Peptides with a mass error > 20 ppm (the default tolerance, see quantppm and identppm) are filtered out.

  4. Peptides with a protein false positive rate (as reported by the PLGS software) > fpr are filtered out.

  5. Common identification and quantitation peptides are merged and a retention time model is created using Local Polynomial Regression Fitting (loess function from the stats package) with a default span.rt value of 0.05.

  6. A grid search to optimise the width in retention time and mass tolerance (and ion mobility for HDMSE) for EMRTs matching is performed. The default grid search space is from 0.5 to 5 by 0.5 retention time model standard deviations (see grid.nsd.from, grid.nsd.to and grid.nsd.by parameters) and from 2 to 20 by 2 parts per million (ppm) for mass tolerance (see grid.ppm.from, grid.ppm.to and grid.ppm.by parameters). If HDMSE data are used, the search space is extended by the ion mobility difference, from 0.6 to 1.6 by 0.2 (see grid.imdiffs.from, grid.imdiffs.to and grid.imdiffs.by). The data can be subset using an absolute number of features (see grid.n) or a fixed proportion (see grid.subset). The pair of optimal nsd and ppm is chosen (see grid.param.sel parameter).

  7. Fragment matching is used to filter false-positive matches from the grid search, using a default of 1 common peak for unique matches and a difference of at least 1 common peak to choose the correct match out of non-unique matches (see fm.minCommon and fm.minDelta).

  8. The intensity is corrected by Local Polynomial Regression Fitting (loess function from the stats package) using a default span.int value of 0.05.

  9. The quantitation EMRTs are matched using the optimised parameters.
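
The three filtering steps at the start of the pipeline (FDR, mass error, protein false positive rate) can be sketched on a toy table. The data, the column names and the use of p.adjust are assumptions made for illustration; synapter's own data structures and filter functions are not reproduced here.

```r
## Toy peptide table with hypothetical column names and made-up values.
peptides <- data.frame(
  sequence = c("AAK", "CCR", "DDK", "EER"),
  pval     = c(0.0005, 0.001, 0.5, 0.001),
  ppmError = c(3, -25, 1, 8),
  fpr      = c(0, 0, 0, 0.2),
  stringsAsFactors = FALSE
)

fdr <- 0.01; ppm <- 20; fpr <- 0.01   # defaults from the Usage section

## Step 2: keep peptides with an adjusted p-value <= fdr ("BH" method).
peptides$padj <- p.adjust(peptides$pval, method = "BH")
step2 <- peptides[peptides$padj <= fdr, ]

## Step 3: drop peptides whose absolute mass error exceeds the tolerance.
step3 <- step2[abs(step2$ppmError) <= ppm, ]

## Step 4: drop peptides whose protein false positive rate exceeds fpr.
step4 <- step3[step3$fpr <= fpr, ]

step4$sequence
```

In this toy example "DDK" fails the FDR filter, "CCR" fails the mass error filter and "EER" fails the false positive rate filter, so only "AAK" is retained.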

If a master identification file is used (master is set to TRUE, default is FALSE), the relevant actions that have already been executed when the file was created with makeMaster are not repeated here.

Value

Invisibly returns an object of class Synapter. Used for its side effect of creating an html report of the run in outputdir.

Author(s)

Laurent Gatto, Sebastian Gibb

References

Bond N.J., Shliaha P.V., Lilley K.S. and Gatto L. (2013) J. Prot. Research.

Examples

## Not run: 
library(synapterdata)
data(synobj2)
output <- tempdir() ## a temporary directory
synergise2(object = synobj2, outputdir = output, outputfile = "synapter.html")
htmlReport <- paste0("file:///", file.path(output, "synapter.html")) ## the result report
browseURL(htmlReport) ## open the report with default browser

## End(Not run)

lgatto/synapter documentation built on Sept. 28, 2022, 6:53 a.m.