specSimPepId: spectral similarity based adducted peptide identification for...


View source: R/specSimPepId.R

Description

Spectral similarity based identification of adducted peptides for adductomicsR.

Usage

specSimPepId(MS2Dir = NULL, nCores = NULL, rtDevModels = NULL,
             topIons = 100, topIntIt = 5, minDotProd = 0.8, precCh = 3,
             minSNR = 3, minRt = 20, maxRt = 35, minIdScore = 0.4,
             minFixed = 3, minMz = 750, maxMz = 1000,
             modelSpec = c('ALVLIAFAQYLQQCPFEDHVK', 'RHPYFYAPELLFFAK'),
             groupMzabs = 0.005, groupRtDev = 0.5, possFormMzabs = 0.01,
             minMeanSpecSim = 0.7, idPossForm = 0, outputPlotDir = NULL)

Arguments

MS2Dir

character a full path to a directory containing either .mzXML or .mzML data

nCores

numeric the number of cores to use for parallel computation. The default is to use 1 core.

rtDevModels

a list object or a full path to an RData file containing the retention time deviation models for the dataset.

topIons

numeric the number of most intense ions to consider for the basepeak to fragment mass difference calculation (default = 100). Larger values slightly increase computation time; however, when the modified/variable ions are of low abundance, this value should be set high to ensure that these fragment ions are considered.

topIntIt

numeric the number of most intense peaks from which to calculate the peak to peak mass differences (default = 5, i.e. the base peak and the next 4 most intense ions that are more than 10 Daltons in mass from one another will be considered). Multiple iterations increase computation time, but this parameter should be increased if the peptide spectrum is contaminated/chimeric or the variable ions are of lower intensity.
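The peak selection described for topIntIt can be sketched roughly as follows. This is a minimal illustration, not the package's implementation: the helper name pickTopPeaks, the greedy strategy and the toy spectrum are all assumptions made for the example.

```r
# Hypothetical helper (not part of adductomicsR): greedily pick the n most
# intense peaks that are more than minGap Da from every peak already chosen.
pickTopPeaks <- function(mz, intensity, n = 5, minGap = 10) {
  ord <- order(intensity, decreasing = TRUE)
  chosen <- integer(0)
  for (i in ord) {
    if (length(chosen) >= n) break
    # all() on an empty comparison is TRUE, so the base peak is always kept
    if (all(abs(mz[i] - mz[chosen]) > minGap)) {
      chosen <- c(chosen, i)
    }
  }
  chosen
}

# Toy spectrum with made-up values
mz        <- c(500.2, 503.1, 620.4, 750.9, 755.0, 880.3, 910.7)
intensity <- c(1000,  900,   800,   700,   650,   600,   100)

idx <- pickTopPeaks(mz, intensity, n = 3)
mz[idx]
```

In this toy case the second most intense peak (503.1) is skipped because it lies within 10 Da of the base peak.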

minDotProd

numeric minimum dot product similarity score (cosine) between the model spectra's variable ions and the corresponding intensities of the basepeak to fragment ion mass differences identified in the experimental spectrum scans (default = 0.8). Low values will greatly increase the potential for false positive peptide annotations.
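For orientation, a dot product (cosine) similarity between two intensity vectors can be computed as in the sketch below. The function name cosineSim and the intensity values are illustrative assumptions; how the package aligns model and experimental ions before scoring is not shown here.

```r
# Cosine similarity between two equal-length intensity vectors.
# Hypothetical helper for illustration only.
cosineSim <- function(x, y) {
  sum(x * y) / (sqrt(sum(x^2)) * sqrt(sum(y^2)))
}

# Made-up intensities for four matched variable ions
modelInt <- c(100, 60, 30, 10)  # model spectrum
expInt   <- c(90, 70, 20, 15)   # experimental scan

score <- cosineSim(modelInt, expInt)
score >= 0.8  # compare against a minDotProd-style threshold
```

Because the score depends only on the relative pattern of intensities, scaling either vector by a constant leaves it unchanged.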

precCh

integer charge state of precursors (default = 3).

minSNR

numeric the minimum signal to noise ratio for a fragment ion to be considered (default = 3). The noise level for each fixed or variable ion is calculated by taking the median of the bottom half of ion intensities within the locality of the fragment ion. The locality is defined as within +/- 100 Daltons of the fragment ion.
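The noise estimate described for minSNR might look like the following sketch. The helper name localSNR, the choice to include the ion itself in its locality, and the rounding used for the "bottom half" are assumptions; the package may handle these details differently.

```r
# Hypothetical helper: signal to noise ratio of the peak at position idx,
# with noise taken as the median of the lower half of the intensities
# found within +/- window Da of that peak.
localSNR <- function(mz, intensity, idx, window = 100) {
  inWin <- abs(mz - mz[idx]) <= window
  localInt <- sort(intensity[inWin])
  # Assumption: for an odd count, the middle value joins the bottom half
  bottomHalf <- localInt[seq_len(ceiling(length(localInt) / 2))]
  noise <- median(bottomHalf)
  intensity[idx] / noise
}

# Toy spectrum with made-up values; the peak at m/z 800 is outside the window
mz        <- c(400, 450, 480, 500, 520, 560, 590, 800)
intensity <- c(10,  12,  8,   300, 14,  9,   11,  500)

snr <- localSNR(mz, intensity, idx = 4)
snr >= 3  # compare against a minSNR-style threshold
```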

minRt

numeric the minimum retention time (in minutes) within which to identify peptide spectra (default=20).

maxRt

numeric the maximum retention time (in minutes) within which to identify peptide spectra (default=35).

minIdScore

numeric the minimum identification score. This is the average of all 7 scoring metrics (default=0.4).

minFixed

numeric the minimum number of fixed fragment ions that must have been identified in a spectrum for it to be considered (default = 3).

minMz

numeric the minimum mass-to-charge ratio of a precursor ion (default = 750).

maxMz

numeric the maximum mass-to-charge ratio of a precursor ion (default = 1000).

modelSpec

character full path to a model spectrum file (.csv). Alternatively, built-in model tables (in the extdata directory) can be used by supplying just the one-letter amino acid code for the peptide (currently available are: "ALVLIAFAQYLQQCPFEDHVK" and "RHPYFYAPELLFFAK"). A custom table must contain the following mandatory columns ("mass", "intensity", "ionType" and "fixed or variable").

  1. mass - m/z of fragment ions.

  2. intensity - intensity of fragment ions; can be either relative or absolute intensity.

  3. ionType - the identity of the B and Y fragments can optionally be added here (e.g. [b6]2+, [y2]1+); if not known, such as for mixed disulfides, this column can also contain empty fields.

  4. fixed or variable - this column states whether a fragment ion should be considered 'fixed' or 'variable' (i.e. modified); if the field is empty, the ion will not be considered.

By default, the following model spectra are included in the external data directory of the adductomicsR package:

  1. 'modelSpectrum_ALVLIAFAQYLQQCPFEDHVK.csv'

  2. 'modelSpectrum_RHPYFYAPELLFFAK.csv'
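As an illustration of the custom table layout described for modelSpec, the sketch below writes a small .csv with the four mandatory columns. All masses, intensities and ion labels are made-up placeholders, not real fragment data, and the file name is hypothetical; only the column names come from the description above.

```r
# Placeholder model table: values are invented purely to show the layout
modelTable <- data.frame(
  mass = c(284.17, 397.26, 1004.50),
  intensity = c(100, 45, 80),
  ionType = c("[b3]1+", "", "[y16]2+"),          # may be left empty
  `fixed or variable` = c("fixed", "", "variable"),  # empty = not considered
  check.names = FALSE,        # keep the column name containing spaces
  stringsAsFactors = FALSE
)

# Hypothetical file name in the session's temporary directory
csvPath <- file.path(tempdir(), "modelSpectrum_custom.csv")
write.csv(modelTable, csvPath, row.names = FALSE)

# Read the file back to confirm the header survived unchanged
readBack <- read.csv(csvPath, check.names = FALSE, stringsAsFactors = FALSE)
names(readBack)
```

The resulting path could then be supplied as the modelSpec argument. Note that check.names = FALSE is needed on both write and read, otherwise R would rename "fixed or variable" to a syntactic name.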

groupMzabs

numeric after hierarchical clustering of the spectra the dendrogram will be cut at this height (in Da) generating the mass groups.

groupRtDev

numeric after hierarchical clustering of the spectra the dendrogram will be cut at this height (in minutes) generating the retention time groups.
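The effect of cutting a dendrogram at a fixed height, as described for groupMzabs and groupRtDev, can be sketched with base R's hclust and cutree. The linkage method, the toy values and the choice to cluster each dimension independently are assumptions for this example; the package's internal procedure may differ.

```r
# Made-up precursor m/z values and retention times (minutes) for five spectra
precMz <- c(811.7600, 811.7620, 811.7900, 816.4200, 816.4230)
rt     <- c(24.10, 24.35, 24.20, 30.00, 30.90)

# Assumption: complete linkage, so no two members of a group differ by
# more than the cut height
hcMz <- hclust(dist(precMz), method = "complete")
massGroups <- cutree(hcMz, h = 0.005)  # groupMzabs-style cut, in Da

hcRt <- hclust(dist(rt), method = "complete")
rtGroups <- cutree(hcRt, h = 0.5)      # groupRtDev-style cut, in minutes

data.frame(precMz, massGroups, rt, rtGroups)
```

With complete linkage the cut height acts as a maximum within-group spread, which is one natural reading of "cut at this height".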

possFormMzabs

numeric the maximum absolute mass difference for matching adduct mass to possible formulae.
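Matching an adduct mass to candidate formulae within an absolute tolerance can be sketched as below. The candidates table is a tiny hypothetical stand-in for the package's internal database, and matchFormulae is an illustrative helper, not a package function; the monoisotopic masses are rounded to five decimals.

```r
# Hypothetical candidate list with monoisotopic masses in Da
candidates <- data.frame(
  formula = c("CH2", "O", "O2"),
  mass = c(14.01565, 15.99491, 31.98983),
  stringsAsFactors = FALSE
)

# Return every candidate within mzabs Da of the observed adduct mass
matchFormulae <- function(adductMass, candidates, mzabs = 0.01) {
  candidates[abs(candidates$mass - adductMass) <= mzabs, , drop = FALSE]
}

# Made-up observed adduct mass
hits <- matchFormulae(15.9990, candidates)
hits$formula
```

A wider tolerance returns more hypotheses per adduct mass, at the cost of more false matches.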

minMeanSpecSim

numeric minimum mean dot product similarity score (cosine) between the spectra of a group identified by hierarchical clustering. This parameter is set to prevent erroneous clustering of dissimilar spectra (default = 0.7).
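One way to read the mean similarity check for a spectrum group is as the average of all pairwise cosine scores, sketched below. The helper names, the representation of spectra as rows of intensities on a shared m/z axis, and the toy values are assumptions made for illustration.

```r
# Hypothetical helpers, not package functions
cosineSim <- function(x, y) {
  sum(x * y) / sqrt(sum(x^2) * sum(y^2))
}

# specMat: one row per spectrum, columns are intensities on a shared m/z axis
meanPairwiseSim <- function(specMat) {
  pairs <- combn(nrow(specMat), 2)
  sims <- apply(pairs, 2, function(p) {
    cosineSim(specMat[p[1], ], specMat[p[2], ])
  })
  mean(sims)
}

# A coherent group: three spectra with the same intensity pattern
goodGroup <- rbind(c(10, 5, 1), c(20, 10, 2), c(30, 15, 3))
# A mixed group: the third spectrum shares no ions with the other two
mixedGroup <- rbind(c(1, 0, 0), c(1, 0, 0), c(0, 1, 0))

meanPairwiseSim(goodGroup)  >= 0.7
meanPairwiseSim(mixedGroup) >= 0.7
```

Under a minMeanSpecSim-style threshold of 0.7, the first group would be kept and the second rejected.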

idPossForm

integer if = 1 then the average adduct masses of each spectrum group will be matched against an internal database of possible formulae to generate hypotheses. The default (0) means this will not take place, as the computation is potentially time consuming.

outputPlotDir

character (default = NULL) output directory for plots.

Value

data frame of putative adducts

Examples

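## Not run: 
library(adductomicsR)
library(ExperimentHub)
# Query ExperimentHub for the adductData records
eh <- ExperimentHub()
temp <- query(eh, 'adductData')
# Run the identification on the hub cache directory using 2 cores
specSimPepId(MS2Dir = hubCache(temp), nCores = 2,
             rtDevModels = file.path(hubCache(temp), 'rtDevModels.RData'))

## End(Not run)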

adductomicsR documentation built on Nov. 8, 2020, 4:49 p.m.