specSimPepId: spectral similarity based adducted peptide identification for...
In adductomicsR: Processing of adductomic mass spectral datasets


spectral similarity based adducted peptide identification for adductomicsR

specSimPepId(MS2Dir=NULL,nCores=NULL,
rtDevModels=NULL, topIons=100, topIntIt=5,minDotProd=0.8, precCh=3,
minSNR=3,minRt=20, maxRt=35, minIdScore=0.4,minFixed=3, minMz=750, 
maxMz=1000,modelSpec=c('ALVLIAFAQYLQQCPFEDHVK','RHPYFYAPELLFFAK'),
groupMzabs=0.005, groupRtDev=0.5, possFormMzabs=0.01,
minMeanSpecSim=0.7,idPossForm=0, outputPlotDir= NULL)

`MS2Dir`	character a full path to a directory containing either .mzXML or .mzML data
`nCores`	numeric the number of cores to use for parallel computation. The default is to use 1 core.
`rtDevModels`	a list object or a full path to an RData file containing the retention time deviation models for the dataset.
`topIons`	numeric the number of most intense ions to consider for the basepeak to fragment mass difference calculation (default = 100). Larger values slightly increase computation time; however, when the modified/variable ions are of low abundance, this value should be set high to ensure these fragment ions are considered.
`topIntIt`	numeric the number of most intense peaks from which to calculate the peak to peak mass differences (default = 5, i.e. the base peak and the next 4 most intense ions more than 10 daltons in mass from one another will be considered). Additional iterations increase computation time, but this parameter should be increased if the peptide spectrum is contaminated/chimeric or the variable ions are of lower intensity.
`minDotProd`	numeric minimum dot product similarity score (cosine) between the model spectra's variable ions and the corresponding intensities of the basepeak to fragment ion mass differences identified in the experimental spectrum scans (default = 0.8). Low values will greatly increase the potential for false positive peptide annotations.
`precCh`	integer charge state of precursors (default = 3).
`minSNR`	numeric the minimum signal-to-noise ratio for a fragment ion to be considered (default = 3). The noise level for each fixed or variable ion is calculated by taking the median of the bottom half of ion intensities within the locality of the fragment ion. The locality is defined as within +/- 100 Daltons of the fragment ion.
`minRt`	numeric the minimum retention time (in minutes) within which to identify peptide spectra (default=20).
`maxRt`	numeric the maximum retention time (in minutes) within which to identify peptide spectra (default=35).
`minIdScore`	numeric the minimum identification score. This is the average of all 7 scoring metrics (default=0.4).
`minFixed`	numeric the minimum number of fixed fragment ions that must have been identified in a spectrum for it to be considered (default=3).
`minMz`	numeric the minimum mass-to-charge ratio of a precursor ion (default=750).
`maxMz`	numeric the maximum mass-to-charge ratio of a precursor ion (default=1000).
`modelSpec`	character full path to a model spectrum file (.csv). Alternatively, the built-in model tables (in the extdata directory) can be used by supplying just the one-letter amino acid code for the peptide (currently available: "ALVLIAFAQYLQQCPFEDHVK" and "RHPYFYAPELLFFAK"). A custom table must contain the following mandatory columns: "mass", "intensity", "ionType" and "fixed or variable". mass: the m/z of the fragment ions. intensity: the intensity of the fragment ions, either relative or absolute. ionType: the identity of the B and Y fragments can optionally be added here (e.g. [b6]2+, [y2]1+); if not known, such as for mixed disulfides, this column can also contain empty fields. fixed or variable: whether a fragment ion should be considered 'fixed' or 'variable' (i.e. modified); a fragment ion with an empty field will not be considered. By default the following model spectra are included in the external data directory of the adductomicsR package: 'modelSpectrum_ALVLIAFAQYLQQCPFEDHVK.csv' and 'modelSpectrum_RHPYFYAPELLFFAK.csv'.
`groupMzabs`	numeric after hierarchical clustering of the spectra the dendrogram will be cut at this height (in Da) generating the mass groups.
`groupRtDev`	numeric after hierarchical clustering of the spectra the dendrogram will be cut at this height (in minutes) generating the retention time groups.
`possFormMzabs`	numeric the maximum absolute mass difference for matching adduct mass to possible formulae.
`minMeanSpecSim`	numeric minimum mean dot product similarity score (cosine) between the spectra of a group identified by hierarchical clustering. This parameter is set to prevent erroneous clustering of dissimilar spectra (default = 0.7).
`idPossForm`	integer if = 1, the average adduct masses of each spectrum group will be matched against an internal database of possible formulae to generate hypotheses. The default of 0 means this will not take place, as the computation is potentially time consuming.
`outputPlotDir`	character (default = NULL) output directory for plots.
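
The `minDotProd` and `minMeanSpecSim` thresholds both refer to a dot product (cosine) similarity score. As a rough illustration of what such a score measures, the sketch below computes the cosine similarity between two intensity vectors in base R. The helper `cosine_sim` and the intensity values are hypothetical and are not part of adductomicsR; the package's own scoring may differ in detail.

```r
# Cosine (normalised dot product) similarity between two intensity vectors.
# `cosine_sim` is a hypothetical helper, not an adductomicsR function.
cosine_sim <- function(x, y) {
  stopifnot(length(x) == length(y))
  denom <- sqrt(sum(x^2)) * sqrt(sum(y^2))
  if (denom == 0) return(0)  # avoid dividing by zero for empty spectra
  sum(x * y) / denom
}

# Made-up relative intensities for four variable ions:
# the model spectrum versus the matched ions in an experimental scan.
model_int <- c(100, 60, 30, 10)
scan_int  <- c(90, 70, 20, 0)

score  <- cosine_sim(model_int, scan_int)
passes <- score >= 0.8  # compare against the minDotProd default
```

A score close to 1 means the two intensity patterns have nearly the same shape regardless of absolute scale, which is why a threshold such as 0.8 can be applied to both relative and absolute intensities.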

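The `groupMzabs` and `groupRtDev` arguments describe cutting a hierarchical clustering dendrogram at a fixed height. A minimal sketch of that idea using base R's `hclust` and `cutree` is shown here. The precursor m/z values are made up, and complete linkage is an assumption made for this illustration; the linkage method used inside adductomicsR is not stated above.

```r
# Hypothetical precursor m/z values for five MS2 scans.
prec_mz <- c(811.7600, 811.7620, 811.7630, 830.1000, 830.1030)

# Cluster on pairwise absolute m/z differences.
# Complete linkage is an assumption for this sketch.
tree <- hclust(dist(prec_mz), method = "complete")

# Cut the dendrogram at 0.005 Da (the groupMzabs default) to form mass groups.
mass_group <- cutree(tree, h = 0.005)
```

With complete linkage, no two members of a group differ by more than the cut height, so the first three scans fall into one mass group and the last two into another. The same pattern would apply to retention times with a cut height in minutes.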
dataframe of putative adducts

## Not run: 
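library(adductomicsR)
library(ExperimentHub)
# Query ExperimentHub for the adductData example dataset.
eh <- ExperimentHub()
temp <- query(eh, 'adductData')
# hubCache() returns the local cache directory of the hub resources.
specSimPepId(MS2Dir=hubCache(temp), nCores=2,
             rtDevModels=file.path(hubCache(temp), 'rtDevModels.RData'))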

## End(Not run)
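
The `modelSpec` argument can also point to a custom model spectrum table. The sketch below shows one way such a table might be assembled and written to a .csv file with the four mandatory column names listed under `modelSpec`. All masses, intensities and ion labels are invented for illustration, and the file name is hypothetical; whether a given table is accepted depends on the package's own checks.

```r
# Build a hypothetical model spectrum table with the four mandatory columns.
# check.names = FALSE keeps the space-containing column name intact.
model_tab <- data.frame(
  mass = c(300.10, 450.20, 600.30),          # m/z of fragment ions (made up)
  intensity = c(100, 45, 20),                # relative intensities (made up)
  ionType = c("[y2]1+", "[b6]2+", ""),       # may be left empty if unknown
  `fixed or variable` = c("fixed", "variable", ""),  # empty = not considered
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Write the table to a temporary .csv file (hypothetical file name).
csv_path <- file.path(tempdir(), "modelSpectrum_custom.csv")
write.csv(model_tab, csv_path, row.names = FALSE)

# Read it back, again preserving the original column names.
model_back <- read.csv(csv_path, check.names = FALSE)
```

The resulting `csv_path` could then be passed as `modelSpec` in place of one of the built-in peptide sequences.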