PDtoMSstatsFormat: Generate MSstats required input format for Proteome...


View source: R/PDtoMSstatsFormat.R

Description

Convert Proteome Discoverer output into the input format required by MSstats.

Usage

PDtoMSstatsFormat(input,
      annotation,
      useNumProteinsColumn = FALSE,
      useUniquePeptide = TRUE,
      summaryforMultipleRows = max,
      fewMeasurements = "remove",
      removeOxidationMpeptides = FALSE,
      removeProtein_with1Peptide = FALSE,
      which.quantification = 'Precursor.Area',
      which.proteinid = 'Protein.Group.Accessions',
      which.sequence = 'Sequence')

Arguments

input

Name of the Proteome Discoverer PSM output, in long format. The columns "Protein.Group.Accessions", "#Proteins", "Sequence", "Modifications", "Charge", "Intensity", and "Spectrum.File" are required.

annotation

Name of the 'annotation.txt' or 'annotation.csv' data, which includes Condition, BioReplicate, and Run information. 'Run' is matched against 'Spectrum.File'.
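As a minimal sketch of how 'Run' lines up with 'Spectrum.File', the base R snippet below builds a small annotation table and joins it to a PSM table. The file names and the tiny `psm` table are hypothetical, and the join with `merge` only illustrates the matching idea; it is not the code the function runs internally.

```r
# Hypothetical PSM rows; only the column used for matching is shown.
psm <- data.frame(
  Spectrum.File = c("run_A.raw", "run_A.raw", "run_B.raw", "run_C.raw"),
  stringsAsFactors = FALSE
)

# One row per run, with the three columns the annotation must provide.
annotation <- data.frame(
  Run = c("run_A.raw", "run_B.raw", "run_C.raw"),
  Condition = c("C1", "C1", "C2"),
  BioReplicate = c(1, 2, 3),
  stringsAsFactors = FALSE
)

# Runs that appear in the PSM table but are missing from the annotation.
missing_runs <- setdiff(unique(psm$Spectrum.File), annotation$Run)

# Attach Condition and BioReplicate to each PSM row by matching Run to Spectrum.File.
annotated <- merge(psm, annotation, by.x = "Spectrum.File", by.y = "Run")
```

Checking `missing_runs` before conversion is a cheap way to catch file names that differ only by extension or case.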

useNumProteinsColumn

TRUE removes peptides whose value in the '# Proteins' column of the PD output is greater than 1. FALSE is the default.

useUniquePeptide

TRUE (default) removes peptides that are assigned to more than one protein. Each peptide is assumed to be unique to a single protein.
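The idea of keeping only unique peptides can be illustrated in base R as follows. This is a sketch on a hypothetical four-row table, assuming that "shared" means the same sequence is listed under more than one protein accession; the function's internal implementation may differ.

```r
# Hypothetical PSM rows: the sequence "CCCK" is assigned to both P1 and P2.
psm <- data.frame(
  Protein.Group.Accessions = c("P1", "P1", "P2", "P2"),
  Sequence = c("AAAK", "CCCK", "CCCK", "DDDK"),
  stringsAsFactors = FALSE
)

# Count the number of distinct proteins per peptide sequence.
pairs <- unique(psm[, c("Protein.Group.Accessions", "Sequence")])
n_proteins <- table(pairs$Sequence)

# Peptides mapped to more than one protein are treated as shared and dropped.
shared <- names(n_proteins)[n_proteins > 1]
unique_only <- psm[!psm$Sequence %in% shared, ]
```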

summaryforMultipleRows

max (default) or sum. When a feature has multiple measurements in a run, use either the highest intensity or the sum of the intensities.
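The difference between the two choices can be seen on a small example. The sketch below uses base R `aggregate` on a hypothetical table in which one feature was measured twice in the same run; the column name `feature` is an assumption made for illustration.

```r
# Hypothetical measurements: PEPTIDEK_2 has two rows in run1.
psm <- data.frame(
  feature = c("PEPTIDEK_2", "PEPTIDEK_2", "PEPTIDEK_2", "OTHERK_3"),
  Run = c("run1", "run1", "run2", "run1"),
  Intensity = c(100, 250, 80, 40),
  stringsAsFactors = FALSE
)

# Collapse to one row per feature and run, keeping the highest intensity.
by_max <- aggregate(Intensity ~ feature + Run, data = psm, FUN = max)

# Same collapse, but adding the intensities together.
by_sum <- aggregate(Intensity ~ feature + Run, data = psm, FUN = sum)
```

For PEPTIDEK_2 in run1, `by_max` keeps 250 while `by_sum` reports 350; rows that were already unique are unchanged by either choice.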

fewMeasurements

'remove' (default) removes features that have only 1 or 2 measurements across runs.
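A rough base R illustration of this filter is shown below. It assumes that a "measurement" means a non-missing intensity and that the column is named `feature`; both are assumptions for the sketch rather than statements about the function's internals.

```r
# Hypothetical quantified table: F1 has 3 observed values, F2 has 2, F3 has 1.
quant <- data.frame(
  feature = c(rep("F1", 4), rep("F2", 2), "F3"),
  Run = c("r1", "r2", "r3", "r4", "r1", "r2", "r1"),
  Intensity = c(10, 12, NA, 11, 5, 6, 9),
  stringsAsFactors = FALSE
)

# Number of non-missing intensities per feature across all runs.
n_meas <- tapply(!is.na(quant$Intensity), quant$feature, sum)

# Keep only features with more than 2 measurements.
keep <- names(n_meas)[n_meas > 2]
filtered <- quant[quant$feature %in% keep, ]
```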

removeOxidationMpeptides

TRUE removes modified peptides that include 'Oxidation (M)' in the 'Modifications' column. FALSE is the default.
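The following sketch shows one way such a filter could look in base R. The modification strings are hypothetical, and matching on the literal substring "Oxidation" is an assumption; the exact text in a real Proteome Discoverer export should be checked before relying on it.

```r
# Hypothetical PSM rows with free-text modification annotations.
psm <- data.frame(
  Sequence = c("AMK", "AMK", "CDEK"),
  Modifications = c("M2(Oxidation)", "", "C1(Carbamidomethyl)"),
  stringsAsFactors = FALSE
)

# Flag rows whose Modifications text mentions oxidation (literal match, no regex).
is_oxidized <- grepl("Oxidation", psm$Modifications, fixed = TRUE)

# Drop the flagged rows; unmodified and otherwise-modified peptides remain.
no_oxidation <- psm[!is_oxidized, ]
```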

removeProtein_with1Peptide

TRUE removes proteins that have only 1 peptide and charge. FALSE is the default.
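To make "only 1 peptide and charge" concrete, the sketch below counts distinct peptide and charge combinations per protein in base R and drops proteins with a single combination. The column names follow the example output on this page; the data are hypothetical and the approach is only an illustration.

```r
# Hypothetical rows: P1 has three peptide/charge combinations, P2 has one.
rows <- data.frame(
  ProteinName = c("P1", "P1", "P1", "P2", "P2"),
  PeptideSequence = c("AAAK", "AAAK", "CCCK", "DDDK", "DDDK"),
  PrecursorCharge = c(2, 3, 2, 2, 2),
  stringsAsFactors = FALSE
)

# Distinct peptide/charge combinations, then a count per protein.
features <- unique(rows)
n_features <- table(features$ProteinName)

# Keep proteins supported by more than one peptide/charge combination.
keep <- names(n_features)[n_features > 1]
filtered <- rows[rows$ProteinName %in% keep, ]
```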

which.quantification

Use the 'Precursor.Area' (default) column for quantified intensities. 'Intensity' or 'Area' can be used instead.

which.proteinid

Column to use for protein names. The default shown in Usage is 'Protein.Group.Accessions'; 'Protein.Accessions' or 'Master.Protein.Accessions' can be used instead.

which.sequence

Use the 'Sequence' (default) column for peptide sequences. 'Annotated.Sequence' can be used instead.

Value

data.frame in the format required by MSstats.

Author(s)

Meena Choi, Olga Vitek.

Maintainer: Meena Choi (mnchoi67@gmail.com)

Examples

# Please check section 4.5, "Suggested workflow with Proteome Discoverer output for DDA",
# in the MSstats user manual.
# The output of PDtoMSstatsFormat should have the same 10 columns as the example dataset.

head(DDARawData)

Example output

  ProteinName PeptideSequence PrecursorCharge FragmentIon ProductCharge
1      bovine     S.PVDIDTK_5               5          NA            NA
2      bovine     S.PVDIDTK_5               5          NA            NA
3      bovine     S.PVDIDTK_5               5          NA            NA
4      bovine     S.PVDIDTK_5               5          NA            NA
5      bovine     S.PVDIDTK_5               5          NA            NA
6      bovine     S.PVDIDTK_5               5          NA            NA
  IsotopeLabelType Condition BioReplicate Run Intensity
1                L        C1            1   1   2636792
2                L        C1            1   2   1992418
3                L        C1            1   3   1982146
4                L        C2            1   4   5019594
5                L        C2            1   5   4560468
6                L        C2            1   6   3627849
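Since the converted output is expected to have the same 10 columns as the example dataset, a quick column check can catch problems early. The helper below is a hypothetical convenience written in base R, not part of MSstats; the column names are taken from the example output above.

```r
# The 10 column names shown in the example output.
required_cols <- c(
  "ProteinName", "PeptideSequence", "PrecursorCharge", "FragmentIon",
  "ProductCharge", "IsotopeLabelType", "Condition", "BioReplicate",
  "Run", "Intensity"
)

# Hypothetical helper: returns the required columns missing from a data.frame.
missing_msstats_columns <- function(df) {
  setdiff(required_cols, names(df))
}

# A one-row table mirroring the first row of the example output.
converted <- data.frame(
  ProteinName = "bovine", PeptideSequence = "S.PVDIDTK_5",
  PrecursorCharge = 5, FragmentIon = NA, ProductCharge = NA,
  IsotopeLabelType = "L", Condition = "C1", BioReplicate = 1,
  Run = 1, Intensity = 2636792,
  stringsAsFactors = FALSE
)

# An empty result means all 10 columns are present.
missing_cols <- missing_msstats_columns(converted)
```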

MSstats documentation built on Feb. 28, 2021, 2:01 a.m.