DDARawData.Skyline: Example dataset from a label-free DDA, a controlled spike-in...
In MSstats: Protein Significance Analysis in DDA, SRM and DIA for Label-free or Label-based Proteomics Experiments


This data set was obtained from a published study (Mueller et al., 2007). It is a controlled spike-in experiment in which 6 proteins (horse myoglobin, bovine carbonic anhydrase, horse cytochrome C, chicken lysozyme, yeast alcohol dehydrogenase, rabbit aldolase A) were spiked into a complex background at known concentrations in a Latin square design. The experiment contained 6 mixtures, and each mixture was analyzed in label-free LC-MS mode with 3 technical replicates, for a total of 18 runs. Each protein was represented by 7-21 peptides, and each peptide was represented by 1-5 transitions. Skyline was used for processing.

DDARawData.Skyline

data.frame

The raw data (input data for MSstats) must contain the variables ProteinName, PeptideSequence, PrecursorCharge, FragmentIon, ProductCharge, IsotopeLabelType, Condition, BioReplicate, Run, and Intensity. The variable names are fixed and must match exactly.
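A minimal sketch of how the required-column rule could be checked before analysis. The helper `missing_required` and the toy table are hypothetical and not part of MSstats; only the column names come from the text above.

```r
# Column names required by the MSstats input format (taken from the text above)
required_cols <- c("ProteinName", "PeptideSequence", "PrecursorCharge",
                   "FragmentIon", "ProductCharge", "IsotopeLabelType",
                   "Condition", "BioReplicate", "Run", "Intensity")

# Hypothetical helper: returns the required columns that are absent from `df`
missing_required <- function(df) {
  setdiff(required_cols, colnames(df))
}

# Toy one-row table that still uses the Skyline export names (FileName, Area)
toy <- data.frame(
  ProteinName = "bovine", PeptideSequence = "HWGSSDDQGSEHTVDR",
  PrecursorCharge = 2, FragmentIon = "precursor", ProductCharge = 2,
  IsotopeLabelType = "light", Condition = 1, BioReplicate = 1,
  FileName = "B06-8004_c.mzXML", Area = 1015466,
  stringsAsFactors = FALSE
)

# Lists the columns that still need to be supplied or renamed
missing_required(toy)
```

An empty result means every required column is present under its expected name.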

This is the 'MSstats input' format exported from Skyline with the 'MSstats_report.skyr' report template. The column names 'FileName' and 'Area' should be changed to 'Run' and 'Intensity'. There are two extra columns, 'StandardType' and 'Truncated'. The 'StandardType' column can be used with normalization='globalStandards' in dataProcess. The 'Truncated' column can be used to remove truncated peaks with skylineReport=TRUE in dataProcess.
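The renaming step can be sketched in base R as follows. The two-row table `skyline_export` is a hypothetical stand-in for a Skyline report and carries only the columns relevant to the renaming; its values are illustrative.

```r
# Hypothetical stand-in for a Skyline export (only the relevant columns)
skyline_export <- data.frame(
  FileName = c("B06-8004_c.mzXML", "B06-8006_c.mzXML"),
  Area = c(1015466, 907841),
  StandardType = NA,
  Truncated = c("False", "False"),
  stringsAsFactors = FALSE
)

# Rename the Skyline column names to the names MSstats expects
names(skyline_export)[names(skyline_export) == "FileName"] <- "Run"
names(skyline_export)[names(skyline_export) == "Area"] <- "Intensity"

names(skyline_export)
```

The extra 'StandardType' and 'Truncated' columns are left untouched so that they remain available to dataProcess.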

If the information for one or more columns is not available in the original raw data, retain those columns and fill them with a fixed value. For example, if the original raw data does not contain PrecursorCharge and ProductCharge, keep both columns and enter NA for all transitions in the raw data.
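One way to add the unavailable columns, shown as a sketch on a hypothetical table `raw_partial` that lacks the two charge columns:

```r
# Hypothetical raw data without PrecursorCharge and ProductCharge
raw_partial <- data.frame(
  ProteinName = c("bovine", "bovine"),
  PeptideSequence = c("HWGSSDDQGSEHTVDR", "HWGSSDDQGSEHTVDR"),
  FragmentIon = "precursor",
  stringsAsFactors = FALSE
)

# Add each unavailable column and fill it with NA for every transition
for (col in c("PrecursorCharge", "ProductCharge")) {
  if (!col %in% names(raw_partial)) {
    raw_partial[[col]] <- NA
  }
}

names(raw_partial)
```

Existing columns are not overwritten, so the loop is safe to run on data that already contains one of the two columns.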

The Intensity variable must be the original signal without any log transformation. It can be specified as either the peak height or the peak area under the curve.
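If an upstream tool has already stored intensities on a log scale, they would need to be converted back first. The sketch below assumes a log2 transformation and uses made-up values; the base of the logarithm must be confirmed for the actual export.

```r
# Hypothetical intensities that were stored on the log2 scale
log2_intensity <- c(10, 12.5, 20)

# Undo the log2 transformation to recover the original signal
intensity <- 2^log2_intensity

intensity
```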

data.frame with the required format of MSstats.

Meena Choi, Olga Vitek.

Maintainer: Meena Choi (mnchoi67@gmail.com)

Meena Choi, Ching-Yun Chang, Timothy Clough, Daniel Broudy, Trevor Killeen, Brendan MacLean and Olga Vitek. "MSstats: an R package for statistical analysis of quantitative mass spectrometry-based proteomic experiments" Bioinformatics, 30(17):2524-2526, 2014.

Timothy Clough, Safia Thaminy, Susanne Ragg, Ruedi Aebersold, Olga Vitek. "Statistical protein quantification and significance analysis in label-free LC-MS experiments with complex designs" BMC Bioinformatics, 13:S16, 2012.

head(DDARawData.Skyline)

  ProteinName  PeptideSequence PrecursorCharge FragmentIon ProductCharge
1      bovine HWGSSDDQGSEHTVDR               2   precursor             2
2      bovine HWGSSDDQGSEHTVDR               2   precursor             2
3      bovine HWGSSDDQGSEHTVDR               2   precursor             2
4      bovine HWGSSDDQGSEHTVDR               2   precursor             2
5      bovine HWGSSDDQGSEHTVDR               2   precursor             2
6      bovine HWGSSDDQGSEHTVDR               2   precursor             2
  IsotopeLabelType Condition BioReplicate         FileName    Area StandardType
1            light         1            1 B06-8004_c.mzXML 1015466           NA
2            light         3            2 B06-8006_c.mzXML  907841           NA
3            light         5            3 B06-8008_c.mzXML 1263905           NA
4            light         2            4 B06-8010_c.mzXML 2457121           NA
5            light         4            5 B06-8012_c.mzXML  958204           NA
6            light         6            6 B06-8014_c.mzXML  788090           NA
  Truncated
1     False
2     False
3     False
4     False
5     False
6     False