View source: R/proteinSummarization.R
Missing values are assumed to be censored and are imputed. Protein-level summarization is then performed from the peptide-level quantification. In addition, global median normalization on peptide-level data and normalization between MS runs using reference channels are implemented.
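As a rough illustration of the global median normalization mentioned above, the base R sketch below equalizes the median log2 intensity across every run and channel of a small made-up table. The toy data, the column name log2Norm, and the choice of target (the median of the group medians) are assumptions for illustration only; this is not the package's internal code.

```r
# Toy peptide-level data; the intensities are made up for illustration.
toy <- data.frame(
  Run       = rep(c("run1", "run2"), each = 6),
  Channel   = rep(rep(c("126", "127N"), each = 3), times = 2),
  Intensity = c(100, 200, 400,   200, 400, 800,
                50, 100, 200,    400, 800, 1600)
)
toy$log2Int <- log2(toy$Intensity)

# Median log2 intensity of the Run x Channel group each row belongs to.
group_median <- ave(toy$log2Int, toy$Run, toy$Channel, FUN = median)

# Assumed target: the median of all group medians.
target <- median(tapply(toy$log2Int, list(toy$Run, toy$Channel), median))

# Shift every group so that its median lands on the target.
toy$log2Norm <- toy$log2Int - group_median + target

# After the shift, all Run x Channel medians are equal.
normalized_medians <- tapply(toy$log2Norm, list(toy$Run, toy$Channel), median)
print(normalized_medians)
```

Only a constant is subtracted within each run and channel, so the relative differences between peptides inside a channel are left unchanged.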
data
    Name of the output of the PDtoMSstatsTMTFormat function, or peptide-level quantified data from other tools. It should have the columns ProteinName, PeptideSequence, Charge, PSM, Mixture, TechRepMixture, Run, Channel, Condition, BioReplicate, Intensity.
method
    Method used to summarize to protein level. Four methods are available: "msstats" (default), "MedianPolish", "Median", "LogSum".
global_norm
    Global median normalization on peptide-level data (equalizing the medians across all the channels and MS runs). Default is TRUE. It is performed before protein-level summarization.
reference_norm
    Reference-channel-based normalization between MS runs on protein-level data. TRUE (default) needs at least one reference channel in each MS run, annotated by 'Norm' in the Condition column. It is performed after protein-level summarization. FALSE skips this normalization step. If the data has only one run, then reference_norm=FALSE.
remove_norm_channel
    TRUE (default) removes 'Norm' channels from protein-level data.
remove_empty_channel
    TRUE (default) removes 'Empty' channels from protein-level data.
MBimpute
    Only for method="msstats". TRUE (default) imputes missing values with an accelerated failure model. FALSE imputes the missing values with the minimum value for each peptide precursor ion.
maxQuantileforCensored
    Missing values are assumed to be censored. maxQuantileforCensored is the maximum quantile used to decide which missing values are censored, for instance 0.999. Default is NULL.
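To make the reference_norm and remove_norm_channel descriptions more concrete, here is a minimal base R sketch for a single protein. It assumes that each run is shifted so that its 'Norm' channel abundance matches the median of the 'Norm' abundances over all runs; the toy values, the Normalized column, and that exact rule are illustrative assumptions, not a statement of how the package computes it.

```r
# Toy protein-level abundances (log scale) for one protein in two runs.
# Each run has one reference channel annotated as 'Norm' in Condition.
prot <- data.frame(
  Run       = rep(c("run1", "run2"), each = 3),
  Channel   = rep(c("126", "127N", "131"), times = 2),
  Condition = rep(c("A", "B", "Norm"), times = 2),
  Abundance = c(10, 11, 12,   12, 13, 14)
)

# Reference abundance per run (mean over its 'Norm' channels).
is_norm <- prot$Condition == "Norm"
norm_per_run <- tapply(prot$Abundance[is_norm], prot$Run[is_norm], mean)

# Assumed target: median of the per-run reference abundances.
target <- median(norm_per_run)

# Shift every channel of a run by the same run-specific offset.
offset <- unname(norm_per_run[as.character(prot$Run)])
prot$Normalized <- prot$Abundance - offset + target

# Analogous to remove_norm_channel = TRUE: drop the reference channels.
prot_no_norm <- prot[prot$Condition != "Norm", ]
print(prot_no_norm)
```

In this toy case the two runs differ by a constant offset of 2, so after the shift conditions A and B have identical values in both runs.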
data.frame with protein-level summarization for each run and channel
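The idea of collapsing several peptide measurements into one value per run and channel can be sketched with three of the listed methods on a toy matrix. The matrix values are made up, and the formulas below (column median, log2 of summed intensities, and overall plus column effect from stats::medpolish) are a plausible reading of "Median", "LogSum" and "MedianPolish", not a confirmed description of the package internals.

```r
# Toy log2 intensities for one protein in one run:
# rows are peptide features, columns are channels.
log2_int <- matrix(
  c(10, 11, 12,
    11, 12, 13,
    12, 13, 17),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("pep", 1:3), c("126", "127N", "127C"))
)

# "Median"-style summary: median of the log2 intensities per channel.
median_summary <- apply(log2_int, 2, median)

# "LogSum"-style summary: log2 of the summed raw intensities per channel.
logsum_summary <- log2(colSums(2^log2_int))

# "MedianPolish"-style summary: overall effect plus channel (column) effect
# from Tukey's median polish.
mp <- medpolish(log2_int, trace.iter = FALSE)
medpolish_summary <- mp$overall + mp$col

print(rbind(median_summary, logsum_summary, medpolish_summary))
```

The outlying value 17 in the last channel pulls the log-sum summary upward, while the median-based summaries are less sensitive to it.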
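library(MSstatsTMT)  # provides proteinSummarization() and the input.pd example data
data(input.pd)
# Summarize to protein level with the default method and both normalization steps.
quant.pd.msstats <- proteinSummarization(input.pd,
                                         method="msstats",
                                         global_norm=TRUE,
                                         reference_norm=TRUE)
head(quant.pd.msstats)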
Joining, by = c("Run", "Channel")
Summarizing for Run : 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw ( 1 of 15 )
** Use all features that the dataset origianally has.
Summary of Features :
count
# of Protein 10
# of Peptides/Protein 4-29
# of Transitions/Peptide 1-1
Summary of Samples :
0.125 0.5 0.667 1 Norm
# of MS runs 2 2 2 2 2
# of Biological Replicates 1 1 1 1 1
# of Technical Replicates 2 2 2 2 2
Summary of Missingness :
# transitions are completely missing in at least one of the conditions : 1
-> [K].sTPSGFTLDDVIQTGVDNPGHPYIMTVGcVAGDEESYEVFk.[D]_4_NA_NA ...
# run with 75% missing observations: 0
== Start the summarization per subplot...
|======================================================================| 100%
== the summarization per subplot is done.
** Protein-level summarization done by MSstats.
Summarizing for Run : 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_02.raw ( 2 of 15 )
** Use all features that the dataset origianally has.
Summary of Features :
count
# of Protein 10
# of Peptides/Protein 3-33
# of Transitions/Peptide 1-1
Summary of Samples :
0.125 0.5 0.667 1 Norm
# of MS runs 2 2 2 2 2
# of Biological Replicates 1 1 1 1 1
# of Technical Replicates 2 2 2 2 2
Summary of Missingness :
# transitions are completely missing in at least one of the conditions : 1
-> [R].eVLGDAVPDEILIEAVLk.[N]_3_NA_NA ...
# run with 75% missing observations: 0
== Start the summarization per subplot...
|======================================================================| 100%
== the summarization per subplot is done.
** Protein-level summarization done by MSstats.
Summarizing for Run : 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_03.raw ( 3 of 15 )
** Use all features that the dataset origianally has.
Summary of Features :
count
# of Protein 10
# of Peptides/Protein 3-29
# of Transitions/Peptide 1-1
Summary of Samples :
0.125 0.5 0.667 1 Norm
# of MS runs 2 2 2 2 2
# of Biological Replicates 1 1 1 1 1
# of Technical Replicates 2 2 2 2 2
Summary of Missingness :
# transitions are completely missing in at least one of the conditions : 1
-> [K].qQQDQVDr.[N]_2_NA_NA ...
# run with 75% missing observations: 0
== Start the summarization per subplot...
|======================================================================| 100%
== the summarization per subplot is done.
** Protein-level summarization done by MSstats.
Summarizing for Run : 161117_SILAC_HeLa_UPS1_TMT10_Mixture2_01.raw ( 4 of 15 )
** Use all features that the dataset origianally has.
Summary of Features :
count
# of Protein 10
# of Peptides/Protein 1-28
# of Transitions/Peptide 1-1
** 1 Proteins have only single transition : Consider excluding this protein from the dataset. (Q9Y450)
Summary of Samples :
0.125 0.5 0.667 1 Norm
# of MS runs 2 2 2 2 2
# of Biological Replicates 1 1 1 1 1
# of Technical Replicates 2 2 2 2 2
Summary of Missingness :
# transitions are completely missing in at least one of the conditions : 1
-> [K].dYEFMWNPHLGYILTcPSNLGTGLr.[A]_3_NA_NA ...
# run with 75% missing observations: 0
== Start the summarization per subplot...
|======================================================================| 100%
== the summarization per subplot is done.
** Protein-level summarization done by MSstats.
Summarizing for Run : 161117_SILAC_HeLa_UPS1_TMT10_Mixture2_02.raw ( 5 of 15 )
** Use all features that the dataset origianally has.
Summary of Features :
count
# of Protein 10
# of Peptides/Protein 1-30
# of Transitions/Peptide 1-1
** 1 Proteins have only single transition : Consider excluding this protein from the dataset. (Q9Y450)
Summary of Samples :
0.125 0.5 0.667 1 Norm
# of MS runs 2 2 2 2 2
# of Biological Replicates 1 1 1 1 1
# of Technical Replicates 2 2 2 2 2
Summary of Missingness :
# transitions are completely missing in at least one of the conditions : 2
-> [K].qQQDQVDr.[N]_2_NA_NA, [R].nLPQYVSNELLEEAFSVFGQVEr.[A]_3_NA_NA ...
# run with 75% missing observations: 0
== Start the summarization per subplot...
|======================================================================| 100%
== the summarization per subplot is done.
** Protein-level summarization done by MSstats.
Summarizing for Run : 161117_SILAC_HeLa_UPS1_TMT10_Mixture2_03.raw ( 6 of 15 )
** Use all features that the dataset origianally has.
Summary of Features :
count
# of Protein 10
# of Peptides/Protein 2-30
# of Transitions/Peptide 1-1
Summary of Samples :
0.125 0.5 0.667 1 Norm
# of MS runs 2 2 2 2 2
# of Biological Replicates 1 1 1 1 1
# of Technical Replicates 2 2 2 2 2
Summary of Missingness :
# transitions are completely missing in at least one of the conditions : 0
# run with 75% missing observations: 0
== Start the summarization per subplot...
|======================================================================| 100%
== the summarization per subplot is done.
** Protein-level summarization done by MSstats.
Summarizing for Run : 161117_SILAC_HeLa_UPS1_TMT10_Mixture3_01.raw ( 7 of 15 )
** Use all features that the dataset origianally has.
Summary of Features :
count
# of Protein 10
# of Peptides/Protein 4-31
# of Transitions/Peptide 1-1
Summary of Samples :
0.125 0.5 0.667 1 Norm
# of MS runs 2 2 2 2 2
# of Biological Replicates 1 1 1 1 1
# of Technical Replicates 2 2 2 2 2
Summary of Missingness :
# transitions are completely missing in at least one of the conditions : 0
# run with 75% missing observations: 0
== Start the summarization per subplot...
|======================================================================| 100%
== the summarization per subplot is done.
** Protein-level summarization done by MSstats.
Summarizing for Run : 161117_SILAC_HeLa_UPS1_TMT10_Mixture3_02.raw ( 8 of 15 )
** Use all features that the dataset origianally has.
Summary of Features :
count
# of Protein 10
# of Peptides/Protein 3-30
# of Transitions/Peptide 1-1
Summary of Samples :
0.125 0.5 0.667 1 Norm
# of MS runs 2 2 2 2 2
# of Biological Replicates 1 1 1 1 1
# of Technical Replicates 2 2 2 2 2
Summary of Missingness :
# transitions are completely missing in at least one of the conditions : 1
-> [K].vDIVAINDPFIDLNYMVYMFQYDSTHGk.[F]_3_NA_NA ...
# run with 75% missing observations: 0
== Start the summarization per subplot...
|======================================================================| 100%
== the summarization per subplot is done.
** Protein-level summarization done by MSstats.
Summarizing for Run : 161117_SILAC_HeLa_UPS1_TMT10_Mixture3_03.raw ( 9 of 15 )
** Use all features that the dataset origianally has.
Summary of Features :
count
# of Protein 10
# of Peptides/Protein 5-30
# of Transitions/Peptide 1-1
Summary of Samples :
0.125 0.5 0.667 1 Norm
# of MS runs 2 2 2 2 2
# of Biological Replicates 1 1 1 1 1
# of Technical Replicates 2 2 2 2 2
Summary of Missingness :
# transitions are completely missing in at least one of the conditions : 3
-> [K].vDIVAINDPFIDLNYMVYMFQYDSTHGk.[F]_3_NA_NA, [R].iPSAVGYQPTLATDMGTMQEr.[I]_2_NA_NA, [R].gAMPPAPVPAGTPAPPGPATMMPDGTLGLTPPTTEr.[F]_4_NA_NA ...
# run with 75% missing observations: 0
== Start the summarization per subplot...
|======================================================================| 100%
== the summarization per subplot is done.
** Protein-level summarization done by MSstats.
Summarizing for Run : 161117_SILAC_HeLa_UPS1_TMT10_Mixture4_01.raw ( 10 of 15 )
** Use all features that the dataset origianally has.
Summary of Features :
count
# of Protein 10
# of Peptides/Protein 3-31
# of Transitions/Peptide 1-1
Summary of Samples :
0.125 0.5 0.667 1 Norm
# of MS runs 2 2 2 2 2
# of Biological Replicates 1 1 1 1 1
# of Technical Replicates 2 2 2 2 2
Summary of Missingness :
# transitions are completely missing in at least one of the conditions : 2
-> [K].sTPSGFTLDDVIQTGVDNPGHPYIMTVGcVAGDEESYEVFk.[D]_4_NA_NA, [K].qQQDQVDr.[N]_2_NA_NA ...
# run with 75% missing observations: 0
== Start the summarization per subplot...
|======================================================================| 100%
== the summarization per subplot is done.
** Protein-level summarization done by MSstats.
Summarizing for Run : 161117_SILAC_HeLa_UPS1_TMT10_Mixture4_02.raw ( 11 of 15 )
** Use all features that the dataset origianally has.
Summary of Features :
count
# of Protein 10
# of Peptides/Protein 3-31
# of Transitions/Peptide 1-1
Summary of Samples :
0.125 0.5 0.667 1 Norm
# of MS runs 2 2 2 2 2
# of Biological Replicates 1 1 1 1 1
# of Technical Replicates 2 2 2 2 2
Summary of Missingness :
# transitions are completely missing in at least one of the conditions : 1
-> [R].fcTGLTQIETLFk.[S]_2_NA_NA ...
# run with 75% missing observations: 0
== Start the summarization per subplot...
|======================================================================| 100%
== the summarization per subplot is done.
** Protein-level summarization done by MSstats.
Summarizing for Run : 161117_SILAC_HeLa_UPS1_TMT10_Mixture4_03.raw ( 12 of 15 )
** Use all features that the dataset origianally has.
Summary of Features :
count
# of Protein 10
# of Peptides/Protein 1-31
# of Transitions/Peptide 1-1
** 1 Proteins have only single transition : Consider excluding this protein from the dataset. (Q9Y450)
Summary of Samples :
0.125 0.5 0.667 1 Norm
# of MS runs 2 2 2 2 2
# of Biological Replicates 1 1 1 1 1
# of Technical Replicates 2 2 2 2 2
Summary of Missingness :
# transitions are completely missing in at least one of the conditions : 0
# run with 75% missing observations: 0
== Start the summarization per subplot...
|======================================================================| 100%
== the summarization per subplot is done.
** Protein-level summarization done by MSstats.
Summarizing for Run : 161117_SILAC_HeLa_UPS1_TMT10_Mixture5_01.raw ( 13 of 15 )
** Use all features that the dataset origianally has.
Summary of Features :
count
# of Protein 10
# of Peptides/Protein 3-34
# of Transitions/Peptide 1-1
Summary of Samples :
0.125 0.5 0.667 1 Norm
# of MS runs 2 2 2 2 2
# of Biological Replicates 1 1 1 1 1
# of Technical Replicates 2 2 2 2 2
Summary of Missingness :
# transitions are completely missing in at least one of the conditions : 0
# run with 75% missing observations: 0
== Start the summarization per subplot...
|======================================================================| 100%
== the summarization per subplot is done.
** Protein-level summarization done by MSstats.
Summarizing for Run : 161117_SILAC_HeLa_UPS1_TMT10_Mixture5_02.raw ( 14 of 15 )
** Use all features that the dataset origianally has.
Summary of Features :
count
# of Protein 10
# of Peptides/Protein 2-30
# of Transitions/Peptide 1-1
Summary of Samples :
0.125 0.5 0.667 1 Norm
# of MS runs 2 2 2 2 2
# of Biological Replicates 1 1 1 1 1
# of Technical Replicates 2 2 2 2 2
Summary of Missingness :
# transitions are completely missing in at least one of the conditions : 3
-> [R].mGQMAMGGAmGINNr.[G]_2_NA_NA, [R].nLPQYVSNELLEEAFSVFGQVER.[A]_3_NA_NA, [R].dQNAEQIr.[L]_2_NA_NA ...
# run with 75% missing observations: 0
== Start the summarization per subplot...
|======================================================================| 100%
== the summarization per subplot is done.
** Protein-level summarization done by MSstats.
Summarizing for Run : 161117_SILAC_HeLa_UPS1_TMT10_Mixture5_03.raw ( 15 of 15 )
** Use all features that the dataset origianally has.
Summary of Features :
count
# of Protein 10
# of Peptides/Protein 5-32
# of Transitions/Peptide 1-1
Summary of Samples :
0.125 0.5 0.667 1 Norm
# of MS runs 2 2 2 2 2
# of Biological Replicates 1 1 1 1 1
# of Technical Replicates 2 2 2 2 2
Summary of Missingness :
# transitions are completely missing in at least one of the conditions : 4
-> [K].gFQQILAGEYDHLPEQAFYmVGPIEEAVAk.[A]_3_NA_NA, [K].qFAPIHAEAPEFMEMSVEQEILVTGIk.[V]_4_NA_NA, [K].qQQDQVDr.[N]_2_NA_NA, [R].dQNAEQIr.[L]_2_NA_NA ...
# run with 75% missing observations: 0
== Start the summarization per subplot...
|======================================================================| 100%
== the summarization per subplot is done.
** Protein-level summarization done by MSstats.
Normalization between MS runs for Protein : P04406 ( 1 of 10 )
Normalization between MS runs for Protein : P06576 ( 2 of 10 )
Normalization between MS runs for Protein : P12277 ( 3 of 10 )
Normalization between MS runs for Protein : P23919 ( 4 of 10 )
Normalization between MS runs for Protein : P31947 ( 5 of 10 )
Normalization between MS runs for Protein : Q15233 ( 6 of 10 )
Normalization between MS runs for Protein : Q16181 ( 7 of 10 )
Normalization between MS runs for Protein : Q9NSD9 ( 8 of 10 )
Normalization between MS runs for Protein : Q9UGP8 ( 9 of 10 )
Normalization between MS runs for Protein : Q9Y450 ( 10 of 10 )
Run Protein Abundance Channel
1 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw P04406 16.59812 127C
2 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw P04406 16.55729 129N
3 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw P04406 16.71783 128N
4 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw P04406 16.67190 129C
5 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw P04406 16.51106 127N
6 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw P04406 16.49448 130C
BioReplicate Condition TechRepMixture Mixture
1 Mixture1_0.125 0.125 1 Mixture1
2 Mixture1_0.125 0.125 1 Mixture1
3 Mixture1_0.5 0.5 1 Mixture1
4 Mixture1_0.5 0.5 1 Mixture1
5 Mixture1_0.667 0.667 1 Mixture1
6 Mixture1_0.667 0.667 1 Mixture1
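As a small follow-up to the printed output, the six rows shown by head() can be averaged per condition with base R. The data frame below re-enters only the values printed above; the object names head_rows and cond_means are hypothetical, and in practice the same call could be applied to the full quant.pd.msstats result.

```r
# The six rows printed by head(), re-entered by hand (protein P04406,
# run 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw).
head_rows <- data.frame(
  Protein   = "P04406",
  Channel   = c("127C", "129N", "128N", "129C", "127N", "130C"),
  Condition = c("0.125", "0.125", "0.5", "0.5", "0.667", "0.667"),
  Abundance = c(16.59812, 16.55729, 16.71783, 16.67190, 16.51106, 16.49448)
)

# Mean summarized abundance per protein and condition.
cond_means <- aggregate(Abundance ~ Protein + Condition,
                        data = head_rows, FUN = mean)
print(cond_means)
```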