ProteinInference: Protein inference for aLFQ import data frame


Description

Protein inference for aLFQ import data frame.

Usage

## Default S3 method:
ProteinInference(data, peptide_method = "top", peptide_topx = 2,
  peptide_strictness = "strict", peptide_summary = "mean", transition_topx = 3,
  transition_strictness = "strict", transition_summary = "sum", fasta = NA,
  apex_model = NA, combine_precursors = FALSE, combine_peptide_sequences = FALSE,
  consensus_proteins = TRUE, consensus_peptides = TRUE,
  consensus_transitions = TRUE, scampi_method = "LSE",
  scampi_iterations = 10, scampi_outliers = FALSE, scampi_outliers_iterations = 2,
  scampi_outliers_threshold = 2, ...)

Arguments

data

a mandatory data frame containing the columns "run_id", "protein_id", "protein_intensity", and "concentration" for quantification on the protein level. For quantification on the peptide level, the columns "run_id", "protein_id", "peptide_id", "peptide_sequence", "precursor_charge", "peptide_intensity" and "concentration" are required. For quantification on the transition level, the columns "protein_id", "peptide_id", "transition_id", "peptide_sequence", "precursor_charge", "transition_intensity" and "concentration" are required. The id columns can be defined in any format, while the "_intensity" and "concentration" columns need to be numeric and in non-log form. The data may contain calibration data (with numeric "concentration") and test data (with "concentration" = "?").

peptide_method

one of "top", "all", "iBAQ", "APEX", "NSAF" or "SCAMPI" peptide to protein intensity estimation methods.

peptide_topx

("top" only:) a positive integer value of the top x peptides to consider for "top" methods.

peptide_strictness

("top" only:) whether peptide_topx should only consider proteins with the minimal peptide number ("strict") or all ("loose").

peptide_summary

("top" and "all" only:) how to summarize the peptide intensities: "mean", "median", "sum".

transition_topx

a positive integer value of the top x transitions to consider for transition to peptide intensity estimation methods.

transition_strictness

whether transition_topx should only consider peptides with the minimal transition number ("strict") or all ("loose").

transition_summary

how to summarize the transition intensities: "mean", "median", "sum".

fasta

("iBAQ", "APEX", "NSAF" and "SCAMPI" only:) the path and filename to an amino acid fasta file containing the proteins of interest.

apex_model

("APEX" only:) The "APEX" model to use (see APEX).

combine_precursors

whether to sum all precursors of the same peptide.

combine_peptide_sequences

whether to sum all variants (modifications) of the same peptide sequence.

consensus_proteins

if multiple runs are provided, select identical proteins among all runs.

consensus_peptides

if multiple runs are provided, select identical peptides among all runs.

consensus_transitions

if multiple runs are provided, select identical transitions among all runs.

scampi_method

(SCAMPI only:) Describes which method should be used for the parameter estimation. Available: method="LSE", method="MLE". See details of runScampi or iterateScampi.

scampi_iterations

(SCAMPI only:) Only used with scampi_method="MLE". See details of runScampi or iterateScampi.

scampi_outliers

(SCAMPI only:) Whether runScampi (FALSE) or iterateScampi (TRUE) should be used. See details of runScampi or iterateScampi.

scampi_outliers_iterations

(SCAMPI only:) Number of estimation/outlier-removal iterations to be performed. See details of iterateScampi.

scampi_outliers_threshold

(SCAMPI only:) Constant to tune the outlier selection process. See details of iterateScampi.

...

future extensions.
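The column requirements for the data argument can be checked before calling ProteinInference. The following is a minimal sketch in base R, not part of aLFQ: all identifiers and intensity values are invented for illustration, and the "run_id" column is included on the assumption that it is useful on the transition level as well, since the protein and peptide levels list it.

```r
# Hypothetical transition-level input; every identifier and value is invented.
transition_data <- data.frame(
  run_id = "run1",                          # assumption: not listed for this level
  protein_id = "PROT_A",
  peptide_id = "PEPTIDEK_2",
  transition_id = c("PEPTIDEK_2_y3", "PEPTIDEK_2_y4", "PEPTIDEK_2_y5"),
  peptide_sequence = "PEPTIDEK",
  precursor_charge = 2,
  transition_intensity = c(1200, 800, 450), # numeric, non-log intensities
  concentration = "?",                      # "?" marks test data
  stringsAsFactors = FALSE
)

# Columns the documentation lists for transition-level quantification.
required <- c("protein_id", "peptide_id", "transition_id", "peptide_sequence",
              "precursor_charge", "transition_intensity", "concentration")

# Report any required column that is absent and check the intensity type.
missing_cols <- setdiff(required, names(transition_data))
stopifnot(length(missing_cols) == 0,
          is.numeric(transition_data$transition_intensity))
```

A check like this fails early with a clear message instead of deep inside the inference step.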

Details

The ProteinInference module provides functionality to infer protein quantities from the measured precursor or fragment intensities or peptide spectral counts. If the dataset contains targeted MS2-level data, the paired precursor and fragment ion signals (the transitions) are first summarized to the precursor level. Different methods for aggregation can be specified, e.g. sum, mean or median, and a limit for the selection of the most intense transitions can be provided. It is further possible to exclude precursors that do not have enough transitions to meet this limit. To summarize precursor intensities or spectral counts to theoretical protein intensities, the mean, TopN (Silva et al., 2006; Malmstrom et al., 2009; Schmidt et al., 2011; Ludwig et al., 2012), APEX (Lu et al., 2006), iBAQ (Schwanhausser et al., 2011), NSAF (Zybailov et al., 2006) and SCAMPI (Gerster et al., 2014) protein intensity estimators are provided. For APEX, iBAQ, NSAF and SCAMPI, the protein database in FASTA format needs to be supplied. For targeted data acquisition, both the APEX and iBAQ methods require that all peptides of a protein are targeted. The results are reported in the same unified data structure as in the previous step.
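The two-stage summarization described above can be illustrated in base R. This is a sketch, not the aLFQ implementation: the helper summarize_topx and all values are hypothetical, and "strict" is read here as discarding groups with fewer than x members.

```r
# Hypothetical transition intensities for three precursors of one protein.
transitions <- data.frame(
  peptide_id = c("pepA", "pepA", "pepA", "pepA",
                 "pepB", "pepB", "pepB",
                 "pepC", "pepC"),
  transition_intensity = c(100, 400, 300, 200, 50, 70, 60, 10, 20),
  stringsAsFactors = FALSE
)

# Hypothetical helper: summarize the topx most intense values of x.
# With strict = TRUE, groups with fewer than topx values return NA.
summarize_topx <- function(x, topx = 3, strict = TRUE, summary_fun = sum) {
  if (strict && length(x) < topx) return(NA_real_)
  summary_fun(head(sort(x, decreasing = TRUE), topx))
}

# Stage 1: transitions to precursors (top 3, strict, sum).
precursor_intensity <- sapply(
  split(transitions$transition_intensity, transitions$peptide_id),
  summarize_topx
)

# Stage 2: precursors to protein (top 2, strict, mean), after dropping
# precursors that did not meet the transition limit.
kept <- precursor_intensity[!is.na(precursor_intensity)]
protein_intensity <- summarize_topx(kept, topx = 2, summary_fun = mean)
```

In this toy input, pepC has only two transitions and is dropped under the strict setting, so the protein estimate is based on pepA and pepB alone.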

Value

A standard aLFQ import data frame on protein level.

Author(s)

George Rosenberger gr2578@cumc.columbia.edu

References

Silva, J. C., Gorenstein, M. V., Li, G.-Z., Vissers, J. P. C. & Geromanos, S. J. Absolute quantification of proteins by LCMSE: a virtue of parallel MS acquisition. Mol. Cell Proteomics 5, 144-156 (2006).

Malmstrom, J. et al. Proteome-wide cellular protein concentrations of the human pathogen Leptospira interrogans. Nature 460, 762-765 (2009).

Schmidt, A. et al. Absolute quantification of microbial proteomes at different states by directed mass spectrometry. Molecular Systems Biology 7, 1-16 (2011).

Ludwig, C., Claassen, M., Schmidt, A. & Aebersold, R. Estimation of Absolute Protein Quantities of Unlabeled Samples by Selected Reaction Monitoring Mass Spectrometry. Molecular & Cellular Proteomics 11, M111.013987-M111.013987 (2012).

Lu, P., Vogel, C., Wang, R., Yao, X. & Marcotte, E. M. Absolute protein expression profiling estimates the relative contributions of transcriptional and translational regulation. Nat Biotech 25, 117-124 (2006).

Schwanhausser, B. et al. Global quantification of mammalian gene expression control. Nature 473, 337-342 (2011).

Zybailov, B. et al. Statistical Analysis of Membrane Proteome Expression Changes in Saccharomyces cerevisiae. J. Proteome Res. 5, 2339-2347 (2006).

Gerster, S., Kwon, T., Ludwig, C., Matondo, M., Vogel, C., Marcotte, E. M., Aebersold, R. & Buhlmann, P. Statistical approach to protein quantification. Molecular & Cellular Proteomics 13, M112.02445 (2014).

See Also

import, AbsoluteQuantification, ALF, APEX, apexFeatures, proteotypic, runScampi, iterateScampi

Examples

data(UPS2MS)

data_ProteinInference <- ProteinInference(UPS2_SRM)
print(data_ProteinInference)
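
A variation with explicit, non-default settings might look like the following. This is a sketch that assumes the aLFQ package is installed and that the UPS2_SRM object from the example above is available; the argument names and values are taken from the Usage and Arguments sections, and the chosen combination is illustrative only.

```r
# Runs only if aLFQ is installed; otherwise the block is skipped.
if (requireNamespace("aLFQ", quietly = TRUE)) {
  library(aLFQ)
  data(UPS2MS)

  # Top-3 peptides per protein, summarized by the median, with "loose"
  # strictness so that proteins and peptides below the limits are kept.
  top3_median <- ProteinInference(UPS2_SRM,
    peptide_method = "top", peptide_topx = 3,
    peptide_strictness = "loose", peptide_summary = "median",
    transition_topx = 3, transition_strictness = "loose",
    transition_summary = "sum")

  print(head(top3_median))
}
```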

aLFQ/aLFQ documentation built on May 10, 2019, 1:59 a.m.