infer_parsimonious_accessions: Eliminates Redundancy in Peptide-to-Protein Mapping


Description

Infers a parsimonious set of accessions (e.g. proteins) that explains all the peptide sequences. The algorithm is a simple loop: it finds the accession that explains the most peptides, records the peptide-to-accession mapping for that accession, removes those peptides, and then looks for the next best accession. The loop continues until no peptides are left. Besides the MSnID object itself, the method accepts the optional unique_only and prior arguments described under Arguments.
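The loop described above can be sketched in base R on a toy peptide-to-accession table. This is only an illustration of the greedy idea, not the package's implementation: the data, the column names, and the helper greedy_parsimony are hypothetical, and ties are broken by alphabetical order of the accession name.

```r
# Toy peptide-to-accession map (hypothetical data, not from the package).
pairs <- data.frame(
  peptide   = c("PEP1", "PEP2", "PEP3", "PEP1", "PEP2", "PEP4", "PEP4"),
  accession = c("A",    "A",    "A",    "B",    "B",    "B",    "C"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: greedy selection of accessions.
greedy_parsimony <- function(pairs) {
  kept <- pairs[0, ]                          # accumulates the chosen mapping
  while (nrow(pairs) > 0) {
    counts <- table(pairs$accession)          # peptides explained per accession
    best <- names(counts)[which.max(counts)]  # ties go to the first name
    hit <- pairs[pairs$accession == best, ]   # mapping for the best accession
    kept <- rbind(kept, hit)
    # Drop every remaining row for the peptides that were just explained.
    pairs <- pairs[!(pairs$peptide %in% hit$peptide), ]
  }
  kept
}

result <- greedy_parsimony(pairs)
print(result)
```

In this toy case accession A is picked first because it explains three peptides, B is then picked for the remaining peptide, and C is never needed.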

Usage

    infer_parsimonious_accessions(object, unique_only=FALSE, prior=character(0))

Arguments

object

An instance of class "MSnID".

unique_only

If TRUE, peptides mapping to multiple accessions are dropped and only unique peptides are retained. If FALSE, shared peptides are assigned according to the Occam's razor rule: a shared peptide is assigned to the protein with the larger number of unique peptides. If the number of unique peptides is the same, the peptide is assigned to the first accession. Default is FALSE.

prior

(character) Character vector of accessions (proteins) justified by prior evidence. If unique_only == TRUE, the prior argument is ignored. Essentially, evidence from the presence of a unique peptide supersedes any prior. Default is character(0), that is, none.
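To make the distinction between unique and shared peptides concrete, here is a minimal base R sketch on toy data. It only illustrates the rules stated for unique_only; the data and the helper assign_shared are hypothetical and are not part of MSnID, and the handling of prior is not shown.

```r
# Toy peptide-to-accession map (hypothetical data).
pairs <- data.frame(
  peptide   = c("PEP1", "PEP2", "PEP2", "PEP3", "PEP4"),
  accession = c("A",    "A",    "B",    "B",    "A"),
  stringsAsFactors = FALSE
)

# Number of distinct accessions each peptide maps to.
n_acc <- tapply(pairs$accession, pairs$peptide,
                function(a) length(unique(a)))

# Analogue of unique_only = TRUE: keep peptides with exactly one accession.
unique_peptides <- names(n_acc)[n_acc == 1]
unique_pairs <- pairs[pairs$peptide %in% unique_peptides, ]
print(unique_pairs)

# Analogue of unique_only = FALSE: a shared peptide goes to the candidate
# accession with the most unique peptides; which.max() keeps the first
# candidate on a tie.
assign_shared <- function(cands) {
  n <- sapply(cands, function(a) sum(unique_pairs$accession == a))
  cands[which.max(n)]
}
shared <- pairs[!(pairs$peptide %in% unique_peptides), ]
shared_assignment <- tapply(shared$accession, shared$peptide, assign_shared)
print(shared_assignment)
```

Here PEP2 is shared between A and B. Accession A has two unique peptides and B has one, so the shared peptide is assigned to A.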

Details

Although the algorithm is rather simple, it is the algorithm used for inferring maximal matching in bipartite graphs, and it is used in the IDPicker software.

Value

Returns an instance of "MSnID" with the minimal set of proteins necessary to explain all the peptide sequences.

Author(s)

Vladislav A Petyuk vladislav.petyuk@pnnl.gov

See Also

MSnID

Examples

data(c_elegans)

# explicitly adding parameters that will be used for data filtering
msnidObj$msmsScore <- -log10(msnidObj$`MS-GF:SpecEValue`)
msnidObj$absParentMassErrorPPM <- abs(mass_measurement_error(msnidObj))

# Quick-and-dirty filter. It is deliberately too stringent, to save time
# at the minimal protein set inference step.
msnidObj <- apply_filter(msnidObj, 'msmsScore > 12 & absParentMassErrorPPM < 2')

show(msnidObj)
msnidObj2 <- infer_parsimonious_accessions(msnidObj)
show(msnidObj2)

vladpetyuk/MSnID documentation built on June 25, 2021, 6:35 a.m.