map_peptides: Map peptides to their locations within a protein


Description

Takes a ThermoFisher MSF file and finds the location of each peptide within its corresponding protein sequence. In cases where a single peptide maps to multiple locations within a protein sequence, only the first location is reported. If a peptide maps ambiguously to multiple proteins, all locations are reported with data from each peptide-protein combination on a separate row.
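The "first location only" rule can be illustrated with base R string matching. This is a minimal sketch with made-up sequences; it is not necessarily how map_peptides is implemented internally, and the variable names are illustrative assumptions.

```r
# Hypothetical sequences, for illustration only.
protein <- "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQAYIAK"
peptide <- "AYIAK"

# regexpr() reports only the first match (1-based), or -1 if there is none.
start <- as.integer(regexpr(peptide, protein, fixed = TRUE))
end <- start + nchar(peptide) - 1L

# gregexpr() lists every match, which exposes the later occurrence
# that a first-match rule leaves unreported.
all_starts <- as.integer(gregexpr(peptide, protein, fixed = TRUE)[[1]])

print(c(start = start, end = end))
print(all_starts)
```

Here the peptide occurs twice in the protein, but only the first occurrence would contribute a start and end position.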

Usage

map_peptides(msf_file, min_conf = "High", prot_regex = "")

Arguments

msf_file

A file path to a ThermoFisher MSF file.

min_conf

"High", "Medium", or "Low". The minimum peptide confidence level to retrieve from the MSF file.

prot_regex

Regular expression whose capture group matches a protein name or ID in the protein description. The regex must contain exactly ONE group. The protein description is typically generated from the FASTA reference file that was used for the database search.
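As a rough illustration of a prot_regex with a single capture group, the sketch below applies one to a made-up protein description using base R. The description string and the pattern are assumptions chosen for the example; the actual descriptions depend on the FASTA file used for the search, and the package may apply the regex differently.

```r
# Hypothetical protein description in a UniProt-like FASTA header style.
protein_desc <- "sp|P12345|EXAMPLE_HUMAN Example protein OS=Homo sapiens"

# Exactly one capture group: the accession between the first two pipes.
prot_regex <- "^[a-z]+\\|([^|]+)\\|"

# regexec() returns the full match followed by one entry per capture group.
m <- regmatches(protein_desc, regexec(prot_regex, protein_desc))[[1]]
protein_name <- m[2]

print(protein_name)
```

A pattern with more than one group would make it ambiguous which group holds the protein name, which is presumably why exactly one group is required.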

Value

A data frame containing start and end positions (relative to the parent protein sequence) for each peptide in the database, with the following columns:

peptide_id

a unique peptide ID

spectrum_id

a unique spectrum ID

protein_id

unique protein group ID to which this peptide maps

protein_desc

protein description from reference database used to assign peptides to protein groups, parsed according to prot_regex

peptide_sequence

amino acid sequence (does not show post-translational modifications)

pep_score

posterior error probability (PEP) score

q_value

Q-value score

protein_sequence

parent protein sequence

start

start position of peptide within protein sequence

end

end position of peptide within protein sequence
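One way to sanity-check the start and end columns is to cut the peptide back out of the parent sequence. The sketch below uses a mock data frame containing only a subset of the documented columns, with made-up values, and assumes positions are 1-based and inclusive (R's usual convention), which this page does not state explicitly.

```r
# Mock result with a subset of the documented columns (values are made up).
peptides <- data.frame(
  peptide_id = c(1L, 2L),
  peptide_sequence = c("AYIAK", "SHFSR"),
  protein_sequence = rep("MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ", 2),
  start = c(4L, 17L),
  end = c(8L, 21L),
  stringsAsFactors = FALSE
)

# Assumption: start and end are 1-based and inclusive.
extracted <- substr(peptides$protein_sequence, peptides$start, peptides$end)
peptides$matches <- extracted == peptides$peptide_sequence
peptides$length <- peptides$end - peptides$start + 1L

print(peptides[, c("peptide_sequence", "start", "end", "matches", "length")])
```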

Examples

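A minimal usage sketch, assuming the parsemsf package is installed and an MSF file is available. The file path and the prot_regex pattern are placeholders, not files or settings shipped with the package; the call is skipped when either the package or the file is missing.

```r
# Hypothetical path; replace with a real ThermoFisher MSF file.
msf_path <- "path/to/search_results.msf"

peptides <- NULL
if (requireNamespace("parsemsf", quietly = TRUE) && file.exists(msf_path)) {
  peptides <- parsemsf::map_peptides(
    msf_path,
    min_conf = "Medium",                    # minimum confidence level to retrieve
    prot_regex = "^[a-z]+\\|([^|]+)\\|"     # illustrative pattern with one group
  )
  # Inspect the mapped positions for the first few peptides.
  print(head(peptides[, c("peptide_sequence", "protein_desc", "start", "end")]))
}
```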

benjaminjack/parsemsf documentation built on May 12, 2019, 11:54 a.m.