extract_pyprophet_data: Read a bunch of scored swath outputs from pyprophet to...

View source: R/proteomics.R

extract_pyprophet_data    R Documentation

Read a bunch of scored swath outputs from pyprophet to acquire their metrics.

Description

This function is mostly cribbed from the other extract_ functions in this file. With it, I hope to provide some metrics for a set of OpenSWATH runs, so that the same run with different options, or different runs altogether, can be compared objectively.

Usage

extract_pyprophet_data(
  metadata,
  pyprophet_column = "diascored",
  savefile = NULL,
  ...
)

Arguments

metadata

Data frame describing the samples, including the mzXML filenames.

pyprophet_column

Name of the metadata column which provides the filenames of the pyprophet scored outputs.

savefile

If not null, save the data from this to the given filename.

...

Extra arguments, presumably color palettes and column names and stuff like that.
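
A minimal sketch of how the arguments above might fit together. The sample names, the condition column, and the file paths are hypothetical placeholders; only pyprophet_column = "diascored" comes from the usage shown above. The call is guarded so that nothing runs unless hpgltools is installed and the placeholder files actually exist.

```r
# Hypothetical sample sheet: sample names, conditions, and paths are
# placeholders, not values from a real experiment.
metadata <- data.frame(
  sampleid = c("s01", "s02"),
  condition = c("control", "treated"),
  diascored = c("results/s01_scored.tsv", "results/s02_scored.tsv"),
  stringsAsFactors = FALSE
)

# Only attempt the call when the package and every scored file are available.
have_pkg <- requireNamespace("hpgltools", quietly = TRUE)
have_files <- all(file.exists(metadata[["diascored"]]))
if (have_pkg && have_files) {
  pyprophet_data <- hpgltools::extract_pyprophet_data(
    metadata,
    pyprophet_column = "diascored"
  )
}
```

Whether the function needs further metadata columns (or specific row names) is not stated here, so treat the data frame layout as an assumption.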

Details

Likely columns generated by exporting OpenMS data via pyprophet include:

transition_group_id: Incrementing ID of the transition in the MS (.pqp) library used for matching (I am pretty sure).
decoy: Is this a match to a decoy peptide?
run_id: A bizarre encoding of the run; OpenMS/pyprophet re-encodes the run ID from the filename as a large signed integer.
filename: Which raw mzXML file provides this particular intensity value?
rt: Retention time in seconds for the matching peak group.
assay_rt: The expected retention time after normalization with the iRT. (How does the iRT change this value?)
delta_rt: The difference between rt and assay_rt.
irt: As described in the abstract of Claudia Escher's 2012 paper: "Here we present iRT, an empirically derived dimensionless peptide-specific value that allows for highly accurate RT prediction. The iRT of a peptide is a fixed number relative to a standard set of reference iRT-peptides that can be transferred across laboratories and chromatographic systems."
assay_irt: The iRT observed in the actual chromatographic run.
delta_irt: The difference between irt and assay_irt. I am seeing that all the delta iRTs are in the -4000 range for our actual experiment; since this is in seconds, does that mean that it is ok as long as they stay in a similar range?
id: Unique long signed integer for the peak group.
sequence: The sequence of the matched peptide.
fullunimodpeptidename: The sequence, but with UniMod-formatted modifications included.
charge: The assumed charge of the observed peptide.
mz: The m/z value of the precursor ion.
intensity: The sum of all transition intensities in the peak group.
aggr_prec_peak_area: Semicolon-separated list of intensities (peak areas) of the MS1 traces for this match.
aggr_prec_peak_apex: Intensity peak apexes of the MS1 traces.
leftwidth: The start of the peak group in seconds.
rightwidth: The end of the peak group in seconds.
peak_group_rank: When multiple peak groups match, which one is this?
d_score: I think this is the score as returned by OpenMS (higher is better).
m_score: I am pretty sure this is the result of a SELECT QVALUE operation in pyprophet.
aggr_peak_area: The peak areas of the fragment ion traces, separated by semicolons.
aggr_peak_apex: The peak apex intensities of the fragment ion traces, separated by semicolons.
aggr_fragment_annotation: Annotations of the fragment ion traces, separated by semicolons.
proteinname: Name of the matching protein.
m_score_protein_run_specific: I am guessing this is the FDR for the p-value for this run.
mass: Mass of the observed fragment.
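
To make a few of these columns concrete, here is a small base R sketch on an invented table. Every value is made up, the 0.01 m_score cutoff is an arbitrary choice, and the sign convention for delta_rt (observed minus expected) is an assumption rather than something stated above.

```r
# Toy stand-in for one pyprophet export; all values are invented.
scored <- data.frame(
  decoy = c(0, 0, 1, 0),
  rt = c(1200.5, 2400.0, 1800.0, 3000.0),
  assay_rt = c(1195.5, 2410.0, 1790.0, 3001.0),
  m_score = c(0.001, 0.2, 0.005, 0.009),
  intensity = c(600, 150, 90, 1000),
  aggr_peak_area = c("100;200;300", "50;100", "40;50", "250;250;500"),
  stringsAsFactors = FALSE
)

# Assumed sign convention: observed rt minus expected assay_rt.
scored[["delta_rt"]] <- scored[["rt"]] - scored[["assay_rt"]]

# Keep non-decoy peak groups at or below an (arbitrary) m_score cutoff.
keep <- scored[["decoy"]] == 0 & scored[["m_score"]] <= 0.01
confident <- scored[keep, ]

# Split the semicolon-separated fragment areas and sum them per peak group.
area_sums <- vapply(
  strsplit(confident[["aggr_peak_area"]], ";", fixed = TRUE),
  function(areas) sum(as.numeric(areas)),
  numeric(1)
)

# In this toy table the summed areas were chosen to match intensity.
all.equal(area_sums, confident[["intensity"]])
```

On real exports the summed fragment areas may not match intensity exactly; the toy values were chosen so that they do.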

Value

List of data from each sample in the pyprophet scored DIA run.
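
As a rough illustration of comparing runs, the sketch below summarizes a named list of per-sample tables. The list here is a hypothetical stand-in built by hand; the real return value may be organized differently, and summarize_sample() is a helper invented for this example.

```r
# Hypothetical stand-in for per-sample data; the structure returned by the
# real function may differ. All values are invented.
sample_data <- list(
  s01 = data.frame(
    decoy = c(0, 0, 1),
    m_score = c(0.001, 0.02, 0.004),
    intensity = c(500, 100, 50)
  ),
  s02 = data.frame(
    decoy = c(0, 0, 0),
    m_score = c(0.002, 0.008, 0.3),
    intensity = c(700, 300, 20)
  )
)

# Invented helper: count confident target peak groups and take their
# median intensity, using an arbitrary m_score cutoff.
summarize_sample <- function(df, cutoff = 0.01) {
  keep <- df[["decoy"]] == 0 & df[["m_score"]] <= cutoff
  c(
    confident_groups = sum(keep),
    median_intensity = median(df[["intensity"]][keep])
  )
}

# One row per sample, one column per metric.
metrics <- t(vapply(sample_data, summarize_sample, numeric(2)))
print(metrics)
```

Keeping one row per sample makes it straightforward to bind the metrics back onto the metadata for plotting.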


elsayed-lab/hpgltools documentation built on May 9, 2024, 5:02 a.m.