pqpq: PQPQ

View source: R/pqpq.R

Description

This function applies the PQPQ algorithm to proteomic data to filter out poorly correlating peptides and cluster the remaining peptides into likely proteoforms.

Usage

pqpq(df, sample_names = NULL, protein_subset = NULL,
  data_type = c("Protein Pilot", "Spectrum Mill", "Proteome Discoverer",
  "Manually annotated"), normalize_data = TRUE, correlation_p_value = 0.4,
  high_confidence_limit = 95, peptide_sum_intensity_limit = 0,
  separate_multiple_protein_IDs = FALSE, manually_annotated_fields = NULL,
  action = c("mark", "filter"))

Arguments

df

The data frame with proteomic data.

sample_names

A character vector identifying the columns that hold the quantitative peptide data. Optional if the data type is not "Manually annotated" and all columns are desired.

protein_subset

The subset of proteins on which to apply the filter.

data_type

One of "Protein Pilot", "Spectrum Mill", "Proteome Discoverer", or "Manually annotated". Used to determine which columns contain each data element.

normalize_data

Should the data be normalized? Default is TRUE for compatibility with the MATLAB version.

correlation_p_value

Correlations with p-values below this threshold are considered significant.

high_confidence_limit

Minimum confidence for a peptide to be considered highly likely to be identified correctly.

peptide_sum_intensity_limit

Minimum intensity for a peptide to be included.

separate_multiple_protein_IDs

If TRUE, peptides assigned to multiple proteins are copied and analyzed once for each protein to which they are assigned. If FALSE, peptides are assigned to the protein group and analyzed once. Default is FALSE.

manually_annotated_fields

If data_type is "Manually annotated", this is a list of the columns needed to complete the filtering: protein_id, confidence, and peptide_ids. See Examples.

Details

This function performs the full PQPQ process, including preprocessing, peptide selection, and filtering.
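
As a rough illustration of the threshold described under correlation_p_value, the sketch below checks whether two peptide intensity profiles correlate significantly, using base R's cor.test. The toy vectors and the helper name is_significant are hypothetical, and the code is not taken from the package source; pqpq may compute and combine its correlations differently.

```r
# Hypothetical intensity profiles for three peptides across six samples.
peptide_a <- c(1.0, 2.1, 2.9, 4.2, 5.1, 5.8)
peptide_b <- c(1.2, 1.9, 3.2, 3.9, 5.0, 6.1)  # tracks peptide_a closely
peptide_c <- c(3.0, 1.0, 4.0, 1.5, 3.5, 2.0)  # unrelated pattern

# Hypothetical helper: TRUE when the two-sided Pearson correlation
# p-value falls below the threshold (0.4 mirrors the pqpq default).
is_significant <- function(x, y, p_limit = 0.4) {
  cor.test(x, y, method = "pearson")$p.value < p_limit
}

is_significant(peptide_a, peptide_b)
is_significant(peptide_a, peptide_c)
```

With a limit as permissive as 0.4, only clearly unrelated profiles such as peptide_c fail the check; lowering p_limit makes the criterion stricter.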

Value

A data frame identifying which peptides are kept, and which proteoforms they are assigned to.

Examples

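# Load the example data set
data("testdata2")
# Quantitative columns: names starting with "Area", dropping the seventh match
sample_names <- stringr::str_subset(names(testdata2), "^Area")[-7]
# Map the fields required for "Manually annotated" data to column names
column_ids <- list(
  protein_id = "Accessions",
  confidence = "Conf",
  peptide_ids = "Sequence"
)

# Run PQPQ with the manually specified columns
result <- pqpq(
  testdata2,
  sample_names = sample_names,
  data_type = "Manually annotated",
  manually_annotated_fields = column_ids
)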
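
The column selection in the example can be tried without the package data. The sketch below builds a hypothetical data frame (toy, with made-up column names and values) and reproduces the selection with base R's grep, which should behave like stringr::str_subset for this simple pattern.

```r
# Hypothetical data frame mimicking an export with eight "Area" columns.
toy <- data.frame(
  Accessions = c("P1", "P1", "P2"),
  Conf = c(99, 96, 80),
  Sequence = c("AAK", "LLR", "GGK")
)
for (i in 113:120) toy[[paste0("Area.", i)]] <- c(1, 2, 3)

# Base R counterpart of stringr::str_subset(names(toy), "^Area"):
# keep the column names that start with "Area".
area_cols <- grep("^Area", names(toy), value = TRUE)

# Negative indexing drops the seventh matching column.
sample_cols <- area_cols[-7]
sample_cols
```

Dropping a column by position depends on the column order in the data frame, so excluding it by name may be more robust when the input layout can change.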

melissakey/PQPQ documentation built on May 4, 2019, 7:42 p.m.