
Abstract

PeCorA (peptide correlation analysis) is a package for detecting quantitative disagreements between peptides mapped to the same protein. It provides an integrated analysis workflow for label-free quantification (LFQ) data and requires tabular input (e.g. a peptides.txt file) as generated by software that quantifies raw mass spectrometry data, such as MaxQuant [@Cox2014]. Functions are provided for data preparation, scaling, linear modeling and statistical testing of differentially discordant peptides. The package also includes tools for building a table of the peptides that disagree. Finally, visualization tools, including boxplot representations, are provided to explore the results.

Installation and loading

Start R and install the PeCorA package:

if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("PeCorA")

Once you have the package installed, load PeCorA into R.

library(PeCorA)

Converting MaxQuant peptide output into PeCorA format

Example dataset: COVID-19 plasma proteomics

We analyze data from a large-scale study of COVID-19 severity. The data set comprises over 100 plasma samples from three groups: (1) COVID-19-driven acute respiratory distress syndrome (ARDS) patients, (2) non-COVID-19-driven ARDS patients, and (3) a pooled plasma control sample extracted with each batch as quality control [@Overmyer2020]. The raw mass spectrometry data were first analyzed using MaxQuant (version 1.6.10.43) [@Cox2014], and the resulting "peptides.txt" file is used as input for the downstream analysis. We filtered out non-relevant information (e.g. amino acid count) and provide this dataset with the PeCorA package.

Loading of the data

data(peptides_data_filtered)

This dataset has the following dimensions:

dim(peptides_data_filtered)

The "Leading.razor.protein", "Sequence" and "LFQ.intensity" columns will be used for subsequent analysis. There are three biological conditions in this experiment, and the condition names captured in the LFQ.intensity column names will be used for comparisons.

Data preparation

You can convert MaxQuant peptide output into PeCorA-ready format using the function import_LFQ_PeCorA.

pecora_format <- import_LFQ_PeCorA(peptides_data_filtered,
                                   protein = 'Leading.razor.protein',
                                   sequence = 'Sequence',
                                   condition1 = 'control',
                                   condition2 = '_COVID',
                                   condition3 = 'NON.COVID')
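The reshaping step can be illustrated with a minimal base-R sketch. It is not the code of import_LFQ_PeCorA: the toy table, its column names, and the output columns (`Protein`, `Peptide`, `Sample`, `Area`, `Condition`) are all assumptions made for illustration. It also shows why a pattern such as `'_COVID'` can be useful: with these hypothetical column names, a bare `"COVID"` pattern would match the NON.COVID samples as well.

```r
# Toy wide table; column names are hypothetical, real MaxQuant output differs.
wide <- data.frame(
  Leading.razor.protein     = c("P1", "P1"),
  Sequence                  = c("AAAK", "CCCR"),
  LFQ.intensity.control_1   = c(100, 200),
  LFQ.intensity._COVID_1    = c(110, 400),
  LFQ.intensity.NON.COVID_1 = c(90, 210),
  stringsAsFactors = FALSE
)

# Pick out the intensity columns by their shared prefix.
intensity_cols <- grep("^LFQ\\.intensity", names(wide), value = TRUE)

# Stack them into long format: one row per peptide per sample.
long <- data.frame(
  Protein = rep(wide$Leading.razor.protein, times = length(intensity_cols)),
  Peptide = rep(wide$Sequence, times = length(intensity_cols)),
  Sample  = rep(intensity_cols, each = nrow(wide)),
  Area    = unlist(wide[intensity_cols], use.names = FALSE),
  stringsAsFactors = FALSE
)

# Label each sample by a fixed-string pattern in its column name.
# "_COVID" does not occur in "NON.COVID_1", so the two groups stay separate.
long$Condition <- NA_character_
long$Condition[grepl("control",   long$Sample, fixed = TRUE)] <- "control"
long$Condition[grepl("_COVID",    long$Sample, fixed = TRUE)] <- "COVID"
long$Condition[grepl("NON.COVID", long$Sample, fixed = TRUE)] <- "NON.COVID"

# A bare "COVID" pattern would be ambiguous for these column names.
n_ambiguous <- sum(grepl("COVID", intensity_cols, fixed = TRUE))
```

Here `n_ambiguous` counts how many intensity columns a bare `"COVID"` pattern would hit, which is a quick check worth running on your own column names before choosing the condition patterns.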

Scaling and centering peptides

PeCorA_preprocessing first filters the values to include only precursors with measured MS1 areas in all samples. Next, the peak areas are log2 transformed, and the global distribution of all peak areas is scaled to have the same center. Finally, each peptide is centered relative to the mean of the control group's peak area.

scaled_peptides <- PeCorA_preprocessing(pecora_format,
                                        area_column_name = 4,
                                        threshold_to_filter = 100,
                                        control_name = "control")
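The preprocessing steps described above can be sketched in base R on toy data. This is an illustration under stated assumptions, not the implementation of PeCorA_preprocessing: the data frame and its column names are hypothetical, "measured" is taken to mean non-missing and above zero, and "same center" is assumed to mean subtracting each sample's median.

```r
# Toy long-format data (hypothetical column names and values).
toy <- data.frame(
  Peptide   = rep(c("AAAK", "CCCR"), each = 4),
  Sample    = rep(c("control_1", "control_2", "COVID_1", "COVID_2"), times = 2),
  Condition = rep(c("control", "control", "COVID", "COVID"), times = 2),
  Area      = c(1000, 1200, 2100, 1900, 400, 420, 390, 410),
  stringsAsFactors = FALSE
)

# 1. Keep only peptides with a usable area in every sample
#    (assumption: usable = not missing and greater than zero).
complete <- tapply(toy$Area, toy$Peptide, function(x) all(!is.na(x) & x > 0))
toy <- toy[toy$Peptide %in% names(complete)[complete], ]

# 2. Log2-transform the peak areas.
toy$log2_area <- log2(toy$Area)

# 3. Give every sample the same center
#    (assumption: subtract the per-sample median).
toy$centered <- toy$log2_area - ave(toy$log2_area, toy$Sample, FUN = median)

# 4. Center each peptide on the mean of its control-group values.
is_control   <- toy$Condition == "control"
control_mean <- tapply(toy$centered[is_control], toy$Peptide[is_control], mean)
toy$scaled   <- toy$centered - as.numeric(control_mean[toy$Peptide])
```

After step 4, every peptide has a control-group mean of zero, so values in the other conditions read directly as log2 differences from control.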

Running PeCorA analysis

PeCorA loops through proteins with >2 peptides and fits a linear model to the peptide precursors of each of those proteins, recording an adjusted p-value within each protein. It builds a data frame of the peptides that disagree, sorted so that the smallest adj_pval values are at the top of the table.

disagree_peptides <- PeCorA(scaled_peptides)
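One way such a per-protein test could be set up is sketched below for a single simulated protein. This is an assumption about the general approach, not the package's internal code: each peptide is compared against all other peptides of the same protein through a condition-by-peptide interaction term, and the interaction p-values are adjusted within the protein. The helper `interaction_p`, the column names, the simulated values, and the choice of the Benjamini-Hochberg correction are all hypothetical.

```r
set.seed(1)

# Simulate one protein with three peptides, 10 samples per condition.
conditions <- rep(c("control", "COVID"), each = 10)
peptides   <- c("pepA", "pepB", "pepC")
toy <- expand.grid(rep = seq_along(conditions), Peptide = peptides,
                   stringsAsFactors = FALSE)
toy$Condition <- conditions[toy$rep]
toy$value     <- rnorm(nrow(toy), sd = 0.2)

# Make pepC discordant: it changes with condition, the others do not.
shift <- toy$Peptide == "pepC" & toy$Condition == "COVID"
toy$value[shift] <- toy$value[shift] + 2

# Hypothetical helper: p-value of the interaction between condition and
# "is this the target peptide or one of the others".
interaction_p <- function(target, data) {
  data$is_target <- data$Peptide == target
  fit <- lm(value ~ Condition * is_target, data = data)
  anova(fit)["Condition:is_target", "Pr(>F)"]
}

pvals <- vapply(peptides, interaction_p, numeric(1), data = toy)

# Adjust within the protein (assumption: Benjamini-Hochberg) and sort.
result <- data.frame(Peptide  = peptides,
                     pval     = pvals,
                     adj_pval = p.adjust(pvals, method = "BH"),
                     stringsAsFactors = FALSE)
result <- result[order(result$adj_pval), ]
```

Sorting by `adj_pval` mirrors the ordering of the table described above, with the most discordant peptide in the first row.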

Plotting PeCorA results

Example boxplot of a significant peptide detected in prothrombin in the COVID-19 plasma proteomics dataset, as in [@Dermit2020].

PeCorA_plotting_plot <- PeCorA_plotting(disagree_peptides,
                                        disagree_peptides[12, ],
                                        scaled_peptides)
PeCorA_plotting_plot
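To show the kind of comparison such a boxplot makes, here is a minimal base-R sketch on simulated data. It does not reproduce the output or styling of PeCorA_plotting; the data frame, its column names, and the grouping into "target peptide" versus "all other peptides" are assumptions for illustration.

```r
set.seed(42)

# Simulated scaled values: one target peptide versus the remaining
# peptides of the same protein, in two conditions.
toy <- data.frame(
  Condition = rep(c("control", "COVID"), each = 20),
  Group     = rep(rep(c("target peptide", "all other peptides"), each = 10),
                  times = 2),
  value     = rnorm(40, sd = 0.3),
  stringsAsFactors = FALSE
)

# Make the target peptide disagree with the others in the COVID group.
idx <- toy$Condition == "COVID" & toy$Group == "target peptide"
toy$value[idx] <- toy$value[idx] + 1.5

# One box per Group x Condition combination.
bp <- boxplot(value ~ Group + Condition, data = toy,
              las = 2, cex.axis = 0.6, xlab = "",
              ylab = "scaled log2 area (simulated)")
```

A discordant peptide shows up as a box that moves with condition while the box for the other peptides stays put.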

Session information

sessionInfo()

References



demar01/PeCorA documentation built on Feb. 4, 2021, 8:44 p.m.