Introduction

The Pepliner package is a set of tools and a Shiny app for visualizing the elution of peptides or proteins across fractions or experiments. In a fractionation mass spectrometry experiment, a protein sample is separated across a biochemical gradient into multiple fractions. Fractions are portions of the original sample, ordered by size or charge depending on the biochemical column used. Applying mass spectrometry proteomics gives the identities of the proteins in each fraction. The fraction in which a protein elutes gives information about its size or charge. If the protein lysate is native (low soap), proteins that are in the same stable protein complex will elute in the same fractions. While we typically look at protein-level patterns, where the observations of the peptides are summed, we additionally find interesting information by looking at the elutions of peptides from the same protein.

Functions from the Pepliner package can be used to process data from fractionation experiments, and the Pepliner Shiny application can be used to interactively plot and format protein-level or peptide-level elutions for a single fractionation experiment or a set of fractionation experiments.


Installation

devtools::install_github("marcottelab/pepliner")

Main pepliner functions

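library(tidyverse)  # read_csv(), %>%, and the dplyr, tidyr, and purrr verbs used below
library(pepliner)   # complete_counts() is assumed to come from pepliner

df <- read_csv("data/Hs_CB660_1105_peptide_elution_human_protein_minimal.csv")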

df %>% group_by(ExperimentID) %>%
       complete(PeptideCount, nesting(Peptide, FractionID)) %>% arrange(PeptideCount)



prev <- df %>% split(.$ExperimentID) %>%
     map( ~ complete_counts(raw_data = ., 'FractionID','PeptideCount')) %>%
     bind_rows(.id = "what")


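# tidyr::complete() takes the data as `data` and bare column names, not strings.
# Missing Peptide x FractionID combinations are added with NA in the other columns.
df_tidyr <- df %>% split(.$ExperimentID) %>%
     map( ~ complete(data = ., Peptide, FractionID)) %>%
     bind_rows(.id = "what")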

tmp <- df %>% tidyr::spread(FractionID, PeptideCount) 


# Gather the fraction columns back into long format. The selector picks every
# column of tmp that is not one of the identifier columns of df.
tmp %>% gather(FractionID, PeptideCount, names(.)[!names(.) %in% colnames(df)[!colnames(df) %in% c("FractionID","PeptideCount")]])


df %>% group_by(ExperimentID) %>%
       spread(FractionID, PeptideCount) %>% 
       gather(FractionID, PeptideCount, unique(df$FractionID))




tmp %>% gather(FractionID, PeptideCount, unique(df$FractionID))



unique(df$FractionID)  # the fraction names, i.e. the columns that spread() created in tmp


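# Presumably the definition of complete_counts() called above: the signature matches.
complete_counts <- function(raw_data, xaxis, yaxis){

    # Spread and gather, leaving every column other than xaxis and yaxis intact.
    data_spread <- raw_data %>% tidyr::spread_(xaxis, yaxis)
    # Gather only the columns created by spread_(), not the original identifier columns.
    id_cols <- setdiff(colnames(raw_data), c(xaxis, yaxis))
    post_data <- data_spread %>% tidyr::gather_(xaxis, yaxis, setdiff(names(.), id_cols))
    post_data
}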
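
To illustrate what a spread-then-gather round trip does, here is a minimal sketch on a made-up three-row data set (the counts are illustrative and not taken from the pepliner example data). It uses tidyr's pivot_wider() and pivot_longer() in place of spread_() and gather_(), so it is an analogue of the function above and not the package's own code. Replacing the resulting NA values with zero is an assumption of this sketch.

```r
library(tidyr)

# Hypothetical toy data: AAGTEPR was never observed in Fraction2,
# and CNMETLLK was only observed in Fraction2.
toy <- data.frame(
  Peptide      = c("AAGTEPR", "AAGTEPR", "CNMETLLK"),
  FractionID   = c("Fraction1", "Fraction3", "Fraction2"),
  PeptideCount = c(4, 1, 2),
  stringsAsFactors = FALSE
)

# Widen: one column per fraction; unobserved combinations become NA.
wide <- pivot_wider(toy, names_from = FractionID, values_from = PeptideCount)

# Lengthen again: every peptide now has a row for every fraction.
completed <- pivot_longer(
  wide,
  cols      = -Peptide,
  names_to  = "FractionID",
  values_to = "PeptideCount"
)

# Assumption of this sketch: "not observed" is treated as a count of zero.
completed$PeptideCount[is.na(completed$PeptideCount)] <- 0

completed
```

For this toy input, tidyr::complete(toy, Peptide, FractionID, fill = list(PeptideCount = 0)) should produce the same set of rows in a single call.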

Using pepliner as an interactive application to visualize peptide or protein elutions

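First load the pepliner library, then launch the app. Note that shiny::runApp() with no appDir argument runs the app in the current working directory, so call it from the directory that contains the app's files: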

library(pepliner)
shiny::runApp(launch.browser = TRUE)

The app comes preloaded with an example dataset which can be used to preview app functions.

Interactively plot a set of proteins (protein-level visualization)

Protein data

Protein data in wide format

| ID | Fraction1 | Fraction2 | Fraction3 |
|---|---|---|---|
| ProteinA | 0 | 1 | 3 |
| ProteinB | 5 | 25 | 10 |

Loading wide format protein data will automatically reformat it into the tidy format, which can then be downloaded from the Input Data screen.

Protein data in tidy format

| ID | FractionID | ProteinCount |
|---|---|---|
| ProteinA | Fraction1 | 0 |
| ProteinA | Fraction2 | 1 |
| ProteinA | Fraction3 | 3 |
| ProteinB | Fraction1 | 5 |
| ProteinB | Fraction2 | 25 |
| ProteinB | Fraction3 | 10 |
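
The wide-to-tidy conversion can be sketched outside the app as follows. This is only an illustration using tidyr's pivot_longer() on the wide-format example table; it is not necessarily how the app performs the conversion internally.

```r
library(tidyr)

# The wide-format protein example table.
wide <- data.frame(
  ID        = c("ProteinA", "ProteinB"),
  Fraction1 = c(0, 5),
  Fraction2 = c(1, 25),
  Fraction3 = c(3, 10),
  stringsAsFactors = FALSE
)

# One row per protein and fraction; column names follow the tidy example table.
tidy <- pivot_longer(
  wide,
  cols      = -ID,
  names_to  = "FractionID",
  values_to = "ProteinCount"
)

tidy
```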

The benefit of the tidy format is that you can add two optional columns of information, ExperimentID and condition:

| ID | FractionID | ProteinCount | ExperimentID | condition |
|---|---|---|---|---|
| ProteinA | Fraction1 | 0 | Experiment1 | heavy |
| ProteinA | Fraction2 | 1 | Experiment1 | light |
| ProteinA | Fraction3 | 3 | Experiment1 | heavy |
| ProteinA | Fraction1 | 5 | Experiment2 | heavy |
| ProteinA | Fraction2 | 25 | Experiment2 | light |
| ProteinB | Fraction3 | 10 | Experiment1 | light |

The ExperimentID column allows you to facet, that is, to make one plot per experiment. The condition column allows you to color peptides that come from different conditions, or to facet the plot by condition.
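
As a rough illustration of faceting and coloring, the example rows above can be plotted directly with ggplot2. The geoms and aesthetics chosen here are assumptions made for this sketch and are not taken from the app's plotting code.

```r
library(ggplot2)

# The example rows from the tidy table with ExperimentID and condition.
elution <- data.frame(
  ID           = c("ProteinA", "ProteinA", "ProteinA",
                   "ProteinA", "ProteinA", "ProteinB"),
  FractionID   = c("Fraction1", "Fraction2", "Fraction3",
                   "Fraction1", "Fraction2", "Fraction3"),
  ProteinCount = c(0, 1, 3, 5, 25, 10),
  ExperimentID = c("Experiment1", "Experiment1", "Experiment1",
                   "Experiment2", "Experiment2", "Experiment1"),
  condition    = c("heavy", "light", "heavy", "heavy", "light", "light"),
  stringsAsFactors = FALSE
)

# One panel per experiment, one line per protein, points colored by condition.
p <- ggplot(elution, aes(x = FractionID, y = ProteinCount, group = ID)) +
  geom_line(colour = "grey60") +
  geom_point(aes(colour = condition), size = 2) +
  facet_wrap(~ ExperimentID)

print(p)
```

Swapping facet_wrap(~ ExperimentID) for facet_wrap(~ condition) would instead give one panel per condition.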

View Protein elutions

From the Protein plots tab, you can start viewing protein elutions.


Interactively plot peptides from a single protein (peptide-level visualization)

  1. Interactively select a series of proteins to plot.
  2. Explore the app's features with the example data set pre-loaded by clicking on the tabs above.
  3. Upload your data in the "Input Data" tab.

Data Format

Peptide data

Wide format

| Peptide | ID | Fraction1 | Fraction2 | Fraction3 |
|---|---|---|---|---|
| AAGTEPR | ProteinA | 0 | 1 | 1 |
| CNMETLLK | ProteinA | 0 | 0 | 2 |
| IPRELVK | ProteinB | 2 | 12 | 2 |
| MITFPELLR | ProteinB | 3 | 13 | 8 |

Tidy format

| Peptide | ID | FractionID | PeptideCount |
|---|---|---|---|
| AAGTEPR | ProteinA | Fraction2 | 1 |
| AAGTEPR | ProteinA | Fraction3 | 1 |
| CNMETLLK | ProteinA | Fraction3 | 2 |
| IPRELVK | ProteinB | Fraction1 | 2 |
| IPRELVK | ProteinB | Fraction2 | 12 |
| IPRELVK | ProteinB | Fraction3 | 2 |
| MITFPELLR | ProteinB | Fraction1 | 3 |
| MITFPELLR | ProteinB | Fraction2 | 13 |
| MITFPELLR | ProteinB | Fraction3 | 8 |
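
The introduction notes that protein-level patterns are obtained by summing the peptide observations. A minimal sketch of that summation with dplyr, applied to the wide-format peptide example, is shown here; it reproduces the protein-level example table (ProteinA: 0, 1, 3; ProteinB: 5, 25, 10). Whether the app computes protein counts this way is an assumption.

```r
library(dplyr)

# The wide-format peptide example table.
peptides <- data.frame(
  Peptide   = c("AAGTEPR", "CNMETLLK", "IPRELVK", "MITFPELLR"),
  ID        = c("ProteinA", "ProteinA", "ProteinB", "ProteinB"),
  Fraction1 = c(0, 0, 2, 3),
  Fraction2 = c(1, 0, 12, 13),
  Fraction3 = c(1, 2, 2, 8),
  stringsAsFactors = FALSE
)

# Sum the peptide counts within each protein, fraction by fraction.
proteins <- peptides %>%
  group_by(ID) %>%
  summarise(across(starts_with("Fraction"), sum))

proteins
```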

TIP: Save Data for Future Upload

After uploading your data to the pepliner app, click the red button 'Save results at Pepliner Rdata file' to download an .RData session file. To reload it in a future session, choose the "Pepliner RData file" option in the "Input Data" tab.

More Help and Info

Additional help information and more detailed instructions are provided under the "Instructions" tab.

App Info

Provenance

The source code for pepliner is available on GitHub.

A paper describing the use of peptide elutions to identify proteoforms will be posted on bioRxiv.

Pepliner was modified from the START app developed by Jessica Minnier, Jiri Sklenar, Anthony Paul Barnes, and Jonathan Nelson of Oregon Health & Science University, Knight Cardiovascular Institute and School of Public Health.

Nelson, JW, Sklenar J, Barnes AP, Minnier J. (2016) "The START App: A Web-Based RNAseq Analysis and Visualization Resource." Bioinformatics. doi: 10.1093/bioinformatics/btw624.

The original source code of START is available on GitHub.



marcottelab/pepliner documentation built on May 21, 2019, 8:05 a.m.