knitr::opts_chunk$set(
  #collapse = TRUE,
  comment = "#>",
  fig.width = 4,
  fig.height = 4,
  message = FALSE,
  warning = FALSE,
  tidy.opts = list(
    keep.blank.line = TRUE,
    width.cutoff = 150
  ),
  eval = TRUE
)
options(width = 150)  # console width is a global R option, not a chunk option

Introduction

In Vignette 6 we created a data frame called protPepNSA_AT5tmtMS2 that consists of all protein profiles, with each protein profile followed by its component peptide profiles. In this vignette, we first calculate RSA-transformed profiles for all proteins and peptides, and then compute the constrained proportional assignments (CPA) for all proteins and peptides in a form ready for export. We also show how to use these results to plot the profiles of any protein and its component peptides, with outlier peptides labelled in the plot.

RSA transformations

First, we attach the protlocassign package, which includes the protPepNSA_AT5tmtMS2 data frame. Note that in rows containing proteins, the peptide column contains just the protein name, while in rows containing peptides, the peptide column contains the protein name concatenated with the peptide sequence. As in previous vignettes, we rename the embedded data frames to remove experiment-specific designations (e.g., AT5tmtMS2) for ease of presentation.

library(protlocassign)
data(protPepNSA_AT5tmtMS2)
data(totProtAT5)
protPepNSA <- protPepNSA_AT5tmtMS2
str(protPepNSA, strict.width="cut", width=65)
totProt <- totProtAT5
totProt

Next, we extract the NSA reference profiles from the nine profile columns of protPepNSA:

data(markerListJadot)
refLocationProfilesNSA <- locationProfileSetup(profile=protPepNSA[, 4 + (1:9)],
                          markerList=markerListJadot, numDataCols=9)
round(refLocationProfilesNSA, digits=4)

Using the RSAfromNSA function described previously in Vignette 3, we calculate the RSA-transformed marker profiles:

refLocationProfilesRSA <- RSAfromNSA(NSA=refLocationProfilesNSA,
                              NstartMaterialFractions=6, totProt=totProtAT5)
round(refLocationProfilesRSA, digits=4)

We transform the protein and peptide profiles in the same way: we take the nine columns containing the profile data from protPepNSA and pass them to RSAfromNSA, which gives an intermediate nine-column data frame, protPepRSA_trimmed, of RSA-transformed profiles.

protPepRSA_trimmed <- RSAfromNSA(NSA=protPepNSA[, 4 + (1:9)],
                              NstartMaterialFractions=6, totProt=totProtAT5)
str(protPepRSA_trimmed, strict.width="cut", width=65)

Finally, we add the four reference columns back in as the first columns of protPepRSA, and append the two columns listing the number of spectra and peptides per protein. The resulting data frame protPepRSA has the same structure as the original data frame protPepNSA.

protPepRSA <- data.frame(protPepNSA[, 1:4], protPepRSA_trimmed, protPepNSA[,14:15] )  # add in the ref columns
str(protPepRSA, strict.width="cut", width=65)
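
The column bookkeeping used above can be checked on a toy data frame. This is only a sketch: the column names (ref1 to ref4, frac1 to frac9, Nspectra, Npep) are hypothetical placeholders for a 15-column layout, and doubling the values is a stand-in for the real transformation, not what RSAfromNSA does.

```r
# Toy 15-column layout: 4 reference columns, 9 profile columns, 2 count columns
# (all names here are hypothetical placeholders)
toy <- as.data.frame(matrix(seq_len(2 * 15), nrow = 2))
names(toy) <- c(paste0("ref", 1:4), paste0("frac", 1:9), "Nspectra", "Npep")

# 4 + (1:9) selects columns 5 through 13, i.e. the nine profile columns
profileCols <- 4 + (1:9)

# Stand-in transformation applied to the profile columns only
trimmed <- toy[, profileCols] * 2

# Reassemble: reference columns, transformed profiles, then the count columns
rebuilt <- data.frame(toy[, 1:4], trimmed, toy[, 14:15])
names(rebuilt)
```

Because the column order is preserved, the rebuilt data frame has the same names and layout as the original.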

Plotting RSA protein and peptide profiles

Next, we identify rows with proteins only, and extract them. The resulting data frame, protRSA, parallels the structure of protNSA. We also extract the rows with peptides only in the data frame pepRSA.

protRSA.ind <- (protPepRSA$prot == protPepRSA$peptide)  # protein indicators
protRSA <- protPepRSA[protRSA.ind,]  # these are the data for proteins only
dim(protRSA)
pepRSA <- protPepRSA[!protRSA.ind,] # these are the data for peptides only

data.frame(colnames(protRSA))
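
The protein/peptide split relies on the convention that protein rows repeat the protein name in the peptide column. A minimal sketch on made-up data follows; the protein names, the peptide sequences, and the ":" separator are all hypothetical.

```r
# Toy data: protein rows have peptide == prot; peptide rows carry a
# concatenated protein and peptide label (names and ":" are hypothetical)
toy <- data.frame(
  prot    = c("P1", "P1", "P1", "P2", "P2"),
  peptide = c("P1", "P1:AAK", "P1:LLR", "P2", "P2:GGK"),
  stringsAsFactors = FALSE
)

isProtein <- toy$prot == toy$peptide  # TRUE for protein rows
protOnly  <- toy[isProtein, ]         # one row per protein
pepOnly   <- toy[!isProtein, ]        # the component peptides
```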

Now we calculate the constrained proportional assignments on proteins only, using RSA-transformed profiles:

protCPAfromRSA <-  fitCPA(profile=protRSA[, 4+1:9],
                      refLocationProfiles=refLocationProfilesRSA, 
                      numDataCols=9)
str(protCPAfromRSA, strict.width="cut", width=65)
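
To give some intuition for what a constrained proportional assignment computes, here is a toy sketch. It is not the fitCPA implementation. It assumes that the fit looks for non-negative proportions summing to one that minimise the squared difference between an observed profile and a mixture of reference profiles. All data and names (refProfiles, trueProps, softmax) are hypothetical, and base R's optim serves as a stand-in optimiser.

```r
# Hypothetical reference profiles: 3 compartments (rows) over 4 fractions
refProfiles <- rbind(
  A = c(0.7, 0.2, 0.1, 0.0),
  B = c(0.1, 0.6, 0.2, 0.1),
  C = c(0.0, 0.1, 0.3, 0.6)
)

# Simulated observed profile: a known mixture of the reference profiles
trueProps <- c(A = 0.5, B = 0.3, C = 0.2)
obs <- as.vector(trueProps %*% refProfiles)

# Softmax maps unconstrained parameters to non-negative values summing to 1,
# so the constraints hold automatically during optimisation
softmax <- function(theta) {
  e <- exp(theta - max(theta))
  e / sum(e)
}

# Sum of squared differences between observed and mixed reference profiles
sse <- function(theta) {
  p <- softmax(theta)
  sum((obs - as.vector(p %*% refProfiles))^2)
}

fit <- optim(par = rep(0, nrow(refProfiles)), fn = sse, method = "BFGS")
estProps <- softmax(fit$par)
names(estProps) <- rownames(refProfiles)
round(estProps, 3)
```

The softmax reparametrisation is one simple way to enforce the constraints with an unconstrained optimiser; other approaches, such as quadratic programming, would serve equally well for this toy problem.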

The following commands generate a plot of the TLN1 protein and its peptides, with CPA estimates. Outlier peptide profiles are shown in orange. The header reports the number of peptides and spectra used to compute the protein profile, which in this case excludes outlier peptides and outlier spectra.

#windows(width=7.5, height=10)  # open a window 7.5 by 10 inches
protPepPlotfun(protName="TLN1", protProfile=protRSA[,5:15],
               Nspectra=TRUE, pepProfile=pepRSA, numRefCols=4,
               numDataCols=9, n.compartments=8, 
               refLocationProfiles=refLocationProfilesRSA,
               assignPropsMat=protCPAfromRSA, 
               yAxisLabel="Relative Specific Amount")

Note that the outlier peptides do not contribute to the CPA analysis of the proteins, but they may still be of interest. For instance, they may represent protein isoforms with distinct distributions. Some biological questions may therefore require CPA estimates for all proteins and peptides without outlier removal. This can be accomplished using the following command:

protPepCPAfromRSA <- fitCPA(profile=protPepRSA[,4 + 1:9],
                               refLocationProfiles=refLocationProfilesRSA, numDataCols=9)
str(protPepCPAfromRSA, strict.width="cut", width=65)

We next assemble the final CPA values for the protein/peptide data along with ancillary information, ready for export. We then write the data to C:\temp\myProteinOutput; users should substitute their own directory.

protPepCPAfromRSAout <- data.frame(protPepRSA[,1:4], protPepCPAfromRSA, protPepRSA[,14:15])
# prefix names with a backtick (presumably so spreadsheet software keeps them as text)
protPepCPAfromRSAout$prot <- paste0("`", protPepCPAfromRSAout$prot)
protPepCPAfromRSAout$peptide <- paste0("`", protPepCPAfromRSAout$peptide)

setwd("C:\\temp\\myProteinOutput")
write.csv(protPepCPAfromRSAout, file="protPepCPAfromRSAout.csv", row.names=FALSE, na=".") 
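
For readers who are not on Windows or who prefer not to change the working directory, the same export can be sketched with file.path. The example below is illustrative only: it writes a small made-up data frame (toyOut, with hypothetical names and values) to a subdirectory of tempdir(), which stands in for the user's own output directory.

```r
# Hypothetical output directory; replace tempdir() with your own location
outDir <- file.path(tempdir(), "myProteinOutput")
dir.create(outDir, showWarnings = FALSE, recursive = TRUE)

# Toy stand-in for the assembled CPA output (names and values are made up)
toyOut <- data.frame(
  prot    = c("PROT1", "PROT2"),
  peptide = c("PROT1", "PROT2"),
  Cyto    = c(0.9, NA),
  stringsAsFactors = FALSE
)
toyOut$prot    <- paste0("`", toyOut$prot)
toyOut$peptide <- paste0("`", toyOut$peptide)

# Build the full path instead of calling setwd()
outFile <- file.path(outDir, "toyOut.csv")
write.csv(toyOut, file = outFile, row.names = FALSE, na = ".")

# Read the file back, treating "." as missing, to confirm the round trip
readBack <- read.csv(outFile, na.strings = ".", stringsAsFactors = FALSE)
```

Building the path with file.path leaves the working directory unchanged, which can make scripts easier to rerun on another machine.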

To output plots of all of the protein and peptide profiles into a single pdf file, we first use setwd to point to the desired output directory, and then we can set up a loop as follows:

setwd("C:\\temp\\myProteinOutput")
pdf(file="allProtPepPlotsRSA.pdf", width=7, height=10)
n.prots <- nrow(protRSA)
for (i in seq_len(n.prots)) {  # seq_len() is safe even if there are no rows
   protPepPlotfun(protName=protRSA$prot[i],
       protProfile=protRSA[,5:15], 
       Nspectra=TRUE, pepProfile=pepRSA, numRefCols=4, 
       numDataCols=9, n.compartments=8, 
       refLocationProfiles=refLocationProfilesRSA,
       assignPropsMat=protCPAfromRSA, 
       yAxisLabel="Relative Specific Amount")
}
dev.off()
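
The pattern of opening a pdf device, plotting in a loop, and closing the device can be tried on toy data before running it on the full protein set. In this sketch the profiles (toyProfiles) are made up and base plot stands in for protPepPlotfun. Wrapping the loop in tryCatch with a finally clause is an optional safeguard, assumed here so that the device is closed even if one plot fails.

```r
# Hypothetical output file in a temporary directory
outFile <- file.path(tempdir(), "toyPlots.pdf")

# Made-up profiles standing in for protein profiles
toyProfiles <- list(
  protA = c(0.1, 0.3, 0.6),
  protB = c(0.5, 0.4, 0.1)
)

pdf(file = outFile, width = 7, height = 10)
tryCatch({
  for (nm in names(toyProfiles)) {
    # One page per profile; base plot() is a stand-in for protPepPlotfun()
    plot(toyProfiles[[nm]], type = "b", main = nm,
         xlab = "Fraction", ylab = "Relative Specific Amount")
  }
}, finally = dev.off())  # always close the device, even after an error
```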


mooredf22/protlocassign0p1p1 documentation built on Feb. 7, 2022, 1:55 a.m.