## some frequently used HTML expressions
# use lowercase here because these tend to be variable names in the examples
zc <- "<i>Z</i><sub>C</sub>"
o2 <- "O<sub>2</sub>"
h2o <- "H<sub>2</sub>O"
options(width = 90)

This vignette runs the code to make selected plots from the following papers:

Dick JM. 2016. Proteomic indicators of oxidation and hydration state in colorectal cancer. PeerJ 4: e2238. doi: 10.7717/peerj.2238

Dick JM. 2017. Chemical composition and the potential for proteomic transformation in cancer, hypoxia, and hyperosmotic stress. PeerJ 5: e3421. doi: 10.7717/peerj.3421

This vignette was compiled on `r Sys.Date()` with JMDplots `r packageDescription("JMDplots")$Version` and canprot `r packageDescription("canprot")$Version`.

library(JMDplots)

Microbial proteins in colorectal cancer (2016 Figure 4)

Stability fields represent the ranges of oxygen fugacity and water activity where a protein with the mean amino acid composition from the labeled microbial species has a higher per-residue affinity (lower Gibbs energy) of formation than the others. Blue and red shading designate microbes relatively enriched in samples from healthy donors and cancer patients, respectively. Plot (E) is a composite figure in which the intensity of shading corresponds to the number of overlapping healthy- or cancer-enriched microbes in the preceding diagrams.
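The rule behind a stability field (the species with the highest per-residue affinity wins at each point) can be sketched in base R. The affinity values and species names below are invented for illustration and are not taken from the paper; each row stands for one point on a hypothetical grid of oxygen fugacity and water activity.

```r
# Hypothetical per-residue affinities: rows are grid points, columns are species
affinity <- matrix(
  c(1.2, 0.8, 0.5,
    0.3, 0.9, 0.4,
    0.1, 0.2, 0.7),
  nrow = 3, byrow = TRUE,
  dimnames = list(NULL, c("species_A", "species_B", "species_C"))
)

# At each grid point, the predominant species is the one with the highest affinity
predominant <- colnames(affinity)[apply(affinity, 1, which.max)]
print(predominant)
```

Coloring each grid point by its predominant species gives the kind of field boundaries shown in the diagrams.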


Data sources: A. @WCQ+12. B. @ZTV+14. C. @CTB+14. D. @FLJ+15.

Potential diagrams: Pancreatic cancer (2017 Figure S3 and 3E)

The potential diagrams show the weighted rank difference of chemical affinities between up- and down-expressed proteins in each dataset. Datasets are grouped by similar chemical features, i.e. changes in ZC and nH2O.

Here we make plots for pancreatic cancer datasets that have a mean difference of nH2O greater than 0.01 and a small difference in ZC, as judged by the p-value and common language effect size (CLES). Red and blue correspond to greater potential for formation of the up- and down-expressed proteins, respectively; the equipotential line is shown in white:

gpresult <- groupplots("pancreatic_H2O_up", res = 25)
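As an aside on the CLES used to select these datasets, one common non-parametric reading is the probability that a value drawn from the up-expressed group exceeds a value drawn from the down-expressed group. The sketch below follows that reading; it is an assumption that this matches the calculation in canprot, and the function name `cles_sketch` and the numbers are hypothetical.

```r
# Fraction of all (up, down) pairs in which the up value is larger.
# This is a sketch of one CLES definition, not necessarily canprot's implementation.
cles_sketch <- function(up, down) {
  mean(outer(up, down, ">"))
}

# Made-up values of a chemical metric for two small groups of proteins
up <- c(3, 4, 5)
down <- c(1, 2, 3.5)
cles_sketch(up, down)
```

A value near 0.5 indicates little difference between the groups, while values near 0 or 1 indicate a consistent shift.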

Now let's make a merged diagram. The red-white-blue shading is computed from the mean of the previous diagrams. The black lines show the median and quartiles for the y-positions of the equipotential lines in the previous diagrams. The second plot shows effective values of Eh (redox potential) as a function of the same variables (oxygen fugacity and water activity) (see 2016 Figure 6I).

par(mfrow = c(1, 2))
mergedplot(gpresult, res = 25)
Ehplot(xlim = c(-70, -62), ylim = c(-6, 2), dy = 0.1)
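The merging step described above (a mean of the gridded values, plus median and quartiles of the equipotential positions) can be illustrated with base R. This is a minimal sketch on invented data, not the internals of `mergedplot()`; the object names `grids` and `y_equipotential` are hypothetical.

```r
# Two hypothetical 2 x 2 grids of potential values from individual diagrams
grids <- list(
  matrix(c(1, 2, 3, 4), nrow = 2),
  matrix(c(3, 2, 1, 0), nrow = 2)
)

# Element-wise mean across the diagrams, which would drive the merged shading
merged <- Reduce(`+`, grids) / length(grids)

# Hypothetical y-positions of the equipotential line at one x value, one per diagram
y_equipotential <- c(-2, -1, 0, 1, 2)

# Lower quartile, median and upper quartile, corresponding to the black lines
y_summary <- quantile(y_equipotential, probs = c(0.25, 0.5, 0.75))
print(merged)
print(y_summary)
```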

Data sources: @LHE+04, @MLC+11, @KHO+13, @KPC+13, @PKB+13, @KKC+16.

Basis species comparison (2017 Figure S1)

These plots show projections of the elemental composition of proteins made using two sets of basis species. Using the CHNOS basis species (CO2, NH3, H2S, H2O, O2), the plots show that nH2O and nO2, i.e. the numbers of H2O and O2 per residue in the formation of the proteins from the basis species, are both moderately correlated with ZC (average oxidation state of carbon). Using the QEC basis species (glutamine, glutamic acid, cysteine, H2O, O2), we find that nO2 is strongly correlated with ZC, but nH2O shows very little correlation. Accordingly, the QEC basis more clearly exposes two chemical variables, oxidation state and hydration state, in proteomic data.
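Projecting a composition into a set of basis species amounts to solving a small linear system in which each element must balance. The following base R sketch does this for the CHNOS basis, using glycine as a stand-in for a protein formula. It illustrates the idea only and is not the code path used by `CHNOSZ::protein.basis()`; the object names are hypothetical.

```r
# Elemental composition of each basis species (columns: C, H, N, O, S)
basis_species <- rbind(
  CO2 = c(1, 0, 0, 2, 0),
  NH3 = c(0, 3, 1, 0, 0),
  H2S = c(0, 2, 0, 0, 1),
  H2O = c(0, 2, 0, 1, 0),
  O2  = c(0, 0, 0, 2, 0)
)
colnames(basis_species) <- c("C", "H", "N", "O", "S")

# Glycine, C2H5NO2, used here in place of a protein formula
glycine <- c(C = 2, H = 5, N = 1, O = 2, S = 0)

# Solve for the number of each basis species needed to balance every element
coefficients <- solve(t(basis_species), glycine)
names(coefficients) <- rownames(basis_species)
print(coefficients)
```

Here glycine corresponds to 2 CO2 + NH3 + H2O - 1.5 O2. For a protein, dividing such coefficients by the protein length gives per-residue values like nH2O and nO2.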

Here we define some labels used in the plot.

QEClab <- CHNOSZ::syslab(c("glutamine", "glutamic acid", "cysteine", "H2O", "O2"))
CHNOSlab <- CHNOSZ::syslab(c("CO2", "NH3", "H2S", "H2O", "O2"))

Next, get the amino acid compositions of all proteins in the UniProt human proteome and calculate the protein formulas and ZC. Note that ZC is a sum of elemental ratios and is independent of the choice of basis species.

aa <- get("human.base", canprot)
protein.formula <- CHNOSZ::protein.formula(aa)
ZC <- CHNOSZ::ZC(protein.formula)
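To see why ZC does not depend on the basis species, it helps to write it out from the elemental formula alone. The sketch below assumes a formula containing only C, H, N, O and S with net charge Z, for which ZC = (Z - nH + 3 nN + 2 nO + 2 nS) / nC. The helper name `zc_from_formula` is hypothetical and is not part of CHNOSZ.

```r
# Average oxidation state of carbon from a named vector of element counts.
# Assumes only C, H, N, O and S are present; missing elements count as zero.
zc_from_formula <- function(counts, charge = 0) {
  n <- c(C = 0, H = 0, N = 0, O = 0, S = 0)
  n[names(counts)] <- counts
  unname((charge - n["H"] + 3 * n["N"] + 2 * n["O"] + 2 * n["S"]) / n["C"])
}

zc_from_formula(c(C = 2, H = 5, N = 1, O = 2))         # glycine, C2H5NO2
zc_from_formula(c(C = 3, H = 7, N = 1, O = 2))         # alanine, C3H7NO2
zc_from_formula(c(C = 3, H = 7, N = 1, O = 2, S = 1))  # cysteine, C3H7NO2S
```

No basis species appear anywhere in this calculation, which is the sense in which ZC is independent of their choice.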

Now set up the figure and plot the per-residue elemental compositions of the proteins projected into different sets of basis species.

par(mfrow = c(2, 2))
par(mar = c(4, 4, 2.5, 1))
par(cex = 1.1)
par(mgp = c(2.5, 1, 0))
for(basis in c("QEC", "CHNOS")) {
  # Load this set of basis species and project the protein compositions into it
  CHNOSZ::basis(basis)
  protein.basis <- CHNOSZ::protein.basis(aa)
  protein.length <- CHNOSZ::protein.length(aa)
  # Divide by protein length to get per-residue coefficients
  residue.basis <- protein.basis / protein.length
  smoothScatter(ZC, residue.basis[, "O2"], xlab = cplab$Zc, ylab = cplab$nO2)
  smoothScatter(ZC, residue.basis[, "H2O"], xlab = cplab$Zc, ylab = cplab$nH2O)
  # Title each row of panels with its set of basis species
  if(basis == "QEC") mtext(QEClab, outer = TRUE, cex = 1.2, line = -1.5)
  if(basis == "CHNOS") mtext(CHNOSlab, outer = TRUE, cex = 1.2, line = -15)
}

Chemical analysis of differentially expressed proteins (2017 Figures 1 and 2)

Updates to these datasets and plots were made for a paper in 2021.

For individual vignettes including data references, see the files in system.file("extdata/cpcp", package = "JMDplots").


References



jedick/JMDplots documentation built on April 12, 2025, 1:35 p.m.