This vignette from the R package JMDplots version r packageDescription("JMDplots")$Version
approximately reproduces calculations of compositional oxidation state and hydration state and thermodynamic potential that are described in two papers published in PeerJ (Dick, 2016 and Dick, 2017).
The reproduction is not exact because of data updates made in the package after the papers were published.
The sections below give the Abbreviations used here,
a Summary Table comparing the chemical compositions of groups of human proteins that are relatively down- and up-expressed (n1
and n2
, respectively) in colorectal cancer (CRC),
the Data Sources for protein expression in CRC,
a plot of the Mean Differences of average oxidation state of carbon (Zc
) and water demand per residue (nH2O
),
and References.
T (tumor), N (normal), C (carcinoma or adenocarcinoma), A (adenoma), CM (conditioned media), AD (adenomatous colon polyps), CIS (carcinoma in situ), ICC (invasive colonic carcinoma), NC (non-neoplastic colonic mucosa).
options(width = 90)
library(canprot) library(CHNOSZ)
Identify the datasets for protein expression.
datasets <- pdat_colorectal(2017)
Get the amino acid compositions of the up- and down-expressed proteins (pdat_colorectal
) and make comparisons of indicators of oxidation and hydration state (get_comptab
).
pdat <- lapply(datasets, pdat_colorectal) comptab <- lapply(pdat, get_comptab, plot.it = FALSE, mfun = "mean", oldstyle = TRUE)
Generate the HTML table with xsummary
, which adds bold and underline formatting to the output of xtable
.
The columns show the difference in means (DM
), common language effect size (ES
), and p-value (p
) for comparisons between groups of the average oxidation state of carbon (Zc
) and water demand per residue (nH2O
).
library(xtable) out <- xsummary(comptab) # round values and include dataset tags tags <- sapply(sapply(strsplit(datasets, "="), "[", -1), paste, collapse = ";") out <- cbind(out[, 1:2], tags = tags, out[, 3:14]) out[, 6:15] <- round(out[, 6:15], 4) write.csv(out, "colorectal.csv", row.names = FALSE, quote = 2)
a. @WTK+08 used 2-nitrobenzenesulfenyl labeling and MS/MS analysis to identify 128 proteins with differential expression in paired CRC and normal tissue specimens from 12 patients.
The list of proteins used in this study was generated by combining the lists of up- and down-regulated proteins from Table 1 and Supplementary Data 1 of @WTK+08 with the Swiss-Prot and UniProt accession numbers from their Supplementary Data 2.
b. c. d. @AKP+10 used nano-LC-MS/MS to characterize proteins from the nuclear matrix fraction in samples from 2 patients each with adenoma (ADE), chromosomal instability CRC (CIN+) and microsatellite instability CRC (MIN+).
Cluster analysis was used to classify proteins with differential expression between ADE and CIN+, MIN+, or in both subtypes of carcinoma (CRC).
Here, gene names from Supplementary Tables 5--7 of @AKP+10 were converted to UniProt IDs using the UniProt mapping tool.
e. @JKMF10 compiled a list of candidate serum biomarkers from a meta-analysis of the literature.
In the meta-analysis, 99 up- or down-expressed proteins were identified in at least 2 studies.
The list of UniProt IDs used in this study was taken from Table 4 of @JKMF10.
f. g. @XZC+10 used a gel-enhanced LC-MS method to analyze proteins in pooled tissue samples from 13 stage I and 24 stage II CRC patients and pooled normal colonic tissues from the same patients.
Here, IPI accession numbers from Supplemental Table 4 of @XZC+10 were converted to UniProt IDs using the DAVID conversion tool.
h. @ZYS+10 used acetylation stable isotope labeling and LTQ-FT MS to analyze proteins in pooled microdissected epithelial samples of tumor and normal mucosa from 20 patients, finding 67 and 70 proteins with increased or decreased expression (ratios ≥2 or ≤0.5).
Here, IPI accession numbers from Supplemental Table 4 of @ZYS+10 were converted to UniProt IDs using the DAVID conversion tool.
i. j. k. l. m. @BPV+11 analyzed microdissected cancer and normal tissues from 28 patients (4 adenoma samples and 24 CRC samples at different stages) using iTRAQ labeling and MALDI-TOF/TOF MS to identify 555 proteins with differential expression between adenoma and stage I, II, III, IV CRC.
Here, gene names from supplemental Table 9 of @BPV+11 were converted to UniProt IDs using the UniProt mapping tool.
n. @JCF+11 analyzed paired samples from 16 patients using iTRAQ-MS to identify 118 proteins with >1.3-fold differential expression between CRC tumors and adjacent normal mucosa.
The protein list used in this study was taken from Supplementary Table 2 of @JCF+11.
o. p. q. @MRK+11 used iTRAQ labeling with LC-MS/MS to identify a total of 1061 proteins with differential expression (fold change ≥1.5 and false discovery rate ≤0.01) between pooled samples of 4 normal colon (NC), 12 tubular or tubulo-villous adenoma (AD) and 5 adenocarcinoma (AC) tissues.
The list of proteins used in this study was taken from from Table S8 of @MRK+11.
r. @KKL+12 used difference in-gel electrophoresis (DIGE) and cleavable isotope-coded affinity tag (cICAT) labeling followed by mass spectrometry to identify 175 proteins with more than 2-fold abundance ratios between microdissected and pooled tumor tissues from stage-IV CRC patients with good outcomes (survived more than five years; 3 patients) and poor outcomes (died within 25 months; 3 patients).
The protein list used in this study was made by filtering the cICAT data from Supplementary Table 5 of @KKL+12 with an abundance ratio cutoff of >2 or <0.5, giving 147 proteins. IPI accession numbers were converted to UniProt IDs using the DAVID conversion tool.
s. @KYK+12 used mTRAQ and cICAT analysis of pooled microsatellite stable (MSS-type) CRC tissues and pooled matched normal tissues from 3 patients to identify 1009 and 478 proteins in cancer tissue with increased or decreased expression by higher than 2-fold, respectively.
Here, the list of proteins from Supplementary Table 4 of @KYK+12 was filtered to include proteins with expression ratio >2 or <0.5 in both mTRAQ and cICAT analyses, leaving 175 up-expressed and 248 down-expressed proteins in CRC.
Gene names were converted to UniProt IDs using the UniProt mapping tool.
t. @WOD+12 used LC-MS/MS to analyze proteins in microdissected samples of formalin-fixed paraffin-embedded (FFPE) tissue from 8 patients; at P < 0.01, 762 proteins had differential expression between normal mucosa and primary tumors.
The list of proteins used in this study was taken from Supplementary Table 4 of @WOD+12.
u. @YLZ+12 analyzed the conditioned media of paired stage I or IIA CRC and normal tissues from 9 patients using lectin affinity capture for glycoprotein (secreted protein) enrichment by nano LC-MS/MS to identify 68 up-regulated and 55 down-regulated differentially expressed proteins.
IPI accession numbers listed in Supplementary Table 2 of @YLZ+12 were converted to UniProt IDs using the DAVID conversion tool.
v. @MCZ+13 used laser capture microdissection (LCM) to separate stromal cells from 8 colon adenocarcinoma and 8 non-neoplastic tissue samples, which were pooled and analyzed by iTRAQ to identify 70 differentially expressed proteins.
Here, gi numbers listed in Table 1 of @MCZ+13 were converted to UniProt IDs using the UniProt mapping tool; FASTA sequences of 31 proteins not found in UniProt were downloaded from NCBI and amino acid compositions were added to human.extra.csv
in the canprot package.
w. @KWA+14 used differential biochemical extraction to isolate the chromatin-binding fraction in frozen samples of colon adenomas (3 patients) and carcinomas (5 patients), and LC-MS/MS was used for protein identification and label-free quantification.
The results were combined with a database search to generate a list of 106 proteins with nuclear annotations and at least a three-fold expression difference.
Here, gene names from Table 2 of @KWA+14 were converted to UniProt IDs.
x. @UNS+14 analyzed 30 samples of colorectal adenomas and paired normal mucosa using iTRAQ labeling, OFFGEL electrophoresis and LC-MS/MS.
111 proteins with expression fold changes (log~2~) at least +/- 0.5 and statistical significance threshold q < 0.02 that were also quantified in cell-line experiments were classified as “epithelial cell signature proteins”.
UniProt IDs were taken from Table III of @UNS+14.
y. @WKP+14 analyzed the secretome of paired CRC and normal tissue from 4 patients, adopting a five-fold enrichment cutoff for identification of candidate biomarkers.
Here, the list of proteins from Supplementary Table 1 of @WKP+14 was filtered to include those with at least five-fold greater or lower abundance in CRC samples and p < 0.05.
Two proteins listed as “Unmapped by Ingenuity” were removed, and gene names were converted to UniProt IDs using the UniProt mapping tool.
z. @STK+15 analyzed the membrane-enriched proteome from tumor and adjacent normal tissues from 8 patients using label-free nano-LC-MS/MS to identify 184 proteins with a fold change > 1.5 and p-value < 0.05.
Here, protein identifiers from Supporting Table 2 of @STK+15 were used to find the corresponding UniProt IDs.
A. B. C. @WDO+15 analyzed 8 matched formalin-fixed and paraffin-embedded (FFPE) samples of normal tissue (N) and adenocarcinoma (C) and 16 nonmatched adenoma samples (A) using LC-MS to identify 2300 (N/A), 1780 (A/C) and 2161 (N/C) up- or down-regulated proteins at p < 0.05.
The list of proteins used in this study includes only those marked as having a significant change in SI Table 3 of @WDO+15.
D. E. F. @LPL+16 used iTRAQ and 2D LC-MS/MS to analyze pooled samples of stroma purified by laser capture microdissection (LCM) from 5 cases of non-neoplastic colonic mucosa (NC), 8 of adenomatous colon polyps (AD), 5 of colon carcinoma in situ (CIS) and 9 of invasive colonic carcinoma (ICC).
A total of 222 differentially expressed proteins between NC and other stages were identified.
Here, gene symbols from Supplementary Table S3 of @LPL+16 were converted to UniProt IDs using the UniProt mapping tool.
G. Data were extracted from SI Table S3 of @LXM+16, including proteins with p-value <0.05.
H. I. J. @PHL+16 used iTRAQ 2D LC-MS/MS to analyze pooled samples from 5 cases of normal colonic mucosa (NC), 8 of adenoma (AD), 5 of carcinoma in situ (CIS) and 9 of invasive colorectal cancer (ICC).
A total of 326 proteins with differential expression between two successive stages (and, for CIS and ICC, also differentially expressed with respect to NC) were detected.
The list of proteins used in this study was generated by converting the gene names in Supplementary Table 4 of @PHL+16 to UniProt IDs using the UniProt mapping tool.
Using data from the table above, this plot shows that the groups of proteins that are relatively up-expressed in colorectal cancer or more advanced cancer stages predominantly have higher Zc
and/or nH2O
.
The datasets comparing adenoma to normal proteomes are highlighted in red.
col <- rep("black", length(datasets)) # highlight adenoma / normal datasets col[grepl("=AD", datasets)] <- "red" diffplot(comptab, col = col, oldstyle = TRUE)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.