In jedick/JMDplots: Plots from Papers by Jeffrey M. Dick

library(knitr)
## use pngquant to reduce size of PNG images
knit_hooks$set(pngquant = hook_pngquant)
pngquant <- "--speed=1 --quality=0-25"
# in case pngquant isn't available
if (!nzchar(Sys.which("pngquant"))) pngquant <- NULL 

## colorize messages 20171031
## adapted from https://gist.github.com/yihui/2629886#file-knitr-color-msg-rnw
color_block = function(color) {
  function(x, options) sprintf('<pre style="color:%s">%s</pre>', color, x)
}
knit_hooks$set(warning = color_block('magenta'), error = color_block('red'), message = color_block('blue'))

options(width = 80)

# https://stackoverflow.com/questions/595365/how-to-render-narrow-non-breaking-spaces-in-html-for-windows
logfO2 <- "log&#x202F;<i>f</i>O<sub>2</sub>"
logaH2O <- "log&#x202F;<i>a</i>H<sub>2</sub>O"
nH2O <- "<i>n</i>H<sub>2</sub>O"
Zc <- "<i>Z</i><sub>C</sub>"

This vignette runs the code to make the plots from the following paper first published by Springer Nature:

Dick JM, Tan J. 2023. Chemical links between redox conditions and estimated community proteomes from 16S rRNA and reference protein sequences. Microbial Ecology 85(4): 1338--1355. doi: 10.1007/s00248-022-01988-9

Use this link for full-text access to a view-only version of the paper: https://rdcu.be/cMCDa. A preprint of the paper is available on bioRxiv at doi: 10.1101/2021.05.31.446500.

This vignette was compiled on `r Sys.Date()` with JMDplots `r packageDescription("JMDplots")$Version` and chem16S `r packageDescription("chem16S")$Version`.

library(JMDplots)

Distinct chemical parameters of reference proteomes for major taxonomic groups (Figure 1)

Table_S5 <- geo16S1()

Data source: NCBI Reference Sequence (RefSeq) database [@OWB+16]. Numbered symbols: (1) Methanococci, (2) Archaeoglobi, (3) Thermococci, (4) Halobacteria, (5) Clostridia.

Specific values mentioned in the text

`r Zc` for reference proteomes of genera that are abundant in produced fluids of shale gas wells:

datadir <- system.file("RefDB/RefSeq_206", package = "JMDplots")
taxon_metrics <- read.csv(file.path(datadir, "taxon_metrics.csv.xz"), as.is = TRUE)
subset(taxon_metrics, group %in% c("Halanaerobium", "Thermoanaerobacter"))

`r Zc` for reference proteomes of Halanaerobium species (numeric names are NCBI taxids):

datadir <- system.file("RefDB/RefSeq_206", package = "JMDplots")
refseq <- read.csv(file.path(datadir, "genome_AA.csv.xz"))
Zc.refseq <- Zc(refseq)
names(Zc.refseq) <- refseq$organism

names <- read.csv(file.path(datadir, "taxonomy.csv.xz"))
is.Halanaerobium <- names$genus %in% "Halanaerobium" & !is.na(names$species)
(Zc.Halanaerobium <- round(Zc.refseq[is.Halanaerobium], 3))
range(Zc.Halanaerobium)
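
For orientation, carbon oxidation state can be computed from elemental counts alone. The block below is a minimal sketch in base R, assuming the usual formal oxidation states (H = +1, N = -3, O = -2, S = -2). The helper `zc_from_formula` is hypothetical and is not part of JMDplots or chem16S; the `Zc()` function used above works on amino acid compositions instead.

```r
# Hypothetical helper (not part of JMDplots or chem16S):
# average oxidation state of carbon from elemental counts,
# assuming H = +1, N = -3, O = -2, S = -2
zc_from_formula <- function(C, H, N = 0, O = 0, S = 0, charge = 0) {
  (-H + 3 * N + 2 * O + 2 * S + charge) / C
}

# Small molecules with well-known formulas
zc_glycine <- zc_from_formula(C = 2, H = 5, N = 1, O = 2)  # C2H5NO2
zc_alanine <- zc_from_formula(C = 3, H = 7, N = 1, O = 2)  # C3H7NO2
zc_methane <- zc_from_formula(C = 1, H = 4)                # CH4
```

Because the value is divided by the number of carbon atoms, it can be compared between molecules of very different size, from single amino acids to whole proteomes.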

Estimated community proteomes from different environments have distinct chemical signatures (Figure 2)

Table_S6 <- geo16S2()

Data sources: Guerrero Negro mat [@HCW+13], Yellowstone hot springs [@BGPF13], Baltic Sea water [@HLA+16], Lake Fryxell mat [@JHM+16], Tibetan Plateau lakes [@ZLM+16], Manus Basin vents [@MPB+17], Qarhan Salt Lake soils [@XDZ+17], Black Sea water [@SVH+19].

Lower carbon oxidation state is tied to oxygen depletion in water columns (Figure 3)

Table_S7 <- geo16S3()

Data sources: Black Sea [@SVH+19], Swiss lakes (Lake Zug and Lake Lugano) [@MZG+20], Eastern Tropical North Pacific (ETNP) [@GBL+15], Sansha Yongle Blue Hole [@HXZ+20], Ursu Lake [@BCA+21].

Common trends of carbon oxidation state of estimated community proteomes for shale gas wells and hydrothermal systems (Figure 4)

Table_S8 <- geo16S4()

Data sources: Northwestern Pennsylvania stream water and sediment [@UKD+18], Pennsylvania State Forests stream water in spring and fall [@MMA+20], Marcellus Shale [@CHM+14], Denver--Julesburg Basin [@HRR+18], Duvernay Formation [@ZLF+19].

Comparison of protein `r Zc` from metagenomic or metatranscriptomic data with estimates from 16S and reference sequences (Figure 5)

Table_S9 <- geo16S5()

Data sources: A. Guerrero Negro mat metagenome [@KRH+08], 16S [@HCW+13]; Bison Pool metagenome [@HRM+11], 16S [@SMS+12]; Eastern Tropical North Pacific metagenome [@GKG+15], metatranscriptome and 16S [@GBL+15]; Mono Lake metatranscriptome [@EH17], 16S [@EH18]. B. Marcellus Shale metagenome [@DBW+16], 16S [@CHM+14]. C. Manus Basin vents [@MPB+17], Black Sea metagenome [@VMW+21], 16S [@SVH+19]. D. Human Microbiome Project [@HMP12]. E. Soils [@FLA+12]; mammalian guts [@MKK+11].

RefSeq and 16S rRNA data processing outline (Figure S1)

geo16S_S1()

Scatterplots of `r Zc` and `r nH2O` for bacterial and archaeal genera vs higher taxonomic levels (Figure S2)

geo16S_S2()

`r nH2O`-`r Zc` plots for major phyla and their genera (Figure S3)

geo16S_S3()

Venn diagrams for phylum and genus names in the RefSeq (NCBI), RDP, and SILVA taxonomies (Figure S4)

Table_S10 <- geo16S_S4()

Data sources: RefSeq (NCBI): Names of taxa with protein sequences in RefSeq as listed in system.file("RefDB/RefSeq_206/taxonomy.csv.xz", package = "JMDplots"); RDP: trainset18_062020_speciesrank.fa in https://sourceforge.net/projects/rdp-classifier/files/RDP_Classifier_TrainingData/RDPClassifier_16S_trainsetNo18_rawtrainingdata.zip; SILVA: https://www.arb-silva.de/fileadmin/silva_databases/release_138_1/Exports/SILVA_138.1_SSURef_NR99_tax_silva.fasta.gz.
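
The regions of a three-set Venn diagram correspond to simple set operations on the name lists. The following sketch uses base R only; the name vectors are placeholders made up for illustration, not names read from the RefSeq, RDP, or SILVA files, and it is not necessarily how `geo16S_S4()` computes its counts.

```r
# Placeholder name vectors (illustrative only; not real database contents)
refseq_names <- c("GenusA", "GenusB", "GenusC", "GenusD")
rdp_names    <- c("GenusA", "GenusB", "GenusE")
silva_names  <- c("GenusA", "GenusC", "GenusE")

# Names present in all three taxonomies (center of the diagram)
shared_all <- Reduce(intersect, list(refseq_names, rdp_names, silva_names))

# Names found only in the first list
refseq_only <- setdiff(refseq_names, union(rdp_names, silva_names))

# Names shared by the first two lists, regardless of the third
shared_refseq_rdp <- intersect(refseq_names, rdp_names)
```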

Correlations between `r Zc` estimated from metagenomes and 16S rRNA sequences (Figure S5)

geo16S_S5()

Correlation of `r Zc` with GC content of metagenomic and 16S amplicon reads (Figure S6)

geo16S_S6()

Data source: https://trace.ncbi.nlm.nih.gov/Traces/sra/?run=SRR*******, where SRR******* is the SRA Run accession for metagenomic or 16S amplicon sequences.

Supplementary Table files

This code shows how the file for each of the Supplementary Tables is saved. The Table_S* objects are created by running the code blocks above, but the following code block is not run in this vignette in order to avoid cluttering the working directory.

write.csv(Table_S5, "Table_S5.csv", row.names = FALSE, quote = FALSE)
write.csv(Table_S6, "Table_S6.csv", row.names = FALSE, quote = FALSE)
write.csv(Table_S7, "Table_S7.csv", row.names = FALSE, quote = FALSE)
write.csv(Table_S8, "Table_S8.csv", row.names = FALSE, quote = FALSE)
write.csv(Table_S9, "Table_S9.csv", row.names = FALSE, quote = FALSE)
write.csv(Table_S10, "Table_S10.csv", row.names = FALSE, quote = FALSE)
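
One way to save such tables without touching the working directory is to write them to a temporary directory. This is a sketch under stated assumptions: the small data frames in `tables` are hypothetical stand-ins for the objects returned by the plotting functions, and the `supplementary_tables` folder name is arbitrary.

```r
# Hypothetical stand-ins for the data frames returned by the plotting functions
tables <- list(
  Table_S5 = data.frame(x = 1:2, y = c("a", "b")),
  Table_S6 = data.frame(x = 3:4, y = c("c", "d"))
)

# Write into a subdirectory of the session's temporary directory
outdir <- file.path(tempdir(), "supplementary_tables")
dir.create(outdir, showWarnings = FALSE)

# One CSV file per list element, named after the element
outfiles <- file.path(outdir, paste0(names(tables), ".csv"))
for (i in seq_along(tables)) {
  write.csv(tables[[i]], outfiles[i], row.names = FALSE, quote = FALSE)
}
```

Files under `tempdir()` are removed when the R session ends, so anything worth keeping should be copied elsewhere.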

References

jedick/JMDplots documentation built on April 12, 2025, 1:35 p.m.
