library(recountmethylationManuscriptSupplement)
pkgname <- "recountmethylationManuscriptSupplement"
scripts.dir <- system.file("scripts", "figures", 
                           package = pkgname)

library(ggplot2)
library(gridExtra)
library(data.table)

knitr::opts_chunk$set(echo = TRUE, eval = FALSE)

Figure S1

Obtain sample (GSM IDs) and study (GSE IDs) summaries from queries to GEO with queries to Entrez utilities. With Entrez utilities installed, navigate to the directory with the script "eqplot.py" and run:

python3 eqplot.py

This will cause data queries to the GEO servers, and result in the creation of several new files, "gsmyeardata", "gsmidatyrdata", "gseyeardata", and "gseidatyrdata". Finally, run the script "fig1a.R" to make the plot.

script.name <- "figS1.R"
source(file.path(scripts.dir, script.name))
figS1

Figure S2

This section shows how to generate stacked barplots of minimum confidences and most likely sample type predictions from running MetaSRA-pipeline on GEO samples. The data is extracted from the main metadata table table-s1_mdpost_all-gsm-md.csv (Table S1).

script.name <- "figS2.R"
source(file.path(scripts.dir, script.name))
figS2

Figure S3

Source the script. This load BeadArray control signals across all samples, filters on controls of interest, and makes a composite plots of signal distributions with vertical blue lines showing the minimum threshold for a sample to "pass" the control. Thresholds are based on Illumina's documentation and the ewastools package (Methods). Metrics with "trimmed" in the title have had extreme outlying samples with very high signal removed before plotting.

script.name <- "figS3.R"
source(file.path(scripts.dir, script.name))

Figure S4

script.name <- "figS4.R"
source(file.path(scripts.dir, script.name))
figS4

Figure S5

script.name <- "figS5.R"
source(file.path(scripts.dir, script.name))

Figure S6

Details including sample identification, analyses, and results summaries are detailed in the script pca_fig3.R. This script generates the files contained in r system.file("extdata", "pcadata", package = pkgname), which are used below to generate the main plots.

script.name <- "figS6ac.R"
source(file.path(scripts.dir, script.name))
script.name <- "figS6bd.R"
source(file.path(scripts.dir, script.name))
figS6a
figS6b
figS6c
figS6d

Figure S7

This section shows how to generate the data summary plots for DNAm variability analyses in 7 noncancer tissues. Source the script to generate the barplots summarizing studies and samples (fig7b) and the anova filter summary stacked barplot.

script.name <- "figS7.R"
source(file.path(scripts.dir, script.name))
figS7b.studies
figS7b.samples
figS7c

Figure S8

This section shows how to generate a composite of stacked barplots summarizing genome mapping patterns in probes with low variance across 7 noncancer tissues.

script.name <- "figS8.R"
source(file.path(scripts.dir, script.name))
grid.arrange(figS8.plot1, figS8.plot2, figS8.plot3, ncol = 3)


metamaden/recountmethylationManuscriptSupplement documentation built on May 31, 2021, 5:27 a.m.