library(recountmethylationManuscriptSupplement)
pkgname <- "recountmethylationManuscriptSupplement"
scripts.dir <- system.file("scripts", "figures", package = pkgname)
library(ggplot2)
library(gridExtra)
library(data.table)
knitr::opts_chunk$set(echo = TRUE, eval = FALSE)
Some details about the platforms of interest are as follows:
# One entry per platform: GEO accession, name, release year, release month
lid <- list(c('GPL13534', 'HM450K', '2011', 'May 2011'),
            c('GPL21145', 'EPIC', '2015', 'Nov. 2015'),
            c('GPL8490', 'HM27K', '2009', 'Apr. 2009'))
df <- as.data.frame(do.call(rbind, lid))
colnames(df) <- c("platform_id", "platform_name", "release_year", "release_month")
knitr::kable(df)
Obtain sample (GSM ID) and study (GSE ID) summaries by querying GEO with Entrez utilities. With Entrez utilities installed, navigate to the directory containing the script "eqplot.py" and run:
python3 eqplot.py
This sends data queries to the GEO servers and creates several new files: "gsmyeardata", "gsmidatyrdata", "gseyeardata", and "gseidatyrdata". Finally, run the script "fig1a.R" to make the plot.
script.name <- "fig1a.R"
source(file.path(scripts.dir, script.name))
fig1a
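As an optional check before sourcing "fig1a.R", the following minimal sketch reports which of the four query outputs are missing. The helper `missing_files` is hypothetical and not part of the package. It also assumes that the files sit in the current working directory and carry exactly the names given above, neither of which the text confirms.

```r
# Hypothetical helper (not part of the package): return the expected
# file names that are absent from a directory.
missing_files <- function(data.dir, fnames) {
  present <- file.exists(file.path(data.dir, fnames))
  fnames[!present]
}

# File names as listed in the text; the directory is an assumption.
expected <- c("gsmyeardata", "gsmidatyrdata", "gseyeardata", "gseidatyrdata")
absent <- missing_files(getwd(), expected)
if (length(absent) > 0) {
  message("Missing query outputs: ", paste(absent, collapse = ", "))
} else {
  message("All query outputs found.")
}
```

If any files are reported missing, rerunning "eqplot.py" from the intended directory is the first thing to try.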
See the data analyses vignette in recountmethylation.
Source the script "fig2a.R". This creates control outcomes from BeadArray signals in Table S2, then generates the plot object.
script.name <- "fig2a.R"
source(file.path(scripts.dir, script.name))
fig2a
See the data analyses vignette in recountmethylation.
script.name <- "fig2c.R"
source(file.path(scripts.dir, script.name))
fig2c
script.name <- "fig2d.R"
source(file.path(scripts.dir, script.name))
# Draw the heatmap list created by the script
draw(hmlist,
     row_title = paste0("Study ID (N = ", length(gsesize.subset), ")"),
     column_title = "Sub-threshold Frequency",
     heatmap_legend_side = "right",
     annotation_legend_side = "right")
Sample identification, analyses, and results summaries are detailed in the script pca_fig3.R. This script generates the files contained in the directory returned by system.file("extdata", "pcadata", package = pkgname), which are used below to generate the main plots.
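To see which PCA data files ship with the installed package, a short sketch such as the following may help. The helper `list_pkg_dir` is hypothetical. It relies only on `system.file()` returning an empty string when the package or path is not found.

```r
# Hypothetical helper (not part of the package): list the files in a
# subdirectory of an installed package, or character(0) if not found.
list_pkg_dir <- function(pkg, ...) {
  path <- system.file(..., package = pkg)
  if (!nzchar(path)) return(character(0))
  list.files(path)
}

pca.files <- list_pkg_dir("recountmethylationManuscriptSupplement",
                          "extdata", "pcadata")
if (length(pca.files) == 0) {
  message("No PCA data files found; is the package installed?")
} else {
  print(pca.files)
}
```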
Run the following to generate the scatterplot of the top two components from PCA of all samples. Red points are noncancer blood, purple points are leukemia samples.
script.name <- "fig3a.R"
source(file.path(scripts.dir, script.name))
fig3a
Run the following to generate the scatterplot of the top two components from PCA of all samples except blood or leukemias. Blue points are noncancer brain, dark cyan points are brain tumors.
script.name <- "fig3b.R"
source(file.path(scripts.dir, script.name))
fig3b
Run the following to generate the scatterplot of the top two components from PCA of 7 noncancer tissues: sperm (blue); adipose (dark red); blood (red); brain (purple); buccal (orange); nasal (light green); and liver (dark green).
script.name <- "fig3c.R"
source(file.path(scripts.dir, script.name))
fig3c
Run the following to generate the scatterplot of the top two components from PCA of 6 noncancer tissues: adipose (dark red); blood (red); brain (purple); buccal (orange); nasal (light green); and liver (dark green).
script.name <- "fig3d.R"
source(file.path(scripts.dir, script.name))
fig3d
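For readers unfamiliar with this type of plot, here is a minimal sketch of computing the top two principal components and drawing a scatterplot colored by group. It runs on simulated data with three hypothetical groups, not on the manuscript data, and uses base `prcomp()`. It does not reproduce the approach in pca_fig3.R, which may differ.

```r
set.seed(1)
# Simulated stand-in data (not from the manuscript): 60 samples x 200 features.
# Two of the three hypothetical groups are shifted on the first 20 features.
n.per.group <- 20
groups <- rep(c("tissueA", "tissueB", "tissueC"), each = n.per.group)
x <- matrix(rnorm(length(groups) * 200), nrow = length(groups))
x[groups == "tissueB", 1:20] <- x[groups == "tissueB", 1:20] + 2
x[groups == "tissueC", 1:20] <- x[groups == "tissueC", 1:20] - 2

# PCA with samples in rows; prcomp() centers columns by default
pca <- prcomp(x)
pca.df <- data.frame(PC1 = pca$x[, 1], PC2 = pca$x[, 2], group = groups)

# Percent of total variance explained by each of the top two components
pct.var <- round(100 * pca$sdev[1:2]^2 / sum(pca$sdev^2), 1)

# Scatterplot colored by group (built only if ggplot2 is available)
if (requireNamespace("ggplot2", quietly = TRUE)) {
  pca.plot <- ggplot2::ggplot(pca.df,
                              ggplot2::aes(x = PC1, y = PC2, color = group)) +
    ggplot2::geom_point() +
    ggplot2::labs(x = paste0("PC1 (", pct.var[1], "%)"),
                  y = paste0("PC2 (", pct.var[2], "%)"))
}
```

Printing `pca.plot` displays the scatterplot. In the figures above, the point colors correspond to the tissue and disease labels given in each description.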
For analysis details, see also the data analyses vignette in recountmethylation
Violin plots of 14,000 high variance probes
script.name <- "fig4ab.R"
source(file.path(scripts.dir, script.name))
fig4a
fig4b
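As a generic illustration of what "high variance probes" can mean, the sketch below ranks probes by their variance across samples and keeps the top few. It uses simulated Beta-values and a hypothetical helper, `top_variance_probes`. It is not the selection procedure used for the manuscript, which is described in the data analyses vignette.

```r
set.seed(2)
# Simulated Beta-values (not manuscript data): 1000 hypothetical probes
# in rows, 30 samples in columns.
bval <- matrix(rbeta(1000 * 30, shape1 = 2, shape2 = 5), nrow = 1000,
               dimnames = list(paste0("probe", 1:1000),
                               paste0("sample", 1:30)))

# Hypothetical helper: names of the n probes with the largest variance
top_variance_probes <- function(m, n) {
  probe.var <- apply(m, 1, var)
  names(sort(probe.var, decreasing = TRUE))[seq_len(min(n, nrow(m)))]
}

top.probes <- top_variance_probes(bval, 50)

# Per-probe summaries for the selected probes, in a form that could be
# passed to a plotting function
summary.df <- data.frame(probe = top.probes,
                         mean = rowMeans(bval[top.probes, ]),
                         variance = apply(bval[top.probes, ], 1, var))
```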
This section shows how to generate the composite image of stacked barplots showing probe genome region mappings among the 14,000 probes with high tissue-specific variances in 7 noncancer tissues.
script.name <- "fig4c.R"
suppressMessages(source(file.path(scripts.dir, script.name)))