In hirscheylab/tidybiology: Utility Functions and Data Sets for Tidy Biology

Batch save plots (Advanced; skip for now)

knitr::opts_chunk$set(
  echo = TRUE,
  results = 'show',
  fig.path = "OutputPlots/",
  dev = c('svg', 'png'),


  fig.width = 10,
  #In the unit of inches
  fig.height = 8,
  #In the unit of inches
  unit = "in"
)

Load data

library('scRNAseq')
scRNAData <- scRNAseq #Because ggplot2 builder needs the dataset to be in the Environment

Preview data structure

library('dplyr')

ggplot2's layering logic

Different connector

ggplot2 uses "+" rather than "%>%" to connect each command lines, or each layers. The different connector indicates a different logic.

In pipelines, commands or operations are run like water flowing inside a pipe.
In ggplot2, each layer of a plot is put together like staging in a theater.

Different layers of ggplots

library('ggplot2')

Aesthetics inheritance

Remember the order of geom objects matter if you have multiple. If geom_boxplot() is after geom_point(), the boxplot will cover up lots of the points.

If a geom object does not have its own aes(), then it will refer to the global aes().

three_organ_df <-
  scRNAData[scRNAData$transcript_length < 7500 &
              scRNAData$organ != 'other', ]

ggplot(three_organ_df) +
  aes(x = organ,
      y = transcript_length,
      color = "red") +

  geom_boxplot() + #Remember the order of geom objects matter if you have multiple. If geom_boxplot() is after geom_point(), the boxplot will cover up lots of the points
  geom_point() +

  labs(
    title = "Transcript length and GC content for genes in each organ",
    x = "Organ names",
    y = "Transcript length (bp)",
    color = 'GC content'
  ) +
  theme_minimal()

A geom object can have its own aes() for more specification. Below I have the red boxplots with black point plots.

Or I can have black boxplots and red point plots.

ggplot2 builder

Use ggplot2 builder for general data exploration, output codes, and provide more edits on the code.

Let's compare how cell 1's transcriptome is different from cell 2's.

Raw output

Edited with more details

It seems like the the difference in transcriptome of cell 1 and 2 are mainly in the genes expressed in the heart and kidney the most.

More power to ggplot2

viridis

ggplot(three_organ_df) + #Filtering the dataset on the run
  aes(x = organ,
      y = transcript_length,
      color = gene_percent_gc_content) +
  geom_point() +
  facet_wrap( ~ strand) +
  labs(
    title = "Transcript length and GC content for genes in each organ",
    x = "Organ names",
    y = "Transcript length (bp)",
    color = 'GC content'
  ) +
  theme_minimal()

ggsignif

ggplot(three_organ_df) +
  aes(x = organ,
      y = transcript_length) +

  geom_boxplot() + #Remember the order of geom objects matter if you have multiple. If geom_boxplot() is after geom_point(), the boxplot will cover up lots of the points


  labs(title = "Transcript length and GC content for genes in each organ",
       x = "Organ names",
       y = "Transcript length (bp)") +
  theme_minimal()

ggExtra

It comes with an addin called ggMarginal() for interactive data exploration.

You first create the major plot by ggplot2.

ggplot(three_organ_df) +
  aes(x = cell_2, y = cell_1) +

  geom_point(
    shape = "circle",
    size = 1.5,
    colour = 'blue',
    alpha = 0.3
  ) +
  geom_abline(intercept = 0,
              color = 'green') + #As a reference of how different



  labs(title = "Cell 1 compared to Cell 2",
       x = "Cell 1 transcriptome",
       y = "Cell 2 transcriptome") +
  theme_minimal() #I personally like to move the theme section to the end

Then you highlight the ggplot2 code chunk and use the addin, adjust the extra plots around the major plot, and output the code.

Save a favorite style to use later

Sometimes your PI might want to your graphs to look in a specific style that he or she likes. Instead of coping and pasting the theme codes everytime for every graph, you can save it in as a list and call it whenever you need it.

Original plot:

ggplot(three_organ_df) +
  aes(x = cell_2, y = cell_1) +

  geom_point(
    shape = "circle",
    size = 1.5,
    colour = 'blue',
    alpha = 0.3
  ) +
  geom_abline(intercept = 0,
              color = 'green') + #As a reference of how different


  facet_wrap(vars(organ), labeller = label_both) +
  labs(title = "Cell 1 compared to Cell 2",
       x = "Cell 1 transcriptome",
       y = "Cell 2 transcriptome") +
  theme_minimal()

What my PI likes:

ggplot(three_organ_df) +
  aes(x = cell_2, y = cell_1) +

  geom_point(
    shape = "circle",
    size = 1.5,
    colour = 'blue',
    alpha = 0.3
  ) +
  geom_abline(intercept = 0,
              color = 'green') + #As a reference of how different


  facet_wrap(vars(organ), labeller = label_both) +
  labs(title = "Cell 1 compared to Cell 2",
       x = "Cell 1 transcriptome",
       y = "Cell 2 transcriptome") +
  theme_linedraw() +


  #Below is the style that my PI specifically likes
  theme(
    #For all the fonts, the default font is "Arial" and the all the sizes are in the units of pt

    plot.title = element_text(size = 12, face = "bold"),
    plot.subtitle = element_text(size = 10, face = "bold", color = "black"),
    panel.border = element_rect(size = 1.5, color = "black"),

    axis.line = element_line(colour = '1.5', size = 1.5),
    #Jennifer likes the axis line to be of 1.5 pt thick
    axis.title = element_text(size = 10, face = "bold"),
    axis.text = element_text(size = 10, face = "bold"),

    strip.text.x = element_text(size = 10, face = "bold", color = "black"),
    strip.text.y = element_text(
      size = 10,
      face = "bold",
      color = "black",
      angle = 0
    ),
    strip.background = element_blank(),

    legend.title = element_text(size = 10, face = "bold"),
    legend.text = element_text(size = 9, face = "bold"),
    legend.key.width = unit(10, "pt") #Control the size of the legend unit. Decrease this if you have a very wide legend, which is unnecessary
  )

Save the specific styling in a list and call on it.

hirscheylab/tidybiology documentation built on May 20, 2022, 10:55 p.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

hirscheylab/tidybiology
Utility Functions and Data Sets for Tidy Biology

In hirscheylab/tidybiology: Utility Functions and Data Sets for Tidy Biology

Load data

Preview data structure

ggplot2's layering logic

Different connector

Different layers of ggplots

Aesthetics inheritance

ggplot2 builder

Raw output

Edited with more details

More power to ggplot2

viridis

ggsignif

ggExtra

Save a favorite style to use later

R Package Documentation

Browse R Packages

We want your feedback!

hirscheylab/tidybiology Utility Functions and Data Sets for Tidy Biology

In hirscheylab/tidybiology: Utility Functions and Data Sets for Tidy Biology

Load data

Preview data structure

ggplot2's layering logic

Different connector

Different layers of ggplots

Aesthetics inheritance

ggplot2 builder

Raw output

Edited with more details

More power to ggplot2

viridis

ggsignif

ggExtra

Save a favorite style to use later

R Package Documentation

Browse R Packages

We want your feedback!

hirscheylab/tidybiology
Utility Functions and Data Sets for Tidy Biology