library(learnr)
knitr::opts_chunk$set(echo = FALSE)
library(ggplot2)
library(ggpubr)
data("nutrimenu", package = "debuter")
couleurs <- c("steelblue", "limegreen", "darkolivegreen", "darkorchid")

The ggplot2 package allows you to easily create graphs with a unified syntax. In this tutorial, we are going to practice with the "nutrimenu" data.

Basic command

The basic command looks like this:

ggplot(data, aes(x, y)) + geom_****()

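Where data is the data frame, x and y are the columns mapped to the axes, and geom_****() stands for one of the geom_ functions, which determines the type of graph.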

In this tutorial, we will see the following functions:

Function | Result
---------|---------
geom_bar, geom_col | Bar plot
geom_point, geom_jitter | Scatterplot
geom_line | Line plot
geom_histogram | Histogram
geom_boxplot | Boxplot
geom_density | Density plot
geom_violin | Violin plot
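As a minimal sketch of this pattern, the code below uses the mtcars data that ships with R (an assumption, chosen so the example runs without the nutrimenu table). It shows that the same ggplot() call gives a different graph depending on which geom_ function is added to it.

```r
library(ggplot2)

# mtcars is built into R; it stands in for nutrimenu here (assumption)
base <- ggplot(mtcars, aes(wt, mpg))

# Same data and same aesthetics, two different geoms
p_points <- base + geom_point()  # scatterplot
p_line   <- base + geom_line()   # line plot

# Printing a ggplot object draws it
print(p_points)
print(p_line)
```

Storing the common part in an object such as base is only a convenience; writing the full command each time works just as well.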

While doing the exercises, familiarize yourself with the documentation for ggplot2: https://ggplot2.tidyverse.org/

The nutrimenu data

The nutrimenu table contains the nutritional data of `r nrow(nutrimenu)` recipes. The table is shown below.

DT::datatable(nutrimenu)

The "Type" column contains the recipe type.

For more information, go to Nutriwi (https://www.nutriwi.com/).

Represent a discrete variable

Complete the following command to get a bar chart of the recipe types:

ggplot( ) + geom_
ggplot(nutrimenu, aes(Type)) + geom_bar()

Change the colors and the theme of the graph by completing the next command.

ggplot( , aes(fill = )) + 
   geom_ + 
   theme_
ggplot(nutrimenu , aes(Type, fill = Type)) + 
   geom_bar() + 
   theme_bw()

Use the cut function as follows to transform the Proteines (continuous) variable into a qualitative variable.

cut(nutrimenu$Proteines, c(-1, 6, 14))
# There is nothing more to do: just run the command and observe its result
cut(nutrimenu$Proteines, c(-1, 6, 14))
# If you want, you can use the table function on this result
table(cut(nutrimenu$Proteines, c(-1, 6, 14)))

Notice how we can use this trick in a graph!

ggplot(nutrimenu, aes(cut(Proteines, c(-1, 6, 14)))) +
  geom_bar() + 
  theme_bw()

This command is not very elegant and is a little hard to read. Rather than nesting function calls, it is often better to create a new variable first and then use it in the graphics command.
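A minimal sketch of that approach is shown here. The data frame recettes, its values, the column name Niveau and the labels "low" and "high" are all hypothetical, chosen only so the example runs without the nutrimenu table.

```r
library(ggplot2)

# Hypothetical protein values (g/100 g), standing in for nutrimenu$Proteines
recettes <- data.frame(Proteines = c(0.5, 2, 5.9, 6, 6.1, 9, 13.5))

# Build the qualitative variable once, with readable labels.
# cut() intervals are right-closed by default: (-1, 6] and (6, 14]
recettes$Niveau <- cut(recettes$Proteines,
                       breaks = c(-1, 6, 14),
                       labels = c("low", "high"))

# Check the counts per class before plotting
table(recettes$Niveau)

# The graphics command now only refers to the new variable
ggplot(recettes, aes(Niveau)) +
  geom_bar() +
  theme_bw()
```

Because the intervals are closed on the right, a value of exactly 6 falls in the first class.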

Represent a continuous variable

Complete the following command to draw a boxplot of the Energie variable.

ggplot(nutrimenu) + geom
ggplot(nutrimenu, aes(Energie)) + 
  geom_boxplot() + 
  theme_bw()

Following the same pattern, make a violin plot.

ggplot(nutrimenu) + geom
ggplot(nutrimenu, aes(x = Energie, y = 0)) + 
  geom_violin() + 
  theme_bw()

Overlay the two and adjust the width and color settings to get a nice graph.

ggplot(nutrimenu) + 
  geom_violin() +
  geom_boxplot()
## Use the theme function
### - on the axis.text.y element
### - and on the axis.ticks.y element
### - which you will need to set to element_blank()
ggplot(nutrimenu, aes(x = Energie, y = 0)) + 
  geom_violin(color = NA, fill = "limegreen") + 
  geom_boxplot(width = 0.1, color = "steelblue", fill = "lightgreen", size = 1) + 
  theme_bw() + 
  # Energie is mapped to x; the y axis carries no information here
  labs(x = "Energie (kCal/100 g)", y = "") +
  theme(axis.text.y = element_blank(),
        axis.ticks.y = element_blank())

Now make a histogram of the lipid composition.

ggplot(nutrimenu) + 
  geom
ggplot(nutrimenu, aes(Lipides)) + 
  geom_histogram(breaks = seq(0, 30, 5), color = "white", fill = "goldenrod") + 
  theme_bw() + 
  labs(x = "Lipides (g/100 g)", y = "Nb. d'occurences")

Represent on the same graph a qualitative variable and a quantitative variable

Run the following command then modify it so that each box is a different color.

ggplot(nutrimenu, aes(Type, Glucides)) + geom_boxplot()
ggplot(nutrimenu, aes(Type, Glucides, color = Type)) + 
  geom_boxplot() + 
  theme_bw() + 
  coord_flip() +
  labs(x = "", y = "Glucides (g/100 g)", color = "")

I selected 3 colors for you, and used the following command to store them in a vector called couleurs.

couleurs <- c("steelblue", "limegreen", "darkolivegreen")

This vector is used below. Run this code and observe the result. Then try other colors of your choice: use color names (see the result of the colors() function) or hexadecimal codes.

couleurs <- c("steelblue", "limegreen", "darkolivegreen")
ggplot(nutrimenu, aes(Fibres, Type, fill = Type)) + 
  geom_violin() + 
  scale_fill_manual(values = couleurs)
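The sketch below illustrates both ways of specifying a color, and one possible way of tying each color to a level by naming the vector. The data frame demo, its values and the level names "A", "B" and "C" are hypothetical stand-ins for the nutrimenu table.

```r
library(ggplot2)

# Check that the color names are known to R: colors() lists all of them
all(c("steelblue", "limegreen", "darkolivegreen") %in% colors())

# Convert a named color to its hexadecimal code
hex <- rgb(t(col2rgb("steelblue")), maxColorValue = 255)
hex

# Hypothetical data: three recipe types with a few fibre values each
demo <- data.frame(
  Type   = rep(c("A", "B", "C"), each = 4),
  Fibres = c(1, 2, 2, 3, 2, 3, 4, 5, 0.5, 1, 1.5, 3)
)

# Naming the vector ties each color to a level, whatever the level order.
# Color names and hexadecimal codes can be mixed freely.
couleurs <- c(A = hex, B = "limegreen", C = "#556B2F")

ggplot(demo, aes(Fibres, Type, fill = Type)) +
  geom_violin() +
  scale_fill_manual(values = couleurs)
```

Note that scale_fill_manual needs at least as many colors as there are levels in the variable mapped to fill.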

Scatterplot

Complete the following command to plot the energy of the recipes as a function of their carbohydrate content.

ggplot() + geom
ggplot(nutrimenu, aes(Glucides, Energie)) +
  geom_point() + 
  theme_bw() + 
  labs(y = "Energie (kCal / 100g)", x = "Glucides (g/100 g)")

Color the points, first with the Proteines variable and then with the Type variable. What do you notice?

ggplot() + geom
ggplot(nutrimenu, aes(Glucides, Energie, color = Proteines)) +
  geom_point() + 
  theme_bw() + 
  labs(y = "Energie (kCal / 100g)", x = "Glucides (g/100 g)")

ggplot(nutrimenu, aes(Glucides, Energie, color = Type)) +
  geom_point() + 
  theme_bw() + 
  labs(y = "Energie (kCal / 100g)", x = "Glucides (g/100 g)")

When color is mapped to a quantitative variable, you can change the colors with scale_color_continuous, scale_color_gradient or scale_color_gradientn. When it is mapped to a discrete variable, use scale_color_manual or scale_color_brewer. Test these functions on the previous example.
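As a rough illustration of these scale functions, the sketch below uses the built-in mtcars data instead of nutrimenu (an assumption, so that it runs on its own). The colors and the "Dark2" palette are arbitrary choices.

```r
library(ggplot2)

# Continuous variable mapped to color: a gradient between two colors
p_cont <- ggplot(mtcars, aes(wt, mpg, color = hp)) +
  geom_point() +
  scale_color_gradient(low = "steelblue", high = "darkorchid")

# Discrete variable mapped to color: one manually chosen color per level
p_manual <- ggplot(mtcars, aes(wt, mpg, color = factor(cyl))) +
  geom_point() +
  scale_color_manual(values = c("4" = "steelblue",
                                "6" = "limegreen",
                                "8" = "darkolivegreen"))

# Discrete variable mapped to color: a ready-made ColorBrewer palette
p_brewer <- ggplot(mtcars, aes(wt, mpg, color = factor(cyl))) +
  geom_point() +
  scale_color_brewer(palette = "Dark2")

print(p_cont)
print(p_manual)
print(p_brewer)
```

cyl is stored as a number in mtcars, so it is wrapped in factor() to make ggplot2 treat it as discrete.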

Combine graphs

Among other things, the ggpubr package lets you combine several ggplot objects into a single figure.

Note: other packages exist to combine ggplot objects into complex graphs, like cowplot, gridExtra or patchwork, which is a testament to the success of ggplot2.

Run the following commands and see how they allow the creation of a complex graph in multiple panels.

library(ggpubr)
theme_set(theme_bw())

# boxplots
bxp <- ggplot(nutrimenu, aes(Type, Glucides, color = Type)) + geom_boxplot()
# jitter plots
jp <- ggplot(nutrimenu, aes(x = Type, y = Glucides, color = Type)) + geom_jitter()
# violin plots
vp <- ggplot(nutrimenu, aes(Type, Glucides, color = Type, fill = Type)) + geom_violin() 
# density plots
dens <- ggplot(nutrimenu, aes(Glucides, fill = Type)) + geom_density() 

# Arrange
ggarrange(bxp, jp, vp, dens, ncol = 2, nrow = 2,  common.legend = TRUE, legend = "bottom") 

Here are the key arguments of the ggarrange function. Remember that you can access this information and more by calling the function's help file with the ?ggarrange command.

Argument | Use
---------- | --------
ncol | Number of columns in the graph grid.
nrow | Number of rows in the graph grid.
widths | Vector of the widths of the graphs.
heights | Vector of the heights of the graphs.
legend | Character string specifying the position of the legend ("top", "bottom", "left" or "right"). To remove the legend, use "none".
common.legend | Logical value. The default is FALSE. If TRUE, a single common legend is created.
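The following sketch shows some of these arguments together, again on the built-in mtcars data rather than nutrimenu (an assumption made to keep the example self-contained). The relative widths c(2, 1) are an arbitrary choice.

```r
library(ggplot2)
library(ggpubr)

# Two plots sharing the same color mapping, so a common legend makes sense
p1 <- ggplot(mtcars, aes(factor(cyl), mpg, color = factor(cyl))) +
  geom_boxplot()
p2 <- ggplot(mtcars, aes(factor(cyl), mpg, color = factor(cyl))) +
  geom_jitter()

# One row, two columns; the first panel is twice as wide as the second
fig <- ggarrange(p1, p2,
                 ncol = 2, nrow = 1,
                 widths = c(2, 1),
                 common.legend = TRUE, legend = "right")
fig
```

The values in widths are relative, so c(2, 1) and c(4, 2) give the same layout.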



vguillemot/ReMUSE documentation built on Dec. 23, 2021, 3:09 p.m.