library(learnr) knitr::opts_chunk$set(echo = FALSE) library(ggplot2) library(ggpubr) data("nutrimenu", package = "debuter") couleurs <- c("steelblue", "limegreen", "darkolivegreen", "darkorchid")
The ggplot2
package allows you to easily create graphs with a unified syntax. In this tutorial, we are going to practice with the "nutrimenu" data.
The basic command will look like this
ggplot(data, aes(x, y)) + geom_****()
Where
ggplot
draws an empty canvasdata
is a data-frame containing all the information needed to draw the graphaes()
is a function which allows to declare the aesthetic parameters of the graph (the coordinates of points, lines or bars, the way to color them, the labels etc ...)geom_****
is a function that specifies a "drawing layer"In this tutorial, we will see the following functions:
Function | Result
---------|---------
geom_bar
, geom_col
| Bar plot
geom_point
, geom_jitter
| Scatterplot
geom_line
| Line plot
geom_histogram
| Histograms
geom_boxplot
| Boxplot
geom_density
| Density plot
geom_violin
| Violin plot
While doing the exercises, familiarize yourself with the documentation for ggplot2: https://ggplot2.tidyverse.org/
The data in the nutrimenu
table contains the nutritional data of r nrow nutrimenu)
recipes. The table is shown below.
DT::datatable(nutrimenu)
The "Type" columns contains the recipe type.
For more information, go to Nutriwi (https://www.nutriwi.com/).
Complete the following command to get a bar chart of recipe types
ggplot( ) + geom_
ggplot(nutrimenu, aes(Type)) + geom_bar()
Change the colors and the theme of the graph by completing the next command.
ggplot( , aes(fill = )) + geom_ + theme_
ggplot(nutrimenu , aes(Type, fill = Type)) + geom_bar() + theme_bw()
Use the cut
function as follows to transform the Proteines (continuous) variable into a qualitative variable.
cut(nutrimenu$Proteines, c(-1, 6, 14))
# There is nothing more to do: just run the command and observe its result cut(nutrimenu$Proteines, c(-1, 6, 14)) # If you want, you can use the table function on this result table(cut(nutrimenu$Proteines, c(-1, 6, 14)))
Notice how we can use this trick in a graph!
ggplot(nutrimenu, aes(cut(Proteines, c(-1, 6, 14)))) + geom_bar() + theme_bw()
It is not very pretty as a command and a little difficult to read. Sometimes, rather than nesting commands, it's better to create a new variable and use that new variable in the graphics command.
Complete the following function to carry out a boxplot of the variable Energie
.
ggplot(nutrimenu) + geom
ggplot(nutrimenu, aes(Energie)) + geom_boxplot() + theme_bw()
On the same model, make a violin plot
ggplot(nutrimenu) + geom
ggplot(nutrimenu, aes(x = Energie, y = 0)) + geom_violin() + theme_bw()
Overlap the two and adjust the width and color settings to have a nice graph.
ggplot(nutrimenu) + geom_violin() + geom_boxplot()
## Use the theme function ### - on the axis.text.y element ### - and on the axis.ticks.y element ### - which you will need to turn into element_blank ()
ggplot(nutrimenu, aes(x = Energie, y = 0)) + geom_violin(color = NA, fill = "limegreen") + geom_boxplot(width = 0.1, color = "steelblue", fill = "lightgreen", size = 1) + theme_bw() + labs(x = "", y = "Energie (kCal/100 g)") + theme(axis.text.y = element_blank(), axis.ticks.y = element_blank())
Now make a histogram of the lipid composition.
ggplot(nutrimenu) + geom
ggplot(nutrimenu, aes(Lipides)) + geom_histogram(breaks = seq(0, 30, 5), color = "white", fill = "goldenrod") + theme_bw() + labs(x = "Lipides (g/100 g)", y = "Nb. d'occurences")
Run the following command then modify it so that each box is a different color.
ggplot(nutrimenu, aes(Type, Glucides)) + geom_boxplot()
ggplot(nutrimenu, aes(Type, Glucides, color = Type)) + geom_boxplot() + theme_bw() + coord_flip() + labs(x = "", y = "Glucides (g/100 g)", color = "")
I selected 3 colors for you, and used the following command to store them in a vector called couleurs
.
couleurs <- c("steelblue", "limegreen", "darkolivegreen")
This vector is used below. Run this code and observe the result. Then try other colors of your choice: use color names (see the result of the colors()
function) or hexadecimal codes.
couleurs <- c("steelblue", "limegreen", "darkolivegreen") ggplot(nutrimenu, aes(Fibres, Type, fill = Type)) + geom_violin() + scale_fill_manual(values = couleurs)
Complete the following command to represent the energy according to the carbohydrate content of the recipes.
ggplot() + geom
ggplot(nutrimenu, aes(Glucides, Energie)) + geom_point() + theme_bw() + labs(y = "Energie (kCal / 100g)", x = "Glucides (g/100 g)")
Color the points, first with the Proteines variable and then with the Type variable. What do you notice?
ggplot() + geom
ggplot(nutrimenu, aes(Glucides, Energie, color = Proteines)) + geom_point() + theme_bw() + labs(y = "Energie (kCal / 100g)", y = "Glucides (g/100 g)") ggplot(nutrimenu, aes(Glucides, Energie, color = Type)) + geom_point() + theme_bw() + labs(y = "Energie (kCal / 100g)", y = "Glucides (g/100 g)")
You can change colors defined on a quantitative variable with the function scale_color_continuous
, or scale_color_gradient
, or scale_color_gradientn
, or colors defined on a discrete variable with the scale_color_manual
or scale_colour_brewer
function. Test these functions on the previous example.
The functionalities of the ggpubr
package allow, among other things, to combine graphs from ggplot
objects.
Note: other packages exist to combine ggplot
objects into complex graphs, like cowplot
, gridExtra
or patchwork
, which is a testament to the success of ggplot2
.
Run the following commands and see how they allow the creation of a complex graph in multiple panels.
library(ggpubr) theme_set(theme_bw()) # boxplots bxp <- ggplot(nutrimenu, aes(Type, Glucides, color = Type)) + geom_boxplot() # jitter plots jp <- ggplot(nutrimenu, aes(x = Type, y = Glucides, color = Type)) + geom_jitter() # violin plots vp <- ggplot(nutrimenu, aes(Type, Glucides, color = Type, fill = Type)) + geom_violin() # violin plots dens <- ggplot(nutrimenu, aes(Glucides, fill = Type)) + geom_density() # Arrange ggarrange(bxp, jp, vp, dens, ncol = 2, nrow = 2, common.legend = TRUE, legend = "bottom")
Here are the key arguments of the ggarrange
function. Remember that you can access this information and more by calling the function's help file with the ?ggarrange
command.
Argument | Use
---------- | --------
ncol
| Number of columns in the graph grid.
nrow
| Number of rows in the graph grid.
widths
| Vector of the widths of the graphs.
heights
| Vector of the heights of the graphs.
legend
| Character string specifying the position of the legend ("top"
, "bottom"
, "left"
or "right"
). To remove the caption, use "none"
.
common.legend
| Logical value. The default is FALSE
. If TRUE
, a common unique legend will be created.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.