If you attended the Introduction to R course with us, you will already be familiar with some of the basic ggplot2 concepts. This practical serves as a reminder of some of those concepts, whilst also introducing some new ones. If you didn't attend that course, use this practical as an introduction to the basic concepts of ggplot2. Some of these plots aren't particularly useful; we are just using them for illustration purposes.

\newthought{To begin with}, load the ggplot2 package:^[The ggplot2 package is automatically installed with jrGgplot2.]

library("ggplot2")

\noindent Next we load the beauty data set:^[Details of the beauty data set can be found at the end of this practical.]

library("jrGgplot2")
data(Beauty, package = "jrGgplot2")

\noindent When loading in data, it's always a good idea to carry out a sanity check. I tend to use the commands:

head(Beauty)
colnames(Beauty)
dim(Beauty)
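
\noindent A few other base R commands can be useful for the same kind of sanity check. The sketch below is only an illustration: so that it runs without jrGgplot2 installed, it uses a hypothetical stand-in data frame, toy, typed in from the five rows shown in Table 2. The same calls should work on Beauty.

```r
# Toy stand-in for Beauty: the five rows printed in Table 2 (not the full data)
toy = data.frame(
  tenured = c(0, 1, 1, 1, 0),
  minority = c(1, 0, 0, 0, 0),
  age = c(36, 59, 51, 40, 31),
  evaluation = c(4.3, 4.5, 3.7, 4.3, 4.4),
  gender = c("Female", "Male", "Male", "Female", "Female"),
  students = c(43, 20, 55, 46, 48),
  beauty = c(0.202, -0.826, -0.660, -0.766, 1.421)
)

str(toy)                        # column types at a glance
summary(toy$age)                # range check on a numeric column
anyNA(toy)                      # are there any missing values?
table(toy$gender, toy$tenured)  # cross-tabulate two categorical columns
```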

Scatter plots

Scatter plots are created using the point geom. Let's start with a basic scatter plot:

ggplot(data = Beauty) +
  geom_point(aes(x = age, y = beauty))

\noindent To save typing, we can also store the plot as a variable:^[In this practical, we are creating the plots in a slightly verbose way.]

g = ggplot(data = Beauty)
g1 = g + geom_point(aes(x = age, y = beauty))

\noindent To view this plot, type g1.
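
\noindent Typing the name of a stored plot only draws it at the console. Inside a function or a loop, R does not auto-print, so an explicit print() is needed. The sketch below illustrates this, and also that adding to a stored plot gives a new object. It uses a hypothetical stand-in data frame, toy, built from the five rows in Table 2, and the axis labels are our own choice.

```r
library("ggplot2")

# Toy stand-in for Beauty: the five rows printed in Table 2 (not the full data)
toy = data.frame(
  age = c(36, 59, 51, 40, 31),
  beauty = c(0.202, -0.826, -0.660, -0.766, 1.421)
)

g = ggplot(data = toy)
g1 = g + geom_point(aes(x = age, y = beauty))

# Explicit printing: needed inside functions, loops and sourced scripts
print(g1)

# Adding to g1 returns a new plot object; g1 itself is left unchanged
g2 = g1 + labs(x = "Age", y = "Beauty score")
print(g2)
```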

The arguments x and y are called aesthetics. For geom_point, these parameters are required. This particular geom has other aesthetics: shape, colour, size and alpha.^[These aesthetics are usually available for most geoms.] Here are some things to try out.

g + geom_point(aes(x = age, y = beauty, colour = gender))

\noindent or

g + geom_point(aes(x = age, y = beauty, colour = gender,
                   alpha = evaluation))

- Some aesthetics, like shape, must be discrete, so we have to transform the variable into a character or factor, for example shape = factor(tenured).
- Are there any differences between numeric values like tenured and characters like gender for some aesthetics? What happens if you convert tenured to a factor in the colour aesthetic? For example, colour = factor(tenured).
- What happens if you set colour (or some other aesthetic) outside of the aes function? For example, compare

g + geom_point(aes(x = age, y = beauty, colour = "blue"))

\noindent to

g + geom_point(aes(x = age, y = beauty), colour = "blue")
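
\noindent One way to check your answer to the last question is to look at the data ggplot2 actually draws, using layer_data(). The sketch below assumes a hypothetical stand-in data frame, toy, built from the five rows in Table 2; the names p_map and p_set are ours.

```r
library("ggplot2")

# Toy stand-in for Beauty: the five rows printed in Table 2 (not the full data)
toy = data.frame(
  age = c(36, 59, 51, 40, 31),
  beauty = c(0.202, -0.826, -0.660, -0.766, 1.421)
)
g = ggplot(data = toy)

# Inside aes(): "blue" is treated as a data value and mapped through a colour scale
p_map = g + geom_point(aes(x = age, y = beauty, colour = "blue"))

# Outside aes(): "blue" is used directly as the colour of every point
p_set = g + geom_point(aes(x = age, y = beauty), colour = "blue")

# Compare the colours that end up in each layer
unique(layer_data(p_map)$colour)
unique(layer_data(p_set)$colour)
```

\noindent Note also which of the two plots gets a legend.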

Box plots

The box plot geom has the following aesthetics: x, y, colour, fill, linetype, weight, size and alpha. We can create a basic boxplot using the following commands:

g + geom_boxplot(aes(x = gender, y = beauty))

\noindent Similar to the point geom, we can add in aesthetics:

g + geom_boxplot(aes(x = gender, y = beauty,
                     colour = factor(tenured)))

\noindent Why do you think we have to convert tenured to a discrete factor?

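As before, experiment with the different aesthetics. For some of the aesthetics, you will need to convert the continuous variables to discrete variables. For example, this will not split the boxes by tenured (depending on your ggplot2 version, you may get an error or a warning that the colour aesthetic was dropped):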

g + geom_boxplot(aes(x = gender, y = beauty, colour = tenured))

\noindent while this is OK^[Why?]

g + geom_boxplot(aes(x = gender, y = beauty,
                     colour = factor(tenured)))
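
\noindent As a hint, it can help to look at what factor() does to the variable and at how many boxes the plot ends up with. The sketch below uses a hypothetical stand-in data frame, toy, built from the five rows in Table 2, so the counts it gives are for those five rows only, not for the full data set.

```r
library("ggplot2")

# Toy stand-in for Beauty: the five rows printed in Table 2 (not the full data)
toy = data.frame(
  tenured = c(0, 1, 1, 1, 0),
  gender = c("Female", "Male", "Male", "Female", "Female"),
  beauty = c(0.202, -0.826, -0.660, -0.766, 1.421)
)

class(toy$tenured)               # "numeric": treated as continuous
tenured_f = factor(toy$tenured)
levels(tenured_f)                # "0" "1": two discrete groups

# Discrete variables in aes() define the groups, so each
# gender/tenured combination present in the data gets its own box
p = ggplot(toy) +
  geom_boxplot(aes(x = gender, y = beauty, colour = factor(tenured)))
nrow(layer_data(p))              # one row per box
```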

\noindent Make sure you play about with the different aesthetics.

Combining plots

The key idea with ggplot2 is to think in terms of layers, not in terms of plot "types".^[In the lectures we will discuss what this means.] For example,

g + geom_boxplot(aes(x = gender, y = beauty,
                     colour = factor(tenured))) +
  geom_point(aes(x = gender, y = beauty))
g + geom_boxplot(aes(x = gender, y = beauty,
                     colour = factor(tenured))) +
  geom_jitter(aes(x = gender, y = beauty))
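
\noindent When several layers share the same aesthetics, a less verbose option is to put the shared mapping in the ggplot() call, where every layer inherits it, and give each geom only what is specific to it. The following is a minimal sketch on a hypothetical stand-in data frame, toy, built from the five rows in Table 2; the jitter width of 0.1 is an arbitrary choice.

```r
library("ggplot2")

# Toy stand-in for Beauty: the five rows printed in Table 2 (not the full data)
toy = data.frame(
  tenured = c(0, 1, 1, 1, 0),
  gender = c("Female", "Male", "Male", "Female", "Female"),
  beauty = c(0.202, -0.826, -0.660, -0.766, 1.421)
)

# x and y are declared once and inherited by both layers
p = ggplot(toy, aes(x = gender, y = beauty)) +
  geom_boxplot(aes(colour = factor(tenured))) +  # layer 1: boxes, coloured by tenured
  geom_jitter(width = 0.1)                       # layer 2: the raw points, jittered sideways
print(p)
```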

Bar plots

The bar geom has the following aesthetics: x, colour, fill, size, linetype, weight and alpha. Here is a command to get started:

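g + geom_bar(aes(x = factor(tenured)))

\noindent Next, create a new variable, dec, that rounds age to one significant figure (the nearest decade for two-digit ages):

Beauty$dec = factor(signif(Beauty$age, 1))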

\noindent then plot:

g = ggplot(data = Beauty)
g + geom_bar(aes(x = gender, fill = dec))
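
\noindent To see what signif(age, 1) does to the ages behind the fill colours, it can be tried on a few values. The ages below are the five listed in Table 2; the vector name ages is ours.

```r
# The five ages shown in Table 2
ages = c(36, 59, 51, 40, 31)

signif(ages, 1)        # 40 60 50 40 30
dec = factor(signif(ages, 1))
levels(dec)            # "30" "40" "50" "60"
table(dec)             # how many ages fall in each decade

# signif() keeps one significant figure, so it only means
# "nearest decade" for two-digit values
signif(123, 1)         # 100
round(123, -1)         # 120
```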

\noindent We can adjust the layout of this bar plot using ggplot2's position adjustments. The five possible adjustments are listed in Table 1. The default adjustment is stack:

g + geom_bar(aes(x = gender, fill = dec),
             position = "stack")
g + geom_bar(aes(x = gender, fill = dec),
             position = "dodge")
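
\noindent The fill adjustment is worth trying too: it stacks the bars and then rescales each stack to height one, so the plot shows proportions rather than counts. The sketch below checks this on a hypothetical stand-in data frame, toy, built from the five rows in Table 2; the names p_fill, d and tops are ours.

```r
library("ggplot2")

# Toy stand-in for Beauty: the five rows printed in Table 2 (not the full data)
toy = data.frame(
  age = c(36, 59, 51, 40, 31),
  gender = c("Female", "Male", "Male", "Female", "Female")
)
toy$dec = factor(signif(toy$age, 1))

p_fill = ggplot(toy) +
  geom_bar(aes(x = gender, fill = dec), position = "fill")
print(p_fill)

# Top edge of the highest bar segment within each gender
d = layer_data(p_fill)
tops = tapply(d$ymax, as.numeric(d$x), max)
tops   # every stack should reach 1
```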

\begin{table}[t]
\centering
\begin{tabular}{@{}ll@{}}
\toprule
Adjustment & Description \\
\midrule
\texttt{dodge} & Adjust position by dodging overlaps to the side \\
\texttt{fill} & Stack overlapping elements; standardise stack height \\
\texttt{identity} & Do nothing \\
\texttt{jitter} & Jitter points \\
\texttt{stack} & Stack overlapping elements \\
\bottomrule
\end{tabular}
\caption{Position adjustments - table 4.5 in the ggplot2 book.}
\label{T1}
\end{table}

\newpage

The beauty data set

\begin{table}[!t]
\centering
\caption{The first five rows of the beauty data set. There are a total of 463 course evaluations.}
\begin{tabular}{@{}llllll r@{.}l@{}}
\toprule
tenured & minority & age & evaluation & gender & students & \multicolumn{2}{l}{beauty} \\
\midrule
0 & 1 & 36 & 4.3 & Female & 43 & 0&202 \\
1 & 0 & 59 & 4.5 & Male & 20 & -0&826 \\
1 & 0 & 51 & 3.7 & Male & 55 & -0&660 \\
1 & 0 & 40 & 4.3 & Female & 46 & -0&766 \\
0 & 0 & 31 & 4.4 & Female & 48 & 1&421 \\
\bottomrule
\end{tabular}
\label{T2}
\end{table}

This data set is from a study where researchers were interested in whether a lecturer's attractiveness affected their course evaluation.\cite{Hamermesh2003} This is a cleaned version of the data set and contains the following variables: tenured, minority, age, evaluation, gender, students and beauty.

Table 2 gives the first few rows of the data set.



jr-packages/jrGgplot2 documentation built on Sept. 20, 2020, 2:59 a.m.