First load the ggplot2 package
library("ggplot2")
and the OK Cupid data set
data(okcupid, package = "jrSouth")
A scatter plot is created with the geom_point() command:
# alpha makes the points transparent
ggplot(data = okcupid) + geom_point(aes(x = age, y = height), alpha = 0.2)
A plot can also be stored as an object and built up in layers:
g = ggplot(data = okcupid)
g1 = g + geom_point(aes(x = age, y = height), alpha = 0.2)
Running g1 now produces the graph:
g1
The arguments x and y here are called aesthetics. What do you think happens if you omit the y aesthetic? (This gives an error.)
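A minimal sketch of that failure, using ggplot2's bundled mpg data in place of okcupid so that it runs without jrSouth (the names p and res, and the choice of the displ column, are assumptions of this example). The error is raised when the plot is built or printed, not when the object is created, so the sketch forces the build inside tryCatch():

```r
library("ggplot2")

# Scatter plot with the y aesthetic deliberately left out
p = ggplot(data = mpg) + geom_point(aes(x = displ))

# Creating p succeeds; the error only appears once the plot is built,
# which is what printing p would trigger. Capture it instead of stopping.
res = tryCatch(ggplot_build(p), error = function(e) e)

# Check whether building the plot failed
inherits(res, "error")
```

Printing conditionMessage(res) shows the message ggplot2 produced, which names the missing aesthetic.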
For geom_point(), both the x and y aesthetics are required. This particular geom has other aesthetics: shape, colour, size and alpha.^[These are available for most geoms. For a collection of aesthetics see the relevant help pages.] For instance, we can map the variable sex to the colour aesthetic by including it inside aes():
g + geom_point(aes(x = age, y = height, colour = sex))
Change colour = sex to colour = height. Why do you think there's a change in the legend?
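One way to see why the legend changes: a categorical variable is given a discrete colour scale (one key per level), while a numeric variable is given a continuous scale (a colour bar). A small sketch with the bundled mpg data, where drv and hwy stand in for sex and height as assumptions of this example:

```r
library("ggplot2")

g_mpg = ggplot(data = mpg)

# drv is a character column, so colour is mapped with a discrete scale
is.character(mpg$drv)
p_discrete = g_mpg + geom_point(aes(x = displ, y = cty, colour = drv))

# hwy is numeric, so colour is mapped with a continuous scale
is.numeric(mpg$hwy)
p_continuous = g_mpg + geom_point(aes(x = displ, y = cty, colour = hwy))

p_discrete    # legend shows one key per drive type
p_continuous  # legend shows a colour bar
```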
The geom_bar() function can be used to create a bar chart. It requires only one aesthetic, x. The frequency of each value of that variable is calculated and shown as a bar. For example:
g + geom_bar(aes(x = body_type))
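To check that the bar heights really are frequencies, the counts computed by geom_bar() can be compared with table(). This sketch assumes the bundled mpg data and its class column in place of okcupid and body_type; layer_data() returns the data that ggplot2 computed for a layer.

```r
library("ggplot2")

p_bar = ggplot(data = mpg) + geom_bar(aes(x = class))

# Bar heights as computed by ggplot2 for the first (only) layer
bar_counts = layer_data(p_bar)$count

# Frequencies computed directly from the data
tab_counts = as.vector(table(mpg$class))

# The two sets of counts should agree
all(bar_counts == tab_counts)
```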
Axis labels can be changed by adding + xlab("Body Type") and + ylab("Total"):
g + geom_bar(aes(x = body_type)) + xlab("Body Type") + ylab("Total")
The bars can also be coloured by sex (using the aesthetics colour and fill).
# What happens if you only have colour or only fill?
g + geom_bar(aes(x = body_type, colour = sex, fill = sex))
The x and y axes can be swapped using a coord_flip() layer added to your graph.
g + geom_bar(aes(x = body_type, colour = sex, fill = sex)) + coord_flip()
I am not too keen on how the female and male bars are displayed on top of each other. The argument that controls this is position. The default is "stack"; for example, we can put the bars next to each other using
g + geom_bar(aes(x = body_type, colour = sex, fill = sex), position = "dodge") +
coord_flip()
Other values you might try here are position = "fill", position = "jitter", or position = "identity".
What does the fill position argument do?
g + geom_bar(aes(x = body_type, colour = sex, fill = sex), position = "fill") + coord_flip()
# puts the values on a common scale (all sum to 1)
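The "all sum to 1" behaviour can be checked numerically. A hedged sketch, again assuming the bundled mpg data (class and drv in place of body_type and sex): with position = "fill" the top of every stacked bar should sit at 1.

```r
library("ggplot2")

p_fill = ggplot(data = mpg) +
  geom_bar(aes(x = class, fill = drv), position = "fill")

# Data computed for the bar layer, including ymin/ymax of each segment
d = layer_data(p_fill)

# Top of the stacked bar at each x position
bar_tops = tapply(d$ymax, as.integer(d$x), max)

# Every bar should reach (numerically) 1
all(abs(bar_tops - 1) < 1e-8)
```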
Next, create a boxplot of age from the okcupid data. The x = 1 in the code below lets us have just a single boxplot for all ages:
g = ggplot(okcupid)
g + geom_boxplot(aes(x = 1, y = age))
Swap x = 1 for x = smokes to get a boxplot for each group:
g + geom_boxplot(aes(x = smokes, y = age))
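The difference between a constant and a categorical x can be seen in the number of boxes ggplot2 computes. A small sketch under the assumption that the bundled mpg data (hwy and drv) stands in for okcupid (age and smokes):

```r
library("ggplot2")

g_mpg = ggplot(data = mpg)

# A constant x puts every observation in one group: a single boxplot
p_single = g_mpg + geom_boxplot(aes(x = 1, y = hwy))

# A categorical x gives one boxplot per level of drv
p_grouped = g_mpg + geom_boxplot(aes(x = drv, y = hwy))

# layer_data() has one row of summary statistics per box drawn
nrow(layer_data(p_single))
nrow(layer_data(p_grouped))
```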
data(mpg, package = "ggplot2")
# mpg$drv = as.character(mpg$drv)
# mpg[mpg$drv == "f",]$drv = "Front"
# mpg[mpg$drv == "r",]$drv = "Rear"
# mpg[mpg$drv == "4",]$drv = "4wd"
# mpg$drv = factor(mpg$drv,
#                  levels = c("Front", "Rear", "4wd"))
g = ggplot(data = mpg, aes(x = displ, y = hwy))
g1 = g + geom_point() + stat_smooth(linetype = 2) +
  xlab("Displacement") + ylab("Highway mpg")
g2 = g + geom_point() + stat_smooth(aes(colour = drv))
g1
The aim of this section is to recreate the graphics in figures 1 and 2. Feel free to experiment. To begin, load the package
library("ggplot2")
and the mpg data set
data(mpg, package = "ggplot2")
dim(mpg)
1) Figure 1: Create a scatter plot of engine displacement, displ, against highway mpg, hwy. To get started:
ggplot(data=mpg, aes(x=displ, y=hwy)) + geom_point() + xlab("Displacement")
Now add a dashed loess line and change the y-axis label. Hint: try stat_smooth() and ylab('New label').
g2
g1 = g + geom_point() + stat_smooth(linetype=2) + xlab("Displacement") + ylab("Highway mpg")
2) Figure 2: Using stat_smooth(), add a loess line conditional on the drive.
g2 = g + geom_point() + stat_smooth(aes(colour=drv))