knitr::opts_chunk$set(echo = TRUE, eval = TRUE, warning=FALSE, message=FALSE, fig.width=7)
Please note: The graphs shown at each question are only suggested plots for the solution. You may want to reproduce them or create a different plot to answer the corresponding question.
To get you familiar with the underlying ggplot2 concepts, we'll recreate some standard graphics. Some of these plots aren't particularly useful; we are just using them for illustration purposes.
To begin with, load the ggplot2 package:
library("ggplot2")
Next, we load the movies data set:
library(ggplot2movies)
data(movies, package = "ggplot2movies")
# Details of the movies dataset can be displayed by:
?movies
When loading in data, it's a good idea to check some basic characteristics:
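str(movies)    # structure: column types and first values
dim(movies)    # number of rows and columns
names(movies)  # column names
head(movies)   # first six rows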
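Beyond structure and dimensions, it can also help to count missing values per column before plotting. The sketch below uses a small made-up data frame (`toy` is hypothetical and not part of this practical) so that it runs without the movies data. The same `colSums(is.na(...))` call could be applied to `movies`.

```r
# Hypothetical toy data standing in for movies; the values are made up.
toy <- data.frame(
  year   = c(1995, 2001, 2003),
  budget = c(NA, 5e6, NA),
  rating = c(6.1, 7.4, 5.0)
)

# is.na() gives a logical matrix; colSums() counts the TRUEs per column.
na_counts <- colSums(is.na(toy))
na_counts
```

Columns with many missing values are worth keeping in mind, since ggplot2 drops rows with missing aesthetics when drawing.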
Feel free to experiment with your own ideas. I present some graphs as a reference that you may try to reproduce if you wish.
g = ggplot(data=movies, aes(x=year))
g1 = g + geom_histogram(binwidth = 1, fill="#2b8cbe", alpha=0.6) +
  xlab("Year") + ylab("Number of movies produced")
g1
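# TIP: You first need to create a Genre variable.
# Columns 18 to 24 of movies are the 0/1 genre indicators.
# A movie flagged in several genres keeps the last matching one.
genre <- rep(0, nrow(movies))
for(i in 18:24) {
  genre[movies[,i]==1] <- names(movies)[i]
}
genre[genre==0] <- "Unknown"
movies$Genre <- genre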
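The tip above selects the genre indicator columns by position (18 to 24), which breaks silently if the column order changes. As a minimal sketch of the same idea using column names, the example below works on a small made-up data frame (`toy` and `genre_cols` are hypothetical names introduced here for illustration). For `movies`, `genre_cols` would hold the names of the seven indicator columns.

```r
# Hypothetical toy data with 0/1 genre indicator columns; values are made up.
toy <- data.frame(
  title  = c("A", "B", "C", "D"),
  Action = c(1, 0, 0, 1),
  Comedy = c(0, 1, 0, 1),
  Drama  = c(0, 0, 0, 0)
)

# Select the indicator columns by name instead of by position.
genre_cols <- c("Action", "Comedy", "Drama")

# Start with "Unknown" so that unflagged rows need no second pass.
genre <- rep("Unknown", nrow(toy))
for (col in genre_cols) {
  # Later columns overwrite earlier ones, as in the index-based loop.
  genre[toy[[col]] == 1] <- col
}
toy$Genre <- genre
toy$Genre
```

Row "D" is flagged as both Action and Comedy and ends up labelled Comedy, which mirrors the overwrite behaviour of the original loop.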
# define a vector for colors to be used in the plot
clr <- c('#8dd3c7','#ffffb3','#bebada','#fb8072','#80b1d3','#fdb462','#b3de69', '#000000')
ggplot(movies, aes(x=year, fill=Genre)) +
  geom_histogram(binwidth=1) +
  scale_fill_manual(values = clr) +
  xlab("Year") + ylab("Number of movies produced") +
  ylim(0,1900) + xlim(1890,2005)
ggplot(movies, aes(x=rating)) +
  geom_density(fill="#2b8cbe", alpha=0.6) +
  ylab("Density of Movie Rating") + xlab("Score (out of 10)")
# Note: stat_summary() takes `fun`; the older `fun.y` argument is deprecated since ggplot2 3.3.0.
ggplot(movies, aes(x=factor(Genre), y=rating)) +
  xlab("") + ylab("Rating (out of 10)") +
  geom_violin(fill="red", alpha=0.4) +
  stat_summary(fun = median, geom='point')
# Note: stat_binhex() requires the hexbin package to be installed.
ggplot(movies, aes(x=votes, y=rating)) +
  xlab("Votes") + ylab("Rating") +
  stat_binhex() +
  scale_fill_gradient(low="lightblue", high="red", breaks=c(0, 1500, 3000, 5000), limits=c(0, 5000))