library(gradethis) library(learnr) library(qsslearnr) library(tidyverse) tutorial_options(exercise.checker = gradethis::grade_learnr) knitr::opts_chunk$set(echo = FALSE) tut_reptitle <- "QSS Tutorial 4: Output Report" data(STAR, package = "qss") star <- STAR
quiz(caption = "", question("In section 3.6 of QSS, what kind of plot are we using to visualize ideological differences between Democrats and Republicans?", answer("scatterplot", correct = TRUE), answer("bar plot"), answer("histogram")), question("To estimate the correlation between two variables x and y, we need to know the mean of each as well as the:", answer("maximum"), answer("standard deviation", correct = TRUE), answer("length")), question("What type of relationships does correlation measure?", answer("linear relationships", correct = TRUE), answer("nonlinear relationships"), answer("both")) )
We'll continue to analyze data from the STAR project, which is a four-year randomized trial on the effectiveness of small class sizes on education performance. The star
data frame as been loaded into your space so that you can play around with it a bit.
Boxplots are useful tools to visualize how the distribution of a continuous variable changes across levels of a categorical variable. There are several ways to specify a boxplot, but the most basic way to construct with tidyverse is as follow:
ggplot(star, aes(y = cont.var, x = cat.var)) + geom_boxplot()
Here in the ggplot
function, the first argument is name of the data frame that contains the two variables of interest, followed by aes
to specify them on the boxplot. Most importantly, we want to specify the function geom_boxplot
to produce a boxplot. Note that you can omit the star
argument in the ggplot
if you pass the data frame using the %>%
operator.
We can start by factorizing the classtype
variable by using the mutate
and the as.factor
functions. Next, pass the expression to the ggplot
function using the %>%
operator. Include informative x-axis and y-axis labels with the labs
function. To fine tune the graph, we can use the scale_x_discrete
function to give the class types more intuitive names.
## factorize the variable `classtype` ## create a box plot with the characteristics specified in the instructions
star %>% mutate(classtype = as.factor(classtype)) %>% ggplot(aes(x = classtype, y = g4math)) + geom_boxplot() + labs(y = "Math Scores", x = "Class Type") + scale_x_discrete(labels = c("Small class", "Regular class", "Regular class with aid"))
Now you'll think more about to measure bivariate relationships---that is, the relationship between two variables. The ggplot
function takes the data
argument followed by aes
to specify the variable-of-interests, more importantly, specify geom_point
to create a scatter plot as such: ggplot(data, aes(x = variable, y = variable)) + geom_point()
. You will use this plot to explore the relationship between math and reading test scores in the star
data.
ggplot
and geom_point
function with g4math
on the x-axis and g4reading
on the y-axis, both from the star
data.## produce a scatterplot of g4math on the x-axis and g4reading on the y-axis
## produce a scatterplot of g4math on the x-axis and g4reading on the y-axis ggplot(star, aes(x = g4math, y = g4reading)) + geom_point()
grade_code("OK, great plot! Let's make it a bit more polished.")
Often we want to plot certain pints differently than others. For instance, maybe you want to see how the relationship between math and reading scores differs between students in small classes versus those not in small classes. To do this, we can subset the target classtypes with filter
then specify colors of the different classtypes with aes
within the geom_point
function.
Before we do so, make sure to factorize the classtype
variable for ggplot
to create the graph.
classtype
to factor by using the as.factor()
functionggplot
function using the %>%
operatorcolor
in the aes nested in geom_point
## tidyverse star %>% mutate(classtype = as.factor(classtype)) %>% filter(classtype %in% c("1", "2")) %>% ggplot(aes(x = g4math, y = g4reading)) + geom_point(aes(color = classtype))
grade_this_code()
The scatter plot is looking very good, but it could use a little bit of polish. Let's add axis labels and a title.
labs
function to make the graph more informative.star %>% mutate(classtype = as.factor(classtype)) %>% filter(classtype %in% c("1", "2")) %>% ggplot(aes(x = g4math, y = g4reading)) + geom_point(aes(color = classtype)) + labs(x = "Fourth Grade Math Scores", y = "Fourth Grade Reading Score", title = "Math vs Reading")
submission_ui
submission_server()
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.