library(gradethis)
library(learnr)
library(qsslearnr)
library(tidyverse)
tutorial_options(exercise.checker = gradethis::grade_learnr)
knitr::opts_chunk$set(echo = FALSE)
tut_reptitle <- "QSS Tutorial 4: Output Report"
data(STAR, package = "qss")
star <- STAR

Conceptual Questions

quiz(caption = "",
     question("In section 3.6 of QSS, what kind of plot are we using to visualize ideological differences between Democrats and Republicans?",
              answer("scatterplot", correct = TRUE),
              answer("bar plot"),
              answer("histogram")),
     question("To estimate the correlation between two variables x and y, we need to know the mean of each as well as the:",
              answer("maximum"),
              answer("standard deviation", correct = TRUE),
              answer("length")),
     question("What type of relationships does correlation measure?",
              answer("linear relationships", correct = TRUE),
              answer("nonlinear relationships"),
              answer("both"))
     )

More visualization

We'll continue to analyze data from the STAR project, which is a four-year randomized trial on the effectiveness of small class sizes on education performance. The star data frame as been loaded into your space so that you can play around with it a bit.

Boxplots

Boxplots are useful tools to visualize how the distribution of a continuous variable changes across levels of a categorical variable. There are several ways to specify a boxplot, but the most basic way to construct with tidyverse is as follow:

ggplot(star, aes(y = cont.var, x = cat.var)) +
  geom_boxplot()

Here in the ggplot function, the first argument is name of the data frame that contains the two variables of interest, followed by aes to specify them on the boxplot. Most importantly, we want to specify the function geom_boxplot to produce a boxplot. Note that you can omit the star argument in the ggplot if you pass the data frame using the %>% operator.

Exercises

We can start by factorizing the classtype variable by using the mutate and the as.factor functions. Next, pass the expression to the ggplot function using the %>% operator. Include informative x-axis and y-axis labels with the labs function. To fine tune the graph, we can use the scale_x_discrete function to give the class types more intuitive names.

## factorize the variable `classtype`


## create a box plot with the characteristics specified in the instructions
star %>% 
  mutate(classtype = as.factor(classtype)) %>%
  ggplot(aes(x = classtype, y = g4math)) + 
  geom_boxplot() +
  labs(y = "Math Scores", x = "Class Type") +
  scale_x_discrete(labels = c("Small class", "Regular class", "Regular class with aid"))

Scatter plots

Now you'll think more about to measure bivariate relationships---that is, the relationship between two variables. The ggplot function takes the data argument followed by aes to specify the variable-of-interests, more importantly, specify geom_point to create a scatter plot as such: ggplot(data, aes(x = variable, y = variable)) + geom_point(). You will use this plot to explore the relationship between math and reading test scores in the star data.

Exercises

## produce a scatterplot of g4math on the x-axis and g4reading on the y-axis
## produce a scatterplot of g4math on the x-axis and g4reading on the y-axis
ggplot(star, aes(x = g4math, y = g4reading)) +
  geom_point()
grade_code("OK, great plot! Let's make it a bit more polished.")

Plotting two sets of points

Often we want to plot certain pints differently than others. For instance, maybe you want to see how the relationship between math and reading scores differs between students in small classes versus those not in small classes. To do this, we can subset the target classtypes with filter then specify colors of the different classtypes with aes within the geom_point function.

Before we do so, make sure to factorize the classtype variable for ggplot to create the graph.

Exercises


## tidyverse
star %>% 
  mutate(classtype = as.factor(classtype)) %>%
  filter(classtype %in% c("1", "2")) %>%
  ggplot(aes(x = g4math, y = g4reading)) + 
  geom_point(aes(color = classtype)) 
grade_this_code()

Finalizing your scatter plot

The scatter plot is looking very good, but it could use a little bit of polish. Let's add axis labels and a title.

Exercises


star %>% 
  mutate(classtype = as.factor(classtype)) %>%
  filter(classtype %in% c("1", "2")) %>%
  ggplot(aes(x = g4math, y = g4reading)) +
  geom_point(aes(color = classtype)) +
  labs(x = "Fourth Grade Math Scores", 
       y = "Fourth Grade Reading Score", 
       title = "Math vs Reading")

Submit

submission_ui
submission_server()


mattblackwell/qsslearnr documentation built on Sept. 17, 2022, 6:25 p.m.