library(learnr)
library(testwhat)

knitr::opts_chunk$set(echo = FALSE, warning = FALSE, message = FALSE)
tutorial_options(exercise.checker = testwhat_learnr)

df <- POL306::state_cig

A Note on Data

Throughout this tutorial we will use a dataset covering healthcare spending, cigarette taxes, and a variety of political variables. The unit of analysis is the state-year, and the data cover 2001 to 2014. The variables are:

ggplot

In this exercise we are going to use the package ggplot2 to make plots. This library is widely used, and its extensive customization options make it powerful. It also builds plots in a specific way: you add functions together with +.

Every ggplot2 plot starts with the function ggplot(), which takes the data you are going to use in its data= argument.
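
As a rough illustration of what ggplot() does on its own, the sketch below uses R's built-in mtcars data as a stand-in for the tutorial's df (an assumption made only so the example runs without the POL306 package). It assumes ggplot2 is installed.

```r
library(ggplot2)

# mtcars ships with R; it stands in for the tutorial's df here
p <- ggplot(data = mtcars)

# No geom_*() layer has been added yet, so the plot has no layers
length(p$layers)

# Printing the object draws an empty panel
p
```

The data is attached to the plot, but nothing is drawn until a geom_*() layer is added.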

Instructions:

Start by loading the ggplot2 library with library(ggplot2) and then running the initial function ggplot(data=df) (df is the data we are using here).


df <- POL306::state_cig ### Ignore this
library(ggplot2)
ggplot(data=df)
ex() %>% check_function("ggplot") %>% {
  check_arg(., "data")
}

Histogram

You successfully made a grey box! Now let's start filling that grey box with a plot. You add geom_*() functions to build your plot, and in them you tell R which variables to use. You do that by mapping aesthetics to variables in your data.

For example, a histogram is geom_histogram(), and if you want a histogram with the cigarette tax on the x axis you'd write: geom_histogram(mapping=aes(x=cig_tax)). Note that we are passing a set of aesthetics (aes()) to the mapping= argument of geom_histogram().

Putting this together with what we had before gives:

ggplot(data=df) + geom_histogram(mapping=aes(x=cig_tax))
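
One way to see that the plot really is built by adding pieces is to store each step in its own object. This is a minimal sketch using the built-in mtcars data and its mpg column in place of df and cig_tax; the object names base and hist_plot are made up for the example.

```r
library(ggplot2)

# Step 1: the empty plot, data only
base <- ggplot(data = mtcars)

# Step 2: add a histogram layer that maps mpg to the x axis
hist_plot <- base + geom_histogram(mapping = aes(x = mpg))

length(base$layers)       # no layers before adding the geom
length(hist_plot$layers)  # one layer after adding it

hist_plot
```

Because + returns a new plot, base is left unchanged and can be reused with a different geom.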

Instructions:

Create a histogram of the percentage of female legislators in a state (female_leg).

library(ggplot2)
df <- POL306::state_cig ### Ignore this

library(ggplot2)
ggplot(data=df) + geom_histogram(mapping=aes(x=female_leg))
ex() %>% {
  check_function(., "ggplot") %>% check_arg("data") %>% check_equal()
  check_function(., "aes") %>% {
    check_arg(., "x") %>% check_equal(eval = FALSE) 
  }
  check_function(., "geom_histogram")
}

Density Plots

A powerful feature of ggplot2 is that you can easily switch how your data is displayed by changing the geom_*() function. Along with geom_histogram() there is a geom_density() that creates a density plot.
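
The swap can be sketched as follows, again with the built-in mtcars data standing in for df. The names x_map, as_histogram, and as_density are hypothetical; the point is only that the data and the mapping stay the same while the geom changes.

```r
library(ggplot2)

base  <- ggplot(data = mtcars)
x_map <- aes(x = mpg)  # the same aesthetic mapping is reused for both plots

as_histogram <- base + geom_histogram(mapping = x_map)
as_density   <- base + geom_density(mapping = x_map)

as_density
```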

Instructions:

Switch your histogram of percentage of female legislators into a density plot.

library(ggplot2)
df <- POL306::state_cig ### Ignore this

library(ggplot2)
ggplot(data=df) + geom_density(mapping=aes(x=female_leg))
ex() %>% {
  check_function(., "ggplot") %>% check_arg("data") %>% check_equal()
  check_function(., "aes") %>% {
    check_arg(., "x") %>% check_equal(eval = FALSE) 
  }
  check_function(., "geom_density")
}

Customizing Labels and Titles

To create titles and labels for your plot, use the labs() function, which takes arguments for the x axis (x=), the y axis (y=), and the main title (title=). For example, to set the main title you would add labs(title="Histogram of Variable"). Be sure to put the text you want to display in quotation marks, and separate the arguments with commas if you use more than one.
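
A small sketch of labs() with all three arguments at once, using the built-in mtcars data rather than the tutorial's df. The label text is invented for this example.

```r
library(ggplot2)

labelled <- ggplot(data = mtcars) +
  geom_histogram(mapping = aes(x = mpg)) +
  # Each label is a quoted string; arguments are separated by commas
  labs(x = "Miles per gallon",
       y = "Number of cars",
       title = "Histogram of Fuel Economy")

labelled
```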

Instructions:

Provide an x and y label and a title for the histogram of female legislators.

library(ggplot2)
df <- POL306::state_cig ### Ignore this

library(ggplot2)
ggplot(data=df) + geom_histogram(mapping=aes(x=female_leg)) + 
  labs(y="Frequency", x="Percent Female", title="Histogram of Percent Female Legislators")
ex() %>% {
  check_function(., "ggplot") %>% check_arg("data") %>% check_equal()
  check_function(., "aes") %>% {
    check_arg(., "x") %>% check_equal(eval = FALSE) 
  }
  check_function(., "geom_histogram") 
  check_function(., "labs") %>%{
    check_arg(., "x")
    check_arg(., "y")
    check_arg(., "title")
  }
}

Scatter Plots

Making a scatter plot in ggplot2 is relatively easy as well. You just need to switch to geom_point() (because you are drawing points) and add a y variable along with your x variable (something like aes(x=var1, y=var2)).
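
For a concrete version of the aes(x=var1, y=var2) pattern, here is a sketch with the built-in mtcars data, where wt (car weight) and mpg play the roles of var1 and var2. These columns are stand-ins chosen for the example, not variables from the tutorial's dataset.

```r
library(ggplot2)

scatter <- ggplot(data = mtcars) +
  # wt goes on the horizontal axis, mpg on the vertical axis
  geom_point(mapping = aes(x = wt, y = mpg))

scatter
```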

Instructions:

Make a scatter plot with cig_tax on the y axis and female_leg on the x axis. Be sure to set labels for the x and y axes along with a title for the plot.

library(ggplot2)
df <- POL306::state_cig ### Ignore this

ggplot(data=df) + geom_point(mapping=aes(x=female_leg, y=cig_tax)) + 
  labs(y="Cigarette Tax", x="Percent Female", 
       title="Cig Tax vs Female Legislature")
ex() %>% {
  check_function(., "ggplot") %>% check_arg("data") %>% check_equal()
  check_function(., "geom_point")
  check_function(., "aes") %>% {
    check_arg(., "x") %>% check_equal(eval = FALSE) 
    check_arg(., "y") %>% check_equal(eval = FALSE) 
  }
  check_function(., "labs") %>%{
    check_arg(., "x")
    check_arg(., "y")
    check_arg(., "title")
  }
}

Other Aesthetics

ggplot2 offers a wide variety of ways to customize a plot. You can use them either to show more information about your observations or just to make things look pretty.

For example, you can change the color of the points using the color= argument. If you set it inside the aes() function and assign it a variable, the points are colored based on that variable. If you instead set it outside aes() but inside geom_point(), you can give it a fixed color such as color="green" to make every point green. Concretely, the difference looks like:

### Color the dots based on how liberal a state is
ggplot(data=df) + 
  geom_point(mapping=aes(x=female_leg, y=cig_tax, 
                          color=liberal)) 
### Color all the dots red
ggplot(data=df) + 
  geom_point(mapping=aes(x=female_leg, y=cig_tax), 
                          color="red") 
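
A common slip is to put a fixed color inside aes(). The sketch below, which uses the built-in mtcars data instead of df, compares the two placements with ggplot2's layer_data() helper, which returns the values a layer actually draws. The object names fixed and mapped are made up for the example.

```r
library(ggplot2)

# Constant outside aes(): every point is drawn in red, with no legend
fixed <- ggplot(data = mtcars) +
  geom_point(mapping = aes(x = wt, y = mpg), color = "red")

# Constant inside aes(): "red" is treated as data (a one-level category),
# so ggplot2 picks a colour from its default palette and adds a legend
mapped <- ggplot(data = mtcars) +
  geom_point(mapping = aes(x = wt, y = mpg, color = "red"))

# Compare the colours that each layer ends up drawing
unique(layer_data(fixed)$colour)
unique(layer_data(mapped)$colour)
```

If a plot shows an unexpected legend entry named after a color, this placement is a likely cause.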

Along with color you can set:

And more

Instructions:

Start with your scatter plot with cig_tax on the y axis and female_leg on the x axis. Then map color to the year variable, and make all the dots a little transparent by setting alpha=0.5.

library(ggplot2)
ggplot(data=df) + geom_point(mapping=aes(x=female_leg, y=cig_tax))
df <- POL306::state_cig ### Ignore this

ggplot(data=df) + geom_point(mapping=aes(x=female_leg, y=cig_tax, 
                                         color=year), alpha=0.5)  
ex() %>% {
  check_function(., "ggplot") %>% check_arg("data") %>% check_equal()
  check_function(., "geom_point") %>% check_arg("alpha") %>% check_equal()
  check_function(., "aes") %>% {
    check_arg(., "x") %>% check_equal(eval = FALSE) 
    check_arg(., "y") %>% check_equal(eval = FALSE) 
    check_arg(., "color") %>% check_equal(eval = FALSE) 
  }
}


reuning/POL306 documentation built on Sept. 19, 2023, 10:47 a.m.