library(gradethis)
library(learnr)
library(qsslearnr)
tutorial_options(exercise.checker = gradethis::grade_learnr)
knitr::opts_chunk$set(echo = FALSE)
tut_reptitle <- "QSS Tutorial 1: Output Report"
data(resume, package = "qss")

Conceptual Questions

quiz(
  caption = "",
  question(
    "Suppose a variable is binary, that is, it takes on values of either 0 or 1 (for example, female gender). Which of the following is the same as its sample mean?",
    answer("the sample median"),
    answer("the sample proportion of 1s", correct = TRUE),
    answer("neither of these")
  ),
  question(
    "What kind of value is `FALSE`?",
    answer("character"),
    answer("logical", correct = TRUE),
    answer("binary"),
    answer("numeric")
  ),
  question(
    "In order to calculate the mean of a variable we have used the `length()` function in the denominator. The `length()` of a vector is equivalent to:",
    answer("the number of elements", correct = TRUE),
    answer("the height"),
    answer("the maximum")
  ),
  question(
    "How are factor variables different from categorical variables?",
    answer("They are the same", correct = TRUE),
    answer("Factor variables contain numeric values"),
    answer("Categorical variables tend to have more levels or categories")
  )
)

Working with Logicals in R

Exploring the resume data

In this tutorial, we are going to be working with the resume data from Section 2.1 of QSS. These data come from an experiment in which researchers sent fictitious resumes with names that implied different race and gender combinations, to see whether potential employers were more likely to call back applicants whose names were associated with particular racial groups and genders.

Let's first explore the data a bit. It's stored as resume.

Exercise

## print the first 6 lines of the data
head(resume)
grade_code()

dim(resume)
grade_code()

summary(resume)
grade_code()

Creating a cross tab

To help you analyze these data, you can use a cross tabulation. A cross tabulation (or contingency table) is a table that quickly summarizes categorical data by counting how many observations fall into each combination of categories. For instance, in the resume data, we have a sex variable that tells us whether the fictitious resume had a male or a female name.
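As a minimal sketch of how a cross tabulation works, the example below builds a small, made-up data frame (`toy`, with the hypothetical variables `group` and `answer`; it is not part of the resume data) and counts each combination of values with `table()`.

```r
## a small, made-up data frame (hypothetical, for illustration only)
toy <- data.frame(
  group  = c("a", "a", "b", "b", "b"),
  answer = c("yes", "no", "yes", "yes", "no")
)

## rows are the values of group, columns are the values of answer
toy.tab <- table(toy$group, toy$answer)
toy.tab
```

Each cell counts the rows with that combination of values, so the cells add up to the number of rows in the data frame.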

Exercise


grade_result(
  pass_if(~ identical(.result, table(resume$sex, resume$call))),
  pass_if(~ identical(.result, table(resume$call, resume$sex)))
)

Logical values

Pretty soon, you'll be doing more complicated subsetting in R. To do this, it's helpful to understand a special type of object in R: the logical. There are two values associated with this type of object: TRUE and FALSE (where the uppercase is very important).
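A short sketch of how logical values behave, using a made-up vector called `flags` (a hypothetical name chosen for this example):

```r
## TRUE and FALSE are logical values
class(TRUE)
is.logical(FALSE)

## a made-up logical vector (hypothetical, for illustration only)
flags <- c(TRUE, FALSE, TRUE)

## in arithmetic, TRUE is treated as 1 and FALSE as 0
flags.num <- as.numeric(flags)
flags.num
```

Lowercase `true` is not a logical value. R reads it as the name of an object and reports an error if no such object exists.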

Exercises

## create a vector with two TRUE values and two FALSE values
x <-

## take the sum of this vector
x <- c()

sum(x)
grade_result_strict(
  pass_if(~ identical(x, c(TRUE, TRUE, FALSE, FALSE))),
  pass_if(~ identical(.result, 2L))
)
## create a vector with one TRUE value and three FALSE values
z <-

## take the mean of this vector
z <- c()

mean(z)
grade_result_strict(
  pass_if(~ identical(z, c(TRUE, FALSE, FALSE, FALSE))),
  pass_if(~ identical(.result, 0.25))
)

Comparing logicals

We often combine logical statements using AND (&) and OR (|) in R. For AND statements, both expressions have to be true for the whole expression to be true:
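A minimal sketch of the possible cases for AND, using literal values (the name `and.results` is chosen only for this example):

```r
## AND is TRUE only when both sides are TRUE
TRUE & TRUE
TRUE & FALSE
FALSE & FALSE

## the same three cases collected in one vector
and.results <- c(TRUE & TRUE, TRUE & FALSE, FALSE & FALSE)
and.results
```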

For OR statements, either statement being true makes the whole expression true:
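A minimal sketch of the possible cases for OR, using literal values (the name `or.results` is chosen only for this example):

```r
## OR is TRUE when at least one side is TRUE
TRUE | TRUE
TRUE | FALSE
FALSE | FALSE

## the same three cases collected in one vector
or.results <- c(TRUE | TRUE, TRUE | FALSE, FALSE | FALSE)
or.results
```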

question("What does the expression `(TRUE | FALSE) & TRUE` evaluate to?",
  answer("`TRUE`", correct = TRUE),
  answer("`FALSE`"),
  answer("`NA`")
)

Comparing objects

There are several relational operators that allow us to compare objects in R. The most useful of these are the following:
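As a sketch, here are R's standard relational operators applied to arbitrary example numbers, with each result collected in a vector named `comparisons` (a hypothetical name used only here):

```r
comparisons <- c(
  5 > 3,    ## greater than
  5 >= 5,   ## greater than or equal to
  5 < 3,    ## less than
  5 <= 3,   ## less than or equal to
  5 == 5,   ## equal to (note the two equals signs)
  5 != 3    ## not equal to
)
comparisons
```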

When we use these to compare two objects in R, we end up with a logical object. You can also compare a vector to a particular number, in which case R compares each element of the vector to that number.
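A brief sketch of comparing a vector to a single number, using a made-up vector `ages` (hypothetical values, not from the resume data):

```r
## an example vector (arbitrary values)
ages <- c(15, 22, 37, 18)

## compare every element to a single number
adult <- ages >= 18
adult

## the result is a logical vector of the same length
class(adult)
length(adult)
```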

Exercises


10 > 5
grade_code()
## x vector
x <- c(-2, -1, 0, 1, 2)

## test which values of x are greater than or equal to 0
grade_result(
  fail_if(~ identical(.result, x > 0),
          "Did you forget the 'or equal to' part of the comparison?"),
  pass_if(~ identical(.result, x >= 0))
)

Subsets in R

Subsetting a data frame

You can use the same logical statements you have been using to create subsets of a data frame. These can often be helpful because we'll want to calculate various quantities of interest for different subsets of the data. For this exercise, we will use the resume data frame made up of the variables firstname, sex, race, and call. As a reminder, here is what the data look like:

resume
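To illustrate the idea on something smaller, the sketch below subsets a made-up data frame called `pets` (hypothetical data and variable names, unrelated to the resume experiment). It uses `subset()` with a logical statement, and then shows that the same logical statement can be placed inside square brackets as an alternative.

```r
## a small, made-up data frame (hypothetical, not the resume data)
pets <- data.frame(
  species = c("cat", "dog", "cat", "dog"),
  color   = c("black", "black", "white", "white"),
  weight  = c(4, 20, 5, 30)
)

## keep only the rows where both conditions are TRUE
pets.bc <- subset(pets, subset = (species == "cat" & color == "black"))
pets.bc

## alternative: the same logical statement inside square brackets
pets[pets$species == "cat" & pets$color == "black", ]
```

Once a subset is stored as its own object, functions such as `mean()` can be applied to its columns, for example `mean(pets.bc$weight)`.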

Exercise

## create the subset for white female names and
## assign it to resume.wf
resume.wf <- ...

## print the first 6 lines of the subset


## calculate the mean of the callback variable (call)
resume.wf <- subset(resume, subset = (race == "white" & sex == "female"))
head(resume.wf)
mean(resume.wf$call)
grade_result(
  pass_if(~ identical(.result, mean(subset(resume, subset = (race == "white" & sex == "female"))$call)))
)

Comparing means across treatment conditions

You can use the same ideas as in the last step to create a different subset of the data corresponding to black-sounding female names. Then, you can compare the average callback rate for the white-female names to the average callback rate for the black-female names. This will give you a sense of how the employer callback rate varies by the racial group of the applicant among female names.
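The comparison can be sketched on made-up data first. The data frame `trial` below, with its variables `group` and `success`, is hypothetical and only mimics the structure of a 0/1 outcome measured in two groups.

```r
## made-up outcome data (hypothetical): 1 = success, 0 = failure
trial <- data.frame(
  group   = c("a", "a", "a", "a", "b", "b", "b", "b"),
  success = c(1, 1, 0, 1, 0, 1, 0, 0)
)

## one subset per group
trial.a <- subset(trial, subset = (group == "a"))
trial.b <- subset(trial, subset = (group == "b"))

## the mean of a 0/1 variable is the proportion of 1s
mean(trial.a$success)
mean(trial.b$success)

## difference in success rates between the two groups
rate.diff <- mean(trial.a$success) - mean(trial.b$success)
rate.diff
```

A positive difference means the first group has the higher rate, and a negative difference means the second group does.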

Exercise

## create the subset for white female names
resume.wf <- subset(resume, subset = (race == "white" & sex == "female"))

## create the subset for black female names
resume.bf <-

## calculate the difference in callback means
## create the subset for white female names
resume.wf <- subset(resume, subset = (race == "white" & sex == "female"))

## create the subset for black female names
resume.bf <- subset(resume, subset = (race == "black" & sex == "female"))

## compare the difference in means
mean(resume.wf$call) - mean(resume.bf$call)
grade_code("You just analyzed an experiment! Way to go!")

Submit

submission_ui
submission_server()


mattblackwell/qsslearnr documentation built on Sept. 17, 2022, 6:25 p.m.