library(gradethis)
library(learnr)
library(qsslearnr)
tutorial_options(exercise.checker = gradethis::grade_learnr)
knitr::opts_chunk$set(echo = FALSE)
tut_reptitle <- "QSS Tutorial 0: Output Report"

Basics of R

R as a calculator

First, we'll learn how to use R as a calculator.
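As a minimal sketch, the lines below use the standard arithmetic operators and the sqrt() function. The numbers are made up for illustration and are not the exercise values; typing any of these lines at the console prints its result.

```r
## illustrative numbers only
10 + 4      # addition
10 - 4      # subtraction
10 * 4      # multiplication
10 / 4      # division
sqrt(25)    # square root
```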


grade_result(
  pass_if(~ identical(.result, 8))
)

grade_result(
  pass_if(~ identical(.result, 2))
)

grade_result(
  pass_if(~ identical(.result, 6/2))
)

grade_result(
  pass_if(~ identical(.result, sqrt(16)), "Now you know how to use R as a calculator.")
)

Storing results

You can save anything in R to an object. This is handy if you want to reuse some calculation later in your session.
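Here is a small sketch of assignment with the <- operator. The object name total is hypothetical and chosen only for illustration.

```r
## store a calculation in an object (the name "total" is made up)
total <- 12 + 30

## typing the object's name prints the stored value
total

## the stored value can be reused in later calculations
total / 2
```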

## assign the difference here
mydiff <- ...

## print the value of mydiff on the next line
grade_result_strict(
  pass_if(~ identical(mydiff, 7)),
  pass_if(~ identical(.result, 7))
)

Characters and strings

A lot of the time we'll work with numbers in R, but we will also want to use a lot of text. This text can be helpful in producing labels for plots or for labeling categorical variables.
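A short sketch of storing text, assuming a hypothetical object called plot.title. Strings are written inside quotation marks, and assigning to the same name again replaces the earlier value.

```r
## save a string (the name "plot.title" is made up)
plot.title <- "World population"
plot.title

## assigning again overwrites the old string
plot.title <- "World population, 1950 to 2010"
plot.title

## class() reports what kind of object this is
class(plot.title)
```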

## save the first string
course <- ...

## overwrite the course variable with the second phrase
course <- ...

## print the value of course on the next line
grade_result(
  pass_if(~ identical(.result, "learning R"))
)

Copying and reassigning variables

When we assign an existing object to a new name, the new name holds its own copy of the value. This is useful when you want to keep a value around, because overwriting an object discards whatever it held before. Here, we are going to learn about creating a copy of an object before overwriting it.
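The idea can be sketched with two hypothetical objects, a and b, which are not part of the exercise below:

```r
## made-up objects for illustration
a <- 3 + 4
b <- a      # b now holds a copy of the value in a

## overwriting a leaves b untouched
a <- 100
a
b
```

After this runs, a holds the new value while b still holds the original one.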

result <- 8 - 2
## First result
result <- 8 - 2

## Assign the value of "result" to "result2"


## Overwrite "result"
result <- ...

## Print result on next line
result2 <- result
grade_result_strict(
  pass_if(~ identical(result2, 8 - 2)),
  pass_if(~ identical(.result, 10 - 2)),
  pass_if(~ identical(result, 10 - 2))
)

Working with Data

Working with real data

Next, we are going to start working with real data: estimates of world population (in thousands). A vector of data called world.pop has been loaded with this lesson. Its first element is the estimate for 1950 and its last element is the estimate for 2010. The vector is created with the c() function, which concatenates multiple values into one vector. We enter the data one value at a time, each separated by a comma.

## create the world.pop data
world.pop <- c(2525779, 3026003, 3691173, 4449049, 5320817, 6127700, 6916183)

## print the world.pop data
world.pop

Indexing and subsetting

Vectors are just a series of objects in R that are all stored together in a specific order. What if we want to access a specific value in the vector? For that, we can use the indexing and subsetting tools in R. Specifically, we will use the square brackets, [ ], to access specific values within the vector.
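A minimal sketch of bracket indexing, using a made-up vector called scores rather than the lesson data:

```r
## hypothetical vector
scores <- c(90, 85, 70, 60)

## a single position
scores[2]

## several positions at once, given as a vector of positions
scores[c(1, 3)]

## a run of consecutive positions
scores[2:4]
```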

world.pop <- c(2525779, 3026003, 3691173, 4449049, 5320817, 6127700, 6916183)
## access and print the 4th value of world.pop
grade_result(
  pass_if(~ identical(.result, world.pop[4]))
)
## access and print the 1st and 4th value of world.pop
grade_result(
  pass_if(~ identical(.result, world.pop[c(1,4)]))
)

Using functions

We will use functions constantly in R. Functions are the bread and butter of R: they allow us to act on or get information about vectors and other objects. For instance, functions such as length(), min(), and mean() are useful for almost any vector.
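As a minimal sketch on a made-up vector x (not part of the lesson data), a few base R summary functions look like this:

```r
## hypothetical vector
x <- c(4, 8, 15, 16, 23, 42)

length(x)   # number of entries
min(x)      # smallest value
max(x)      # largest value
mean(x)     # average
sum(x)      # total
```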

Instructions

## calculate the number of entries in world.pop
length(world.pop)
grade_code()
## calculate the minimum value of world.pop
min(world.pop)
grade_code()
## calculate the average value of world.pop
mean(world.pop)
grade_code()

Creating and using sequences

Creating vectors using the c() command can be cumbersome and time consuming. Sometimes we can create vectors much more quickly. One place where we can do this is with sequences of numbers that follow a pre-specified pattern. In that case, we can use the seq() function. This function most commonly takes three arguments: from (where the sequence starts), to (where it ends), and by (the step between values).

We're going to use this to create labels for the world.pop vector. We can assign these labels using the names() function.
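A short sketch of both steps, assuming two hypothetical objects, evens and counts, that stand in for the lesson's vectors:

```r
## sequence from 2 to 10 in steps of 2
evens <- seq(from = 2, to = 10, by = 2)
evens

## made-up vector to label
counts <- c(11, 22, 33, 44, 55)

## attach the sequence as labels; names are stored as text
names(counts) <- evens
counts
names(counts)
```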

Instructions

## create a vector with a sequence from 1950 to 2010 by 10
year <- ...

names(world.pop) <- ...

world.pop
grade_result_strict(
  pass_if(~ identical(year, seq(1950, 2010, by = 10))),
  pass_if(~ identical(.result, c("1950" = 2525779, "1960" = 3026003, "1970" = 3691173, "1980" = 4449049, "1990" = 5320817, "2000" = 6127700, "2010" = 6916183)))
)

Replacing values in a vector

Indexing and subsetting allow you to access specific values in the vector, but you can also use the same syntax to replace certain values in the vector. That is, we can assign a value such as x[4] <- 50, which would replace the fourth entry in the x vector with the number 50.

For example, suppose that your research assistant came running in to tell you that the earliest world population data was actually from 1945, not 1950. Here, you will fix this in your vector.
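The replacement syntax can be sketched on a made-up vector x (the values are illustrative only):

```r
## hypothetical vector
x <- c(10, 20, 30, 40, 50)

## replace the fourth entry
x[4] <- 50
x

## replace several entries at once
x[c(1, 2)] <- c(0, 5)
x
```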

Instructions

year <- seq(from=1950, to=2010, by = 10)

## update the first entry of year


## print the modified year vector
grade_result(
  pass_if(~ identical(year[1], 1945))
)

Arithmetic with vectors

What if we wanted our data in millions of people? How would we create this vector from the vector that we have? (Recall that world.pop is currently in units of thousands of people.) One way would be to do this manually: create a new vector using c() that concatenates the world population in millions of people rather than thousands of people. But this is cumbersome. Can't we use the vector we already have? Yes!

We can apply many types of arithmetic operators such as addition, subtraction, multiplication, and division to our vector. For example, the code x + 5 will add the number 5 to each value in the vector. In this exercise, we will create a new vector that is the world population in millions of people, which is just the total population in thousands divided by 1000.
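A minimal sketch of elementwise arithmetic, assuming a hypothetical vector of distances in meters rather than the population data:

```r
## made-up vector of distances in meters
meters <- c(1500, 3200, 10000)

## the operation is applied to every element
meters + 5

## convert to kilometers by dividing each element by 1000
km <- meters / 1000
km
```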

Exercise

## create the pop.million variable.
pop.million <- ...

## print out the pop.million variable
grade_result(
  pass_if(~ identical(pop.million, world.pop / 1000))
)

Working with a data.frame

A data.frame is an object in R that is basically like a spreadsheet with some number of rows (units) and some number of columns (variables) and a name for each column. There are a number of ways to interact with a data.frame to get useful information about it. For example, given a data.frame called mydata, names(mydata) returns the column names, dim(mydata) returns the number of rows and columns, and summary(mydata) summarizes each column.

These are super useful functions. Let's use some of these on a data frame, UNpop, which has the same information as the world.pop vector, but stored as a data frame.
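A minimal sketch, assuming a small made-up data frame named mydata (the columns id and height are hypothetical):

```r
## hypothetical data frame with 3 rows and 2 columns
mydata <- data.frame(
  id = 1:3,
  height = c(160, 172, 181)
)

names(mydata)     # column names
nrow(mydata)      # number of rows
ncol(mydata)      # number of columns
dim(mydata)       # rows and columns together
summary(mydata)   # summary of each column
```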

Exercise

UNpop <- data.frame(
  year = seq(1950, 2010, by = 10),
  world.pop = c(2525779, 3026003, 3691173, 4449049, 5320817, 6127700, 6916183)
)
## Print the UNpop data frame
UNpop
grade_code()
## Print the variable names of UNpop
names(UNpop)
grade_code()
## Print the dimensions of UNpop
dim(UNpop)
grade_code()
## Print a summary of the data in UNpop
summary(UNpop)
grade_code()

Subsetting a data frame (I)

You'll often need to access different parts of a data frame to use in other commands. For instance, maybe you want to take the mean of a column of the data frame or maybe you want to see all of the data for the 4th unit. Either way, we'll need to know how to subset the data frame. To select a particular variable from the data frame, you can use the $ operator. So mydata$myvar will be a vector of just the myvar column of the mydata data frame.
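A short sketch of the $ operator, assuming a made-up data frame mydata with hypothetical columns myvar and other:

```r
## hypothetical data frame
mydata <- data.frame(
  myvar = c(2, 4, 6),
  other = c("a", "b", "c")
)

## pull out one column as a vector
mydata$myvar

## the result can be passed straight to other functions
mean(mydata$myvar)
```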

Exercise

## print out the world.pop variable using $
UNpop$world.pop
grade_code()
## calculate the mean world population over this time period
mean(UNpop$world.pop)
grade_code()

Subsetting a data frame (II)

You can also use the brackets [ ] to subset from the data frame. But how will R know if you want to subset the rows or the columns? With a data frame, as opposed to a vector, you will use a comma inside the brackets, which take the form [rows, columns]: the expression before the comma selects the rows and the expression after the comma selects the columns.
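The [rows, columns] form can be sketched on a made-up data frame mydata (the columns myvar and other are hypothetical). Leaving one side of the comma blank selects everything along that dimension.

```r
## hypothetical data frame
mydata <- data.frame(
  myvar = c(2, 4, 6, 8),
  other = c("a", "b", "c", "d")
)

## row 2, all columns
mydata[2, ]

## all rows, one column
mydata[, "myvar"]

## rows 1 through 2 of one column
mydata[1:2, "other"]
```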

Exercise

## use brackets to print out the world.pop variable
UNpop[, "world.pop"]
grade_code()
## extract rows 5 through 7 and all variables
UNpop[5:7, ]
grade_code()
## extract values 5 through 7 of the world.pop variable
UNpop[5:7, "world.pop"]
grade_code()

Submit

submission_ui
submission_server()


mattblackwell/qsslearnr documentation built on Sept. 17, 2022, 6:25 p.m.