library(gradethis) library(learnr) library(qsslearnr) tutorial_options(exercise.checker = gradethis::grade_learnr) knitr::opts_chunk$set(echo = FALSE) tut_reptitle <- "QSS Tutorial 0: Output Report"
First, we'll learn how to use R as a calculator.
+
sign to add 5 and 3grade_result( pass_if(~ identical(.result, 8)) )
-
sign to subtract 3 from 5grade_result( pass_if(~ identical(.result, 2)) )
/
to divide 6 by 2grade_result( pass_if(~ identical(.result, 6/2)) )
sqrt()
function to take the square root of 16grade_result( pass_if(~ identical(.result, sqrt(16)), "Now you know how to use R as a calculator.") )
You can save anything in R to an object. This is handy if you want to reuse some calculation later in your session.
...
with the difference 8-1
to save it an object called mydiff
.mydiff
to the console by typing it on its own line.## assign the difference here mydiff <- ... ## print the value of mydiff on the next line
grade_result_strict( pass_if(~ identical(mydiff, 7)), pass_if(~ identical(.result, 7)) )
A lot of the time we'll work with numbers in R, but we will also want to use a lot of text. This text can be helpful in producing labels for plots or for labeling categorical variables.
"social science"
to the variable course
(replace the first ...
)."learning R"
to the same variable course
(replace the second ...
).course
to the console. What do you think it will say?## save the first string course <- ... ## overwrite the course variable with the second phrase course <- ... ## print the value of course on the next line
grade_result( pass_if(~ identical(.result, "learning R")) )
When we assign an existing object to a new name we always make a copy of it. This can be useful when you want it, but it also means you can lose what's in your object if you accidentally overwrite it. Here, we are going to learn about creating a copy of an object before overwriting it.
result
to result2
result
with 10 - 2
result <- 8 - 2
## First result result <- 8 - 2 ## Assign the value of "result" to "result2" ## Overwrite "result" result <- ... ## Print result on next line
result2 <- result
grade_result_strict( pass_if(~ identical(result2, 8 - 2)), pass_if(~ identical(.result, 10 - 2)), pass_if(~ identical(result, 10 - 2)) )
Next, we are going to start working with real data: estimates of world population (in thousands). A vector of data called world.pop
has been loaded with this lesson. The first element is for the year 1950 up to the last element for 2010. You can see that we create the vector by using the c()
function which concatenates multiple values together into one vector. We enter the data one value at a time, each separated by a comma.
world.pop
data by simply typing it into a line of code.## create the world.pop data world.pop <- c(2525779, 3026003, 3691173, 4449049, 5320817, 6127700, 6916183) ## print the world.pop data
world.pop
Vectors are just a series of objects in R that are all stored together in a specific order. What if we want to access a specific value in the vector? Well, for that, we can use the indexing and subsetting tools in R. Specifically, we will use the the square brackets, [ ]
to access specific values within the vector.
world.pop
data.world.pop <- c(2525779, 3026003, 3691173, 4449049, 5320817, 6127700, 6916183)
## access and print the 4th value of world.pop
grade_result( pass_if(~ identical(.result, world.pop[4])) )
world.pop
data. You will need to use the c()
function to create a vector of indices that you want to access.## access and print the 1st and 4th value of world.pop
grade_result( pass_if(~ identical(.result, world.pop[c(1,4)])) )
One way we will use R a ton is through functions. Functions are the bread and butter of R. They allow us to act on or get information about vectors and other objects. For instance, the following functions are pretty useful for any vector:
length(x)
calculates the number of elements/entries in the x
vector.min(x)
calculates the smallest value in the x
vector.max(x)
calculates the largest value in the x
vector .mean(x)
calculates the average value in the x
vector (that is the sum of the entries divided by the number of entries).sum(x)
calculates the sum of the values in the x
vector.world.pop
vector.## calculate the number of entries in world.pop
length(world.pop)
grade_code()
## calculate the minimum value of world.pop
min(world.pop)
grade_code()
## calculate the average value of world.pop
mean(world.pop)
grade_code()
Creating vectors using the c()
command can be cumbersome and time consuming. Sometimes we can create vectors much more quickly. One place where we can do this is with sequences of numbers that follow a pre-specified pattern. In that case, we can use the seq()
function. This function most commonly takes three arguments:
from
- the first number in the sequence.to
- the last number in the sequenceby
- the increments between each value in the sequence.We're going to use this to create a label for the world.pop
vector. We can assign these labels using the names()
function.
year
that is a sequence from 1950 to 2010 that increases in increments of 10 years and print it.year
vector to the names(world.pop)
to set the names of the world.pop
vector.## create a vector with a sequence from 1950 to 2010 by 10 year <- ... names(world.pop) <- ... world.pop
grade_result_strict( pass_if(~ identical(year, seq(1950, 2010, by = 10))), pass_if(~ identical(.result, c("1950" = 2525779, "1960" = 3026003, "1970" = 3691173, "1980" = 4449049, "1990" = 5320817, "2000" = 6127700, "2010" = 6916183))) )
Indexing and subsetting allow you to access specific values in the vector, but you can also use the same syntax to replace certain values in the vector. That is we can assign a value such as x[4] <- 50
, which would replace the fourth entry in the x
vector with the number 50.
For example, suppose that your research assistant came running in to tell you that the earliest world population data was actually from 1945, not 1950. Here, you will fix this in your vector.
year
vector with 1945.year <- seq(from=1950, to=2010, by = 10) ## update the first entry of year ## print the modified year vector
grade_result( pass_if(~ identical(year[1], 1945)) )
What if we wanted our data in millions of people? How would we create this vector from the vector that we have? (Recall that world.pop is currently in units of thousands of people.) One way would be to do this manually---create a new vector using c()
that concatenates the world population in millions of people rather than thousands of people. But this is cumbersome, can't we use the vector we already have? Yes!
We can apply many types of arithmetic operators such as addition, subtraction, multiplication, and division to our vector. For example, the code x + 5
will add the number 5 to each value in the vector. In this exercise, we will create a new vector that is the world population in millions of people, which is just the total population in thousands divided by 1000.
world.pop
vector by 1000 and assign it to a new vector called pop.million
.pop.million
variable.## create the pop.million variable. pop.million <- ... ## print out the pop.million variable
grade_result( pass_if(~ identical(pop.million, world.pop / 1000)) )
A data.frame
is an object in R that is basically like a spreadsheet with some number of rows (units) and some number of columns (variables) and a name for each column. There are a number of ways to interact with a data.frame
to get useful information about it. For example, if I have a data.frame
called mydata
, I can do the following:
names(mydata)
- returns the column (variable) names of the data.ncol(mydata)
- returns the number of columns in the data.nrow(mydata)
- returns the number of rows of the data.dim(mydata)
- returns a vector of the number of rows and the number of columns (the dimension of the data).summary(mydata)
- provides a summary of each variable in the data.These are super useful functions Let's use some of these on a data frame, UNpop
, which has the same information as the world.pop
vector, but stored as a data frame.
UNpop <- data.frame( year = seq(1950, 2010, by = 10), world.pop = c(2525779, 3026003, 3691173, 4449049, 5320817, 6127700, 6916183) )
UNpop
data frame.## Print the UNpop data frame
## Print the UNpop data frame
UNpop
grade_code()
UNpop
data frame.## Print the variable names of UNpop
names(UNpop)
grade_code()
dim
function report the number of rows and columns of the data frame.## Print the dimensions of UNpop
## Print the UNpop data frame dim(UNpop)
grade_code()
summary
function to show a summary of each variable.## Print a summary of the data in UNpop
summary(UNpop)
grade_code()
You'll often need to access different parts of a data frame to use in other commands. For instance, maybe you want to take the mean of a column of the data frame or maybe you want to see all of the data for the 4th unit. Either way, we'll need to know how to subset the data frame. To select a particular variable from the data frame, you can use the $
operator. So mydata$myvar
will be a vector of just the myvar
column of the mydata
data frame.
$
to print out the world.pop
variable from the UNpop
data frame.## print out the world.pop variable using $
UNpop$world.pop
grade_code()
$
operator, calculate the mean of the world.pop
variable.## calculate the mean world population over this time period
mean(UNpop$world.pop)
grade_code()
You can also use the brackets [ ]
to subset from the data frame. But how will R know if you want to subset the rows or the columns? With a data frame as opposed to a vector, you will use a comma and the bracket will have the following form: [rows, columns]
where the expression before the comma will select the rows and the expression after the comma will select the columns.
mydata[,"myvar"]
will select the myvar
column from mydata
mydata[1,]
will select the first row of mydata
mydata[c(1,2,3),]
will select the first three rows of mydata
mydata[1:3, "myvar"]
will select the first three values of the myvar
variable of mydata
world.pop
variable from the UNpop
data frame.## use brackets to print out the world.pop variable
UNpop[, "world.pop"]
grade_code()
UNpop
data frame.## extract rows 5 through 7 and all variables
UNpop[5:7, ]
grade_code()
world.pop
variable of the UNpop
data frame.## extract values 5 through 7 of the world.pop variable
UNpop[5:7, "world.pop"]
grade_code()
submission_ui
submission_server()
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.