require(learnr)
library(printr)
knitr::opts_chunk$set(echo = TRUE)

Basic Operations in R

Introduction

The aim of the lab session is to teach you the basics of the R programming language. We will follow the first chapters of Hands-On Programming with R by Garret Grolemund, a free and open-source resource.

Those who may be interested in learning more advanced topics are invited to consult Advanced R, by Hadley Wickham (another free and open-source resource) or the official R documentation.

They see me rolling...

To introduce you to the language, we will develop a small project: we want to simulate the rolling of two dice.

The first thing we have to ask ourselves is therefore how to represent a die. We know for a fact that a common die is made of six faces. Each one of them has a number, going from $1$ to $6$.

We need to understand how to tell R that we wish to save some numbers. The first thing we notice is that we can directly type into the R console what we need.

*"Your computer does your bidding when you type R commands at the prompt in the bottom line of the console pane. Don’t forget to hit the Enter key. When you first open RStudio, the console appears in the pane on your left, but you can change this with File > Preferences in the menu bar."* (taken from Hands-On Programming with R)

We can try to type some basic operations as

2 + 2

You will notice the [1] that appears before you see the result. It is a way for R to tell what element of the solution is displayed. In the case of a single number, we only have one element.

Exercise 1

Type in the R console some basic operations and observe the results

Write the R code required to add two plus two:


Longer results

This change whenever we have, for instance, a vector as

91:115

Try it!


Incomplete expressions

If you type a command that is incomplete, R will prompt you with a + sign until a complete expression is formed:

15 / 

  3

Wrong expressions

At the same time, it will give you an error when it is not possible to obtain an expression that it can interpret.

3 % 5

Exercise 2

Let's play some magic To see if you've got the hang of how R works, let us play a game:

  1. Pick a number.
  2. Add $2$ to it.
  3. Multiply what you have obtained by $3$.
  4. Subtract $6$ from the answer.
  5. Divide what you get by $3$.
  6. Realise in awe you have obtained the number you had chosen.

Objects in R

The assignment operator

We are now ready to make a new step. We know how to manipulate numbers with basic operations. We want to find a way to tell R to save the results. To do so, we use the <- operator.

a <- 42
a

die <- 1:6
die

Rules of assignment

You can name objects however you may please, but bear in mind that names are case sensitive and that there are things you cannot do:

Reassigning a name

Also notice that R will overwrite objects without asking you, so that

a <- 42
a

a <- 49
a

Notice that in the environment pane of RStudio, objects that you create are added to the list.

Checking the objects

To see what objects you have created you can type ls().

ls()

Some vector algebra

Introduction

Everybody loves algebra and I am sure you do too. Let us see some basic operations.

die - 1
die / 2
die ** 2
exp(die)

Element-Wise Operations

I am sure you have noticed that something seemingly bizarre has happened. R has expanded the scalars to vectors and then computed the element-wise operation we have told it. To see it, let us consider what happens when we do die * die.

die * die

Vector Recycling

What happens if we use two vectors of different dimensions? The second vector will be expanded. This feature is called vector recycling. Let us see an example.

die + 1:2

Vector Recycling: incompatible dimensions

R will prompt you a warning message whenever the dimensions are not compatible.

die + 1:5

Standard inner/outer products

The traditional inner and outer products can be obtained by doing %*% and %o%.

die %*% die
die %o% die

Functions

Introduction to functions

R comes with a number of functions that are useful to carry out a number of tasks.

round(3.14159265359)
mean(die)
factorial(5)
log10(10)
log(1)

When we call more functions on each other, they are executed from the inside out.

round(mean(die + 1))
die + 1
mean(die + 1)
round(mean(die + 1))

Rolling a dice

This is very convenient for us. We are interested in selecting a random face of a die. The function sample is what we need.

sample(die)
sample(die)
sample(die)

It looks like something is out of place. We obtain an entire shuffling of the data. Maybe we need to have a closer look at the sample function.

Understanding a function

To better understand a function, say foo, we use the help function to read the documentations. The args function returns the arguments of foo.

args(sample)
help(sample, help_type="text")

The arguments of sample

Notice that sample takes four arguments: x, size, replace, prob:

 # Use the function to obtain one random element of the die.
sample(die, size = 1)

Passing values to functions

Notice what we have done here. We have called a function passing some arguments:

In R, it is possible to pass arguments by explicitly setting them (as we did we size = 1) and by passing them in order. We also see that after calling args(sample), we see that the last two arguments already have a value: replace = FALSE and prob = NULL. Such arguments are given a default value: if you do not specify their value, R will pass the default ones.

As a side, notice that any text preceded by # is a comment and will be ignored by R. This gives you an opportunity to document your code, use it!

Writing our first function

We are now interested in writing our first function. The function should return the sum of two dice being rolled.

Function constructor

The syntax to write a function is quite straightforward.

my_function <- function() {}

In this case, we have called the function my_function. The function constructor is then followed by brackets (): here we can pass the arguments. Between the curly brackets {} we put the body of the function: the set of instructions our function should compute.

First step

Let us try to write the first bit.

roll <- function() {
  # This function defines a die of 6 faces and print two values.
  die <- 1:6
  extraction <- sample(die, size = 2)
  print(extraction)
}
roll()
roll()
roll()
roll()

Pretty neat, right?

Try to do it again, this selecting one value!


Are we done?

Each time a function is called, R creates a new environment where to perform the calculation. What is an environment? We may think of it as a box where computations are made. Each box knows in which bigger box it is contained and looks there to find the arguments you are passing. At the same time, each function can only modify the objects of its own box.

So, how do we give to the outer box the values we have computed? Easy! Just append it as the last line of the function and R will take care of communicating the result.

Try to complete the function so that it returns the sum of the two values, using the sum function.

roll <- function() {
  # This function defines a die of 6 faces and print two values.
  die <- 1:6
  extraction <- sample(die, size = 2, replace = TRUE)

}
roll <- function() {
  # This function defines a die of 6 faces and print two values.
  die <- 1:6
  extraction <- sample(die, size = 2, replace = TRUE)
  sum(extraction)
}

Are we done, now?

Everything seems to be working fine. What if, however, we want our function to be more general and take as input an indefinite number of dice with an indefinite number of faces? We need to pass some arguments!

roll <- function(die, number_die) {
  extraction <- sample(die, size = number_die, replace = FALSE)
  sum(extraction)
}

die <- 1:6
number_die <- 2
print(roll(die, number_die))
print(roll(die, number_die))
print(roll(die, number_die))
print(roll(die, number_die))

It looks great, but for a fact. Look at what happens now if we just call roll.

roll()

Can you fix it?

roll <- function(die, number_die) {
  extraction <- sample(die, size = number_die, replace = FALSE)
  sum(extraction)
}
roll <- function(die = 1:6, number_die = 2) {
  extraction <- sample(die, size = number_die, replace = FALSE)
  sum(extraction)
}

Scripts in R

So, after all this work it may be worth it to save our function and use it in the future. How can we do so? RStudio allows you to create R script. An R script is a set of instructions that can be stored as a .R file and executed at a later time. It is a very convenient feature, as it allows to store your (well-documented) code. To crate a new script, you can either open it manually in RStudio or press Ctrl+Shift+N on Windows and Cmd+Shift+N on MacOS.

Loading and Storing objects in R

During the R session, objects are created and stored by name. As we have seen, to retrieve all the objects that are currently stored, it is possible to use the ls() command.

Moreover, objects can be removed using the rm() function.

For instance,

rm(x, y, z, ink, junk, temp, foo, bar)

will remove all the objects with those names. At the end of each session, R will ask you if you wish to store into a file the objects currently part of your session. Such objects will be stored to a file called .RData, in the current working directory. Moreover, the command lines that you have used will be stored into a .RHistory file.

When R is later restarted, it reloads the workspace from the file, as well as the associated commands history.

However, it is important to notice that saving .RData is increasingly seen as a bad practice. The reason for this is simple. Imagine that you have performed a statistical analysis on some data and that you are relying for the analysis on an object you have not explicitly saved. Nobody else is going to be able to run your code!

Explicit is almost always better than implicit!

Conclusion

Congratulations! You have moved your very first steps using the R programming language.



mascaretti/moxier documentation built on March 17, 2020, 3:57 p.m.