In IBAHCM/RPiR: Reproducible Programming in R

library(learnr)
tutorial_options(exercise.reveal_solution = FALSE)
gradethis::gradethis_setup()

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = ">",
  error = TRUE
)
set.seed(123)

Naming of things

Dots and underscores

You may have noticed that my functions in A-2 had names like add_one (with an underscore), but my variables had names like one.more (with a dot). This was deliberate. Variable and function names should be made up of one or more words (or abbreviations) so that it is easy to understand what they do. Names that are just stuck together, like addtwonumberstogether, are hard to read, whereas add_two_numbers_together is easier. There are also boring reasons, to do with how R works, why it's a bad idea to write R functions with dots in their names (even though many functions in R already have dots). As a result, in these practicals, I insist that you don't use dots in the names of functions you write.

However, dots are generally okay in variable names, so to make it easy to distinguish normal variables from functions, I choose to put dots in variable names (as in one.more). You can do whatever you like.
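One of the "boring reasons" is probably R's S3 method dispatch, where a function named generic.class is treated as the method of that generic for that class. The sketch below is only an illustration of that mechanism, not something from the practicals: the helper summary.stats, the class "stats" and the object measurements are all made up.

```r
# Hypothetical helper, named with a dot. The author only meant
# "summarise some stats", but the name has the form <generic>.<class>.
summary.stats <- function(object, ...) {
  "summary.stats() was called"
}

# A hypothetical object that happens to have the class "stats".
measurements <- structure(c(1.5, 2.5, 3.5), class = "stats")

# summary() is an S3 generic, so it dispatches on the class of its argument
# and picks up the helper above instead of the default summary method.
dispatched <- summary(measurements)
dispatched
```

Had the helper been called summary_stats, summary() would never have found it, which is the kind of surprise the underscore rule avoids.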

Here's a function that takes two arguments and adds them together (yes, it is just reinventing +!):

# Breaks the naming rules: dot in the function name, underscore in the variable name
add.up <- function(first, second) {
  added_together <- first + second
  added_together
}

add.up(10, 30)

# Follows the naming rules: underscore in the function name, dot in the variable name
add_up <- function(first, second) {
  added.together <- first + second
  added.together
}

add_up(10, 30)

grade_this_code()

Rename the function and variables in Exercise 2 so that they satisfy my naming rules.

There is much more on how to style your code to make it more consistent here. Hadley Wickham recommends never using dots, and instead using underscores to separate words in both function and variable names. If you want to do that, it's fine. Just don't use dots in function names. He has lots of other suggestions about how to write clear code, so do have a look and try to follow them if you have time.

Not reusing names

In Exercise 1, we showed that variables used in functions don't affect what happens outside the function. Unfortunately, the reverse isn't true. Look at this function:

missing_add <- function(first) {
  first + second
}

missing_add(10)

This function was intended to be like add_up() in Exercise 2, but I accidentally forgot to add the second argument. Fortunately, if you run it, it will give an error (try it). However, what happens if you have defined a variable called second elsewhere in your script? Try adding the following at line 4:

second <- 1

missing_add(10) should no longer give an error, and should instead return 11. Try changing second to 2: missing_add(10) should now return 12. That's not good news: what the function does depends on what else you may have done in your script. When you call a function, it should always do the same thing if you give it the same arguments.
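If you would rather see all three outcomes in one run, here is a minimal sketch. It assumes a fresh R session in which no variable called second exists yet; the names error.message, with.one and with.two are made up for this illustration.

```r
missing_add <- function(first) {
  # `second` is not an argument, so R looks for it outside the function
  first + second
}

# With no `second` defined anywhere, the call fails; keep the error message
error.message <- tryCatch(
  missing_add(10),
  error = function(e) conditionMessage(e)
)

# Defining `second` in the script silently changes what the function does
second <- 1
with.one <- missing_add(10)

second <- 2
with.two <- missing_add(10)

error.message
with.one
with.two
```

The same call, missing_add(10), gives three different outcomes depending only on what the surrounding script has done.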

This gets even worse when you account for our ability to misspell things:

misspelled_add <- function(first, second) {
  resuIt <- first + second
  result
}

result <- -5
misspelled_add(10, 20)
misspelled_add(1, 2)

**Hint:** If you can't see the problem, look at how the word `result` is spelled in different places.

You should find that misspelled_add() gives the wrong answer even though the code looks right (if you don't look too closely). This problem can (and does!) cause hours of agony to some students every year on this course. As a result, we insist on two things in your submitted practicals:

You use different variable names in your functions than in your scripts, wherever possible. Sometimes there really is only one sensible name for a variable (maybe timestep when describing the increments in your simulation model), and you should be very careful when that happens. Normally, though, there are at least two ways of describing the same idea: often a more general name in a function (birth.rate), which can be used in multiple contexts, and a more specific one in your main script each time you use it (human.annual.birth.rate, for instance).
You write checks for your functions to make sure they aren't accidentally missing any variables or arguments. There are two tools for this, findGlobals() and checkUsage(), both in the codetools package.
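As a rough illustration of the first rule, the sketch below uses a general name inside a function and more specific names in the script. The function grow_population() and all of the numbers are invented for this example and are not part of the practicals.

```r
# Hypothetical function: general names that make sense in any context
grow_population <- function(population, birth.rate) {
  population + population * birth.rate
}

# Main script: specific names that describe this particular use
human.population <- 1000
human.annual.birth.rate <- 0.02

next.human.population <- grow_population(human.population,
                                         human.annual.birth.rate)
next.human.population
```

Because no name in the script matches a name inside the function, a forgotten argument or a typo inside grow_population() would give an error instead of quietly picking up a value from the script.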

Try adding the following lines to the end of Exercise 4:

library(codetools)
checkUsage(misspelled_add)

You'll see that it reports on the problem that you have probably already identified. It can also identify other problems, and we recommend it to you. However, it doesn't return anything you can use in your code to allow us to automatically reject bad functions, and we'll show you how to do this in a later practical. For now just try using this line instead of checkUsage() above (the library() call is the same):

findGlobals(misspelled_add, merge = FALSE)

You'll see that it identifies the use of the global variable result too (if less elegantly). We'll show you how to use this to automatically detect problems with variable misnaming in a later exercise, but for now just remember that you need to watch out for misspelling very carefully.
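To give a feel for what such a check might look like, here is a minimal sketch. It is only one possibility and not necessarily the approach taken later in the course; the names usage and has.global.variables are made up.

```r
library(codetools)

misspelled_add <- function(first, second) {
  resuIt <- first + second
  result
}

# With merge = FALSE, findGlobals() returns a list with two parts:
# the global functions and the global variables that the function uses
usage <- findGlobals(misspelled_add, merge = FALSE)

# Any entry here is a variable the function takes from outside itself
usage$variables

# A value you can test in code: TRUE means the function relies on globals
has.global.variables <- length(usage$variables) > 0
has.global.variables
```

Unlike the printed report from checkUsage(), has.global.variables is an ordinary logical value, so a script could stop or warn when it is TRUE.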

If you're interested, the findGlobals() output also tells you why this problem exists in R. You'll see that the function misspelled_add() needs to use three other global variables - the functions {, + and <- - to get the function to work. As a result of the way R works, functions have to be able to see what is going on outside to be able to work at all. Most languages handle this more elegantly.
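You can see the same thing in a function that has no problems at all. The sketch below runs findGlobals() on the correctly named add_up() from Exercise 2; the name clean.usage is made up for this illustration.

```r
library(codetools)

add_up <- function(first, second) {
  added.together <- first + second
  added.together
}

clean.usage <- findGlobals(add_up, merge = FALSE)

# Even a correct function needs global functions such as {, + and <-
clean.usage$functions

# But it uses no global variables, which is what we want
clean.usage$variables
```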

Functions are covered in R4DS here and R Coder covers them in some depth here. Advanced R covers functions in enormous detail starting here with an introduction to functional programming, and continuing for several chapters.

IBAHCM/RPiR documentation built on Jan. 12, 2023, 7:41 p.m.
