In DavisBrian/rclassnotes: Does not matter.

set.seed(31415)
options(digits = 3)

knitr::opts_chunk$set(
  comment = "#>",
  collapse = TRUE,
  cache = FALSE,
  out.width = "70%",
  fig.align = 'center',
  fig.width = 6,
  fig.asp = 0.618,  # 1 / phi
  fig.show = "hold"
)

options(dplyr.print_min = 6, dplyr.print_max = 6)
Sys.setenv(LANGUAGE = "en")

Good practices

"When you write a program, think of it primarily as a work of literature. You're trying to write something that human beings are going to read. Don't think of it primarily as something a computer is going to follow. The more effective you are at making your program readable, the more effective it's going to be: You'll understand it today, you'll understand it next week, and your successors who are going to maintain and modify it will understand it."

-- Donald Knuth

Coding style

Good coding style is like correct punctuation: you can manage without it, butitsuremakesthingseasiertoread. When I answer questions, I first see whether I think I can answer the question; second, I check the coding style of the question, and if the code is too difficult to read, I just move on. Please make your code readable by following, e.g., this coding style (most examples below come from this guide).

"To become significantly more reliable, code must become more transparent. In particular, nested conditions and loops must be viewed with great suspicion. Complicated control flows confuse programmers. Messy code often hides bugs."

-- Bjarne Stroustrup

Comments

In code, use comments to explain the “why”, not the “what” or “how”. Each line of a comment should begin with the comment symbol and a single space: `# `.

```{block2, type='rmdtip'}
Use commented lines of `-` to break up your file into easily readable chunks and to create a code outline in RStudio.
```
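As a small illustration of both points, the sketch below uses a dashed comment line as a section break and a comment that records the reason for a choice rather than restating the code. The data and variable names are made up for the example. RStudio treats a comment line that ends in four or more dashes as a code section, which is what fills the outline.

```r
# Summarise heights ------------------------------------------------------

# Hypothetical data for illustration only.
heights_cm <- c(150, 160, 170, 250)

# Use the median, not the mean: the 250 cm entry is probably a typo and
# would pull the mean upward.
typical_height <- median(heights_cm)
typical_height
```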

### Naming

> There are only two hard things in Computer Science: cache invalidation and naming things.
>
> -- Phil Karlton

Names are not limited to 8 characters as in some other languages; however, they are case sensitive. Be smart with your naming: be descriptive yet concise. Think about how your names will show up in autocomplete.

Throughout the course we will point out some standard naming conventions that are used in R (and other languages), e.g. `i` and `j` as row and column indices.

```r
# Good
average_height <- mean((feet / 12) + inches)
plot(mtcars$disp, mtcars$mpg)

# Bad
ah<-mean(x/12+y)
plot(mtcars[, 3], mtcars[, 1])
```

Spacing

Put a space before and after = when naming arguments in function calls. Most infix operators (==, +, -, <-, etc.) are also surrounded by spaces, except those with relatively high precedence: ^, :, ::, and :::. Always put a space after a comma, and never before (just like in regular English).

# Good
average <- mean((feet / 12) + inches, na.rm = TRUE)
sqrt(x^2 + y^2)
x <- 1:10
base::sum

# Bad
average<-mean(feet/12+inches,na.rm=TRUE)
sqrt(x ^ 2 + y ^ 2)
x <- 1 : 10
base :: sum

Indenting

Curly braces, {}, define the most important hierarchy of R code. To make this hierarchy easy to see, always indent the code inside {} by two spaces.

# Good
if (y < 0 && debug) {
  message("y is negative")
}

if (y == 0) {
  if (x > 0) {
    log(x)
  } else {
    message("x is negative or zero")
  }
} else {
  y^x
}

# Bad
if (y < 0 && debug)
message("Y is negative")

if (y == 0)
{
    if (x > 0) {
      log(x)
    } else {
  message("x is negative or zero")
    }
} else { y ^ x }

Long lines

Strive to limit your code to 80 characters per line. This fits comfortably on a printed page with a reasonably sized font. If you find yourself running out of room, this is a good indication that you should encapsulate some of the work into a separate function.

If a function call is too long to fit on a single line, use one line each for the function name, each argument, and the closing ). This makes the code easier to read and to change later.

# Good
do_something_very_complicated(
  something = "that",
  requires  = many,
  arguments = "some of which may be long"
)

# Bad
do_something_very_complicated("that", requires, many, arguments,
                              "some of which may be long"
                              )

Other

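Use <-, not =, for assignment. Reserve = for naming arguments in function calls: inside a call, x = ... is read as an argument named x, so the second Bad example below fails with an "unused argument" error instead of assigning x.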

# Good
x <- 5
system.time(
  x <- rnorm(1e6)
)

# Bad
x = 5
system.time(
  x = rnorm(1e6)
)

Don't put ; at the end of a line, and don't use ; to put multiple commands on one line.

Only use return() for early returns. Otherwise rely on R to return the result of the last evaluated expression.

# Good
add_two <- function(x, y) {
  x + y
}

# Bad
add_two <- function(x, y) {
  return(x + y)
}
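
For contrast, here is a sketch of the one case where return() is appropriate: leaving a function early. The function name `safe_log` and its behaviour are invented for this illustration.

```r
# Hypothetical helper: returns NA for non-positive input instead of
# letting log() produce NaN or -Inf.
safe_log <- function(x) {
  if (x <= 0) {
    return(NA_real_)  # early return: nothing more to do
  }

  log(x)  # the last evaluated expression is returned implicitly
}

safe_log(-1)
safe_log(1)
```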

Use ", not ', for quoting text. The only exception is when the text already contains double quotes and no single quotes.

# Good
"Text"
'Text with "quotes"'
'<a href="http://style.tidyverse.org">A link</a>'

# Bad
'Text'
'Text with "double" and \'single\' quotes'

Coding practices

Variables

Create variables for values that are likely to change.
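
A minimal sketch of the idea, using the built-in mtcars data; the variable names `min_mpg` and `cyl_of_interest` and their values are assumptions chosen for the example. The values that might change are defined once, near the top, and every later step refers to the variables.

```r
# Hypothetical analysis settings, defined once near the top of the script.
min_mpg <- 25
cyl_of_interest <- 4

# Later steps use the variables, never the literal values.
efficient <- mtcars[mtcars$mpg >= min_mpg & mtcars$cyl == cyl_of_interest, ]
nrow(efficient)
```

Changing the threshold later means editing one line instead of hunting for every 25 in the script.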

Rule of Three[^DRY]

Try not to copy code, or copy and then modify it, more than twice.

- If a change requires you to search/replace 3 or more times, make a variable.
- If you copy a code chunk 3 or more times, make a function.
- If you copy a function 3 or more times, make your function more generic.
- If you copy a function into a project 3 or more times, make a package.
- If 3 or more people will use the function, make a package.

The Rule of Three also applies to look-up tables and the like. The key question is: if something changes, how many touch points will there be? If there are 3 or more, it is time to abstract the code a bit.
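
As a sketch of the "copy a code chunk 3 or more times, make a function" case: suppose the same rescaling expression had been pasted once each for three columns of mtcars. The helper name `rescale01` and the choice of columns are assumptions made for the illustration.

```r
# Hypothetical helper that replaces three copy-pasted rescaling expressions.
rescale01 <- function(x) {
  rng <- range(x, na.rm = TRUE)
  (x - rng[1]) / (rng[2] - rng[1])
}

# One touch point: changing the rescaling now means editing rescale01() only.
cols <- c("mpg", "disp", "hp")
scaled <- as.data.frame(lapply(mtcars[cols], rescale01))
summary(scaled)
```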

[^DRY]: This is sometimes called the DRY principle, or Don't Repeat Yourself.

Path names

It is better to use relative path names than hard-coded absolute ones. If you must read from (or write to) paths that are not in your project directory structure, create a path variable at the highest level you can (always ending it with /) and then build relative paths from it.
DO NOT EVER USE setwd().

# Good
raw_data <- read.csv("./data/mydatafile.csv") 

input_file <- "./data/mydatafile.csv"
raw_data <- read.csv(input_file)  

input_path <- "C:/Path/To/Some/other/project/directory/"
input_file <- paste0(input_path, "data/mydatafile.csv")
raw_data <- read.csv(input_file)

# Bad
setwd("C:/Path/To/Some/other/project/directory/data/")
raw_data <- read.csv("mydatafile.csv")
setwd("C:/Path/back/to/my/project/")
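
An alternative sketch for building the file name uses base R's file.path(), which inserts the "/" separators itself; in this variant the base path is therefore written without a trailing slash. The directory and file names are the same placeholders as above, and nothing is read from disk here.

```r
# Placeholder location; replace with the real directory on your machine.
input_path <- "C:/Path/To/Some/other/project/directory"

# file.path() joins its arguments with "/".
input_file <- file.path(input_path, "data", "mydatafile.csv")
input_file
```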

RStudio

Download the latest version of RStudio (> 1.1) and use it!

Learn more about new features of RStudio v1.1 there.

RStudio features:

- Everything you can expect from a good IDE
- Keyboard shortcuts I use frequently:
    1. Ctrl + Space (auto-completion, better than Tab)
    2. Ctrl + Up (command history & search)
    3. Ctrl + Enter (execute line of code)
    4. Ctrl + Shift + A (reformat code)
    5. Ctrl + Shift + C (comment/uncomment selected lines)
    6. Ctrl + Shift + / (reflow comments)
    7. Ctrl + Shift + O (view code outline)
    8. Ctrl + Shift + B (build package, website or book)
    9. Ctrl + Shift + M (pipe)
    10. Alt + Shift + K (see all shortcuts)
- Panels (everything is integrated, including Git and a terminal)
- Interactive data importation from files and connections (see this webinar)
- Use code diagnostics
- R Projects:
    - Meaningful structure in one folder
    - The working directory automatically switches to the project's folder
    - File tab displays the associated files and folders in the project
    - History of R commands and open files
    - Any settings associated with the project, such as Git settings, are loaded. Note that a set-up.R or even a .Rprofile file in the project's root directory enables project-specific settings to be loaded each time people work on the project.

> The only two things that make \@JennyBryan 😤😠🤯. Instead use projects + here::here() #rstats pic.twitter.com/GwxnHePL4n
>
> -- Hadley Wickham (\@hadleywickham), December 11, 2017

Read more at https://www.tidyverse.org/articles/2017/12/workflow-vs-script/ and see also the chapter "Efficient set-up" of the book Efficient R Programming.

Getting help

Help yourself, learn how to debug

A basic solution is to print everything, but this usually does not work well on complex problems. A more convenient way to inspect the state of your variables is to place a call to browser() wherever you want to check them: execution pauses at that point and you can explore the environment interactively.
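
A minimal sketch of where such a call might go; the function `mean_ratio` is invented for the example. The call is wrapped in interactive() here, as an assumption about how you might want it to behave, so that the script still runs unattended.

```r
# Hypothetical function used only to show where browser() might go.
mean_ratio <- function(x, y) {
  ratio <- x / y

  # Pause here when running interactively to inspect x, y and ratio.
  # At the Browse prompt: n = next line, c = continue, Q = quit.
  if (interactive()) browser()

  mean(ratio, na.rm = TRUE)
}

# The zero in y makes one ratio infinite, so the result is Inf;
# stepping through in the browser makes that easy to spot.
mean_ratio(c(1, 2, 3), c(2, 4, 0))
```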

Learn more with this book chapter, this other book chapter, this webinar and this RStudio article.

External help

Can't remember useful functions? Use cheat sheets.

You can search for R-specific material on https://rseek.org/. You should also read documentation carefully. If you're using a package, look for its vignettes and its GitHub repository.
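
Some of this searching can be done from the R console itself. The sketch below uses base R helpers only; the search pattern and the package name "grid" are arbitrary examples.

```r
# List attached functions whose names start with "read." (apropos() is base R).
apropos("^read\\.")

# Full-text search of installed help pages; commented out because it can be
# slow and opens a results viewer.
# help.search("linear model")

# List the vignettes shipped with a package ("grid" is just an example).
if (interactive()) vignette(package = "grid")
```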

You can also use Stack Overflow. Most often you reach it indirectly: you have an error or a question, you Google it, and most of the time the first links are Q&A on Stack Overflow.

You can ask questions on Stack Overflow (using the tag r). You need to make a great R reproducible example if you want your question to be answered. Most of the time, while making this reproducible example, you will find the answer to your problem.
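
One common way to make an example reproducible, sketched here under the assumption that a few rows are enough to show the problem, is to share a small slice of data as code with dput() and follow it with the smallest piece of code that triggers the issue. The built-in mtcars data stands in for your own.

```r
# Share a small slice of the data as code that recreates it exactly.
small <- head(mtcars[, c("mpg", "cyl")], 3)
dput(small)

# Then show the smallest piece of code that triggers the problem.
aggregate(mpg ~ cyl, data = small, FUN = mean)
```

Anyone can paste the dput() output into their own session and obtain the same data frame without needing your files.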

Join the R-help mailing list. Sign up to get the daily digest and scan it for questions that interest you.

Keeping up to date

With over 10,000 packages on CRAN, it is hard to keep up with the constantly changing landscape. R-Bloggers is an R-focused blog aggregation site with dozens of posts per day. Check it out.

Reading For Next Class

- Read the chapter on Workflow: basics
- Read the chapter on Workflow: scripts
- Read the chapter on Workflow: projects
- Read Chapters 1-3 of the Tidyverse Style Guide
- See these RStudio Tips & Tricks or these and find one that looks interesting and practice it all week.
- Read how to make a great R reproducible example

Exercises

1. Create an R Project for this class.
2. Create the following directories in your project (tip sheet?)
    - Bonus points if you can do it from R and not RStudio or Windows Explorer.
    - Double bonus points if you can make it a function.
    - Hint: In the R console type `file` and scroll through the various functions which appear in the pop-up.
3. Copy one of your R scripts into your R directory. (Bonus points if you can do it from R and not RStudio or Windows Explorer.)
4. Apply the style guide to your code.
5. Apply the "Rule of 3".
    - Create variables as needed.
    - Identify code that is used 3 or more times to make functions.
    - Identify code that would be useful in 3 or more projects to integrate into a package.

DavisBrian/rclassnotes documentation built on May 17, 2019, 8:19 a.m.
