The following are the DHSC sensible defaults for R:
The dominant IDE for R is RStudio, which comes packaged with R. For a new project you should use the latest version of RStudio available from the software portal.
Default to packages from the Tidyverse. These have been carefully designed to work together effectively as part of a modern data analysis workflow. More info can be found here: R for Data Science by Hadley Wickham.
For example:
- Use %>% rather than nesting function calls (but not always, e.g. see here).
- Prefer purrr to the apply family of functions. See here.
Recommended Packages:
Always work in a project. See the guide to Using Projects.
The Projects functionality is broken in DHSC's packaged version of RStudio; see the fix here.
Packages are the fundamental unit of reproducible R code. Therefore, if possible, build an R Package to share and document your code.
Hadley's book on R Packages is an effective guide on how to produce a package.
The usethis package has lots of useful shortcuts for package builders.
There are two key competing ways of managing dependencies for an R Project:
- packrat: the current established way to manage R dependencies.
- renv: rapidly maturing, the successor to packrat.
See also:
You may come across code which doesn't work because it depends on a different version of a package to the one you have.
Fortunately, Microsoft keep daily snapshots of CRAN and store them on the Microsoft R Application Network.
The checkpoint package from Microsoft lets you use these snapshots to install packages as if it were any day since 2017-07-01.
Simply start your script with:
    library(checkpoint)
    checkpoint(snapshotDate = "2015-01-15", checkpointLocation = getwd())
This will download all the packages as they existed on the given date and install them to a library on your home drive.
Notes:
- If your code depends on BH (a lot of tidyverse code will) then this will take some time!
- Use the checkpointLocation argument to tell checkpoint to use the C drive.
Base R includes the try() and tryCatch() functions for handling errors. You can find an example of basic use of these on r-bloggers.
Effective error handling in R requires understanding the condition system. There is a good chapter on this in Hadley's Advanced R book.
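As a rough illustration of try() and tryCatch(), the sketch below uses only base R. The helper parse_number() and its rule of rejecting negative values are hypothetical, made up for this example; they are not part of any DHSC code.

```r
# parse_number() is a hypothetical helper used only for illustration.
parse_number <- function(x) {
  tryCatch(
    {
      value <- as.numeric(x)  # signals a warning if x cannot be coerced
      if (value < 0) stop("negative input: ", x)
      value
    },
    warning = function(w) {
      # Handle the coercion warning and fall back to NA
      message("Could not parse '", x, "': ", conditionMessage(w))
      NA_real_
    },
    error = function(e) {
      # Handle the error raised by stop() above
      message("Rejected input: ", conditionMessage(e))
      NA_real_
    },
    finally = message("Finished processing '", x, "'")  # always runs
  )
}

parsed_ok  <- parse_number("42")
parsed_bad <- parse_number("abc")
parsed_neg <- parse_number("-3")

# try() is the simpler option: it returns an object of class "try-error"
# instead of stopping the script.
attempt <- try(log("a"), silent = TRUE)
attempt_failed <- inherits(attempt, "try-error")
```

Separate handlers for warnings and errors are one possible design; in many scripts a single error handler is enough.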
If you are iterating over many inputs, use the safely() family of functions from purrr to create versions of your functions which capture errors and return them within a list for handling at a later stage.
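A minimal sketch of this pattern, assuming purrr is installed. The inputs list and the choice of log() as the function to wrap are illustrative only.

```r
library(purrr)

# Illustrative inputs: the second element will make log() fail
inputs <- list(10, "a", 100)

# safely() wraps log() so that it never throws; each call returns
# a list with two elements, `result` and `error` (one of them NULL)
safe_log <- safely(log)
results <- map(inputs, safe_log)

# Identify which inputs failed, then deal with them separately
failed <- map_lgl(results, ~ !is.null(.x$error))
values <- map_dbl(results[!failed], "result")
error_messages <- map_chr(results[failed], ~ conditionMessage(.x$error))

# possibly() is a sibling of safely() that substitutes a default value
values_or_na <- map_dbl(inputs, possibly(log, otherwise = NA_real_))
```

safely() is the better fit when you need the error messages later; possibly() is shorter when a default value is all you need.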
Use the testthat package for performing unit tests.
For details see the 'tests' chapter of R Packages.
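A unit test with testthat might look like the sketch below. The function add_vat() and its default rate are hypothetical, invented purely to give the tests something to check.

```r
library(testthat)

# add_vat() is a hypothetical function used only for illustration.
add_vat <- function(price, rate = 0.2) {
  if (!is.numeric(price)) stop("price must be numeric")
  price * (1 + rate)
}

test_that("add_vat applies the rate", {
  # expect_equal() compares with a small numerical tolerance
  expect_equal(add_vat(100), 120)
  expect_equal(add_vat(100, rate = 0), 100)
})

test_that("add_vat rejects non-numeric input", {
  expect_error(add_vat("100"), "must be numeric")
})
```

In a package, tests like these usually live in files named test-*.R under tests/testthat/.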