We will focus on R package structure and development in this lesson.
Note that class time this week will be spent on the practice job interviews.
R packages are at the heart of R's success in the current data science field, because code, data, models and basically everything else needed to disseminate tools can be bundled into an R package. It is the core structure for every analysis. When you want to share your R work (and why put in the effort if you don't share it!), you will need to learn about R packages. Sharing things with future you or with others is best done in the form of an R package. Therefore, I have two important tips:
As you go along, you can extend the package and build everything you do in that project into an R package. To help you with that, there is a very handy helper package called {usethis}. This package automates the most important parts of creating, extending and maintaining an R package. We will see the most important functions of {usethis} in action during this lesson.
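To give a first impression of what {usethis} automates, here is a minimal sketch of a few of its helpers. The package location and the file name `clean_data` are hypothetical placeholders, not part of this lesson's materials. The calls are guarded so that nothing happens outside an interactive session with {usethis} installed.

```r
# Only run the helpers in an interactive session where {usethis} is installed.
can_run <- interactive() && requireNamespace("usethis", quietly = TRUE)

if (can_run) {
  # Hypothetical location for the new package; pick your own.
  pkg_path <- file.path("~", "mypackage")

  # Creates the folder with a DESCRIPTION, a NAMESPACE and an R/ directory.
  # In RStudio this opens the new project in a fresh session;
  # run the remaining calls there.
  usethis::create_package(pkg_path)

  # Creates (or opens) R/clean_data.R for a function definition.
  usethis::use_r("clean_data")

  # Sets up {testthat} if needed and creates tests/testthat/test-clean_data.R.
  usethis::use_test("clean_data")

  # Adds an MIT license and records it in the DESCRIPTION file.
  usethis::use_mit_license()
}
```

Each helper edits the package files for you and prints what it changed, so it is worth running them one at a time and reading the messages.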
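The most basic package consists of just one elementary file called the DESCRIPTION file. This is the file that holds valuable information about:
- the name of the package (Package field)
- a one-line title (Title field)
- the version number (Version field)
- the authors and maintainers (Authors@R field)
- a longer description of what the package does (Description field)
- the dependencies on R and on other packages (Depends, Imports and Suggests fields)
- requirements outside of R (SystemRequirements field)

To learn more about how R packages work, we will look at a demo during an exercise.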
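To make the DESCRIPTION fields more concrete, here is a minimal sketch that writes such a file with base R's `write.dcf()` and reads it back with `read.dcf()`. Every value (package name, author, dependencies, system requirement) is a made-up placeholder. In practice you would normally let {usethis} generate this file rather than write it by hand.

```r
# Hypothetical package metadata; every value below is a placeholder.
desc <- data.frame(
  Package = "mypackage",
  Title = "What the Package Does in One Line",
  Version = "0.0.0.9000",
  `Authors@R` = 'person("Jane", "Doe", email = "jane@example.com", role = c("aut", "cre"))',
  Description = "A one-paragraph description of what the package does.",
  Depends = "R (>= 4.1.0)",
  Imports = "dplyr, ggplot2",
  Suggests = "knitr, rmarkdown, testthat",
  SystemRequirements = "pandoc",
  check.names = FALSE,       # keep the "@" in the Authors@R column name
  stringsAsFactors = FALSE
)

# Write the file into a throwaway directory so nothing real is overwritten.
pkg_dir <- file.path(tempdir(), "mypackage")
dir.create(pkg_dir, showWarnings = FALSE)
desc_file <- file.path(pkg_dir, "DESCRIPTION")
write.dcf(desc, file = desc_file)

# Read the file back: the result is a character matrix, one column per field.
fields <- read.dcf(desc_file)
print(colnames(fields))
print(fields[1, "Version"])
```

The DESCRIPTION file uses the Debian control file format, which is why the base R DCF functions can read and write it.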
Perform the exercises below after you have completely finished the 'Whole Game' R packages demo and included all the functionality in your version of the {foofactors} package.
In a normal day-to-day data science routine you would probably not start the way you started above for the demo. More likely, you already have some R Markdown files or loose scripts that define functions, download data or do something else. That is why I want to introduce you to an alternative approach. I call this approach: "Start with RMarkdown". During one of the Utrecht University R cafe meetings, I held a demo on this.
Imagine you are building up an analysis in an RMarkdown file. The analysis includes the following steps:
If you adopt a proper workflow in R, you will probably have written a number of functions to complete the steps above. Furthermore, you will probably want to save the cleaned, tidy version of the data as a file. And, most importantly, you will want to document the cleaning steps from raw to clean data, the analysis steps and the packages you used (dependencies) for your analysis. This matters for reproducibility: future you or others will want to be able to rerun the analysis with the same results, or review the steps you took to get from the raw to the cleaned data. Once you have finished writing this RMarkdown, plus all the scripts it depends on, you could stop there. But I recommend that you keep on working, to create an R package from your RMarkdown. The biggest advantage is that you will end up with a package for which you already have a so-called vignette: it is the very RMarkdown you started with when you decided to build the package in the first place! The package will include:
This approach is also called "RMarkdown driven development". The advantage is that you now have not only an RMarkdown and an analysis, but you have them in a stable and shareable unit - an R package - that you can ship, share and build upon.
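As a rough sketch of how the pieces of an existing RMarkdown analysis could be moved into a package with {usethis}: the names `tidy_data`, `analysis` and the dependency `dplyr` are hypothetical placeholders, and the code assumes your working directory is already the root of a package project. This is one possible order of steps, not a prescribed recipe.

```r
# Only proceed in an interactive session, with {usethis} installed,
# and when the working directory looks like a package (has a DESCRIPTION).
in_package <- interactive() &&
  requireNamespace("usethis", quietly = TRUE) &&
  file.exists("DESCRIPTION")

if (in_package) {
  # 1. Data cleaning: creates data-raw/tidy_data.R. Move the cleaning code
  #    from your RMarkdown into that script and end it with:
  #      usethis::use_data(tidy_data, overwrite = TRUE)
  #    so the cleaned data set is stored in data/ and shipped with the package.
  usethis::use_data_raw("tidy_data")

  # 2. Dependencies: record each package your analysis loads
  #    in the Imports field of DESCRIPTION.
  usethis::use_package("dplyr")

  # 3. Vignette: creates vignettes/analysis.Rmd. Paste the body of your
  #    original RMarkdown into it, calling the package functions and data.
  usethis::use_vignette("analysis", title = "Analysis walkthrough")
}
```

Functions defined inside the RMarkdown would move into files under R/ in the same way, so that the vignette only calls them instead of defining them.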
To show you how this "Start with RMarkdown" works, I will point you to some excellent resources on this topic. You will need to study and understand these materials in order to successfully complete this lesson's portfolio assignment.
There are many good references on the internet that you can consult to learn more:
- {usethis} blogpost

In the assignment below you will create your own R package, related to work you have previously done in the programming courses. The end result is a personal package (publish the link to your package in your portfolio) that includes functionality that you can use in later analyses. This assignment needs you to be creative and proactive.
Now complete assignment 6 of the portfolio assignments.
CC BY-NC-SA 4.0 This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. Unless it was borrowed (there will be a link), in which case, please use their license.