This vignette explains how to use modules outside of R packages as a means to organize a project or data analysis. Using modules we may gain some of the features we also expect from packages but with less overhead.
A lot of R projects run into problems when they grow. Even relatively simple
data analysis projects can easily span a thousand lines. R has two important
building blocks to organize projects: functions and packages. However, packages
present a hurdle for a lot of users with little programming background. In
those cases we often rely on splitting up the code base into files and sourcing
them into our R session (referring to the function source()). Modules, in this
context, present a more sophisticated way to source files by providing three
important features:
You can load scripts as modules when you refer to a file (or directory) in a
call to use(). Inside such a script you can use import() and use() in the same
way you typically use library(). Consider the following example where we create
a module in a temporary file with its dependencies.
    code <- "
    import('stats', 'median')
    functionWithDep <- function(x) median(x)
    "
    fileName <- tempfile(fileext = ".R")
    writeLines(code, fileName)
Then we can load such a module into this session by the following:
    library(modules)
    m <- use(fileName)
    m$functionWithDep(1:2)
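To make the effect of loading a script as a module more tangible, here is a minimal sketch along the same lines. The module script and the names centre and middle are made up for illustration; the point is only that a name which is not exported stays inside the module and does not end up in the session.

```r
library(modules)

# Hypothetical module script: one exported function and one private helper.
code <- "
import('stats', 'median')
export('centre')
centre <- function(x) x - middle(x)
middle <- function(x) median(x)  # not exported, so private to the module
"
fileName <- tempfile(fileext = ".R")
writeLines(code, fileName)

m <- use(fileName)

m$centre(c(1, 2, 6))  # the exported function can still call the private helper
names(m)              # lists what the module makes available to the caller
exists("middle")      # checks whether the helper leaked into the session
```

Compared with source(), nothing is copied into the global environment; everything is reached through the handle m.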
To give a bit more context of how you can structure a project, consider the following file structure:
    /
      /R
        munging.R
        graphics.R
      /data
        some.csv
      /results
        /tables
          ...
        /figs
      main.R
      README.md
You put all your R code into the R folder. This folder may or may not have a
nested folder structure itself. You probably have a folder for your data and one
into which you store all results. The important part here is that you have split
your code base into different files. main.R in the project root acts as the
master file in this example. This file kicks off all steps of our analysis and
connects the dots. munging.R and graphics.R implement helper functions.
main.R
    lib <- modules::use("R")
    dat <- read.csv("data/some.csv")

    # munging
    dat <- lib$munging$clean(dat)
    dat <- lib$munging$recode(dat)

    # generate results
    lib$graphics$barplot(dat)
    lib$graphics$lineplot(dat)
The main.R file implements no logic of the analysis. Its responsibility is to
connect all steps. Each file in the R folder then implements a phase of the
project. In larger projects it is likely that each phase will need its own
folder. The implementation may then look something along the lines of:
R/munging.R
    export("clean")
    clean <- function(dat) {
      # ...
    }

    export("recode")
    recode <- function(dat) {
      # ...
    }

    helper <- function(...) {
      # This function is private
      # ...
    }
R/graphics.R
    import("ggplot2")
    export("barplot", "lineplot")

    barplot <- function(dat) {
      # ...
    }

    lineplot <- function(dat) {
      # ...
    }

    helper <- function(...) {
      # ...
    }
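Since the bodies above are elided, the following sketch shows one way such a layout could be tried out end to end. It builds a throwaway project in a temporary directory; the file summaries.R, the functions rowCount and dropMissing and the sample data are invented for this illustration and are not part of the project described above. We change into the project root before loading, mirroring how main.R calls modules::use("R").

```r
# Build a throwaway project in a temporary directory (all names hypothetical).
root <- tempfile("project")
dir.create(file.path(root, "R"), recursive = TRUE)

writeLines(c(
  'export("clean")',
  'clean <- function(dat) dropMissing(dat)',
  '',
  '# Not exported, hence private to this module',
  'dropMissing <- function(dat) dat[!is.na(dat$value), , drop = FALSE]'
), file.path(root, "R", "munging.R"))

writeLines(c(
  'export("rowCount")',
  'rowCount <- function(dat) nrow(dat)'
), file.path(root, "R", "summaries.R"))

# Load the R folder from the project root, as main.R would do,
# and restore the working directory afterwards.
old <- setwd(root)
lib <- tryCatch(modules::use("R"), finally = setwd(old))

# Each file is reached through its file name.
dat <- data.frame(id = 1:3, value = c(10, NA, 30))
dat <- lib$munging$clean(dat)
lib$summaries$rowCount(dat)
```

Only functions from base R are used inside the two module files, so no import declarations are needed in this sketch.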
If you want proper documentation for your functions or modules you really want a package. For ad-hoc documentation of modules, however, there is one simple thing you can do, which is to use comments:
    module({
      fun <- function(x) {
        ## A function for illustrating documentation
        ## x (numeric) some values
        x
      }
    })
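To look at such comments later, one option is to keep a handle to the module and print it. The sketch below assumes that the print method for modules displays each function's signature together with its leading ## comments, and that R keeps the source of functions (the keep.source option, which defaults to TRUE only in interactive sessions). Both are assumptions worth checking in your own setup.

```r
library(modules)

m <- module({
  fun <- function(x) {
    ## A function for illustrating documentation
    ## x (numeric) some values
    x
  }
})

# Printing the module is assumed to show the signature and the ## comments.
print(m)

# The function itself behaves as usual.
m$fun(3)
```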
Avoid library(), attach() or source() inside of modules. It is likely
that they do not do what you want. import() and use() are to be preferred in
this context.
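As a small illustration of this advice, the sketch below declares a dependency with import() where one might otherwise have reached for library(). The function isRScript is hypothetical; file_ext comes from the tools package that ships with R. The last line compares the session's search path before and after, on the understanding that import() makes names available inside the module only.

```r
before <- search()

m <- modules::module({
  # Instead of library(tools): declare the dependency for this module only.
  import("tools", "file_ext")
  isRScript <- function(path) tolower(file_ext(path)) == "r"
})

m$isRScript("R/munging.R")

# Compare the session's search path before and after creating the module.
identical(before, search())
```

In the same spirit, a helper script would be loaded with use() inside the module rather than with source(), so that its contents stay behind a handle instead of being copied into the current scope.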