datamake

The purpose of datamake is to automatically create Makefiles for data analysis projects in R. It assumes a linear pipeline where scripts take data as input and return data as output.

Get it from GitHub:

devtools::install_github("jchrom/datamake")

It relies on the base R parser and Kirill Müller's MakefileR package.

Usage

To make a Makefile in your working directory, call:

datamake::make_file()

Only files specified by data reading/writing functions such as load, read.csv, save, saveRDS, etc. are included in the Makefile as targets or prerequisites. These functions must be called with the file argument matched by name, and its value must be a character string literal, like so:

# Good
dat = readRDS(file = "myfile.rds")
write.csv(dat, file = "myfile.csv")

# Bad - won't appear in the Makefile
dat = readRDS("myfile.rds")
write.csv(dat, "myfile.csv")

# Also bad - won't appear in the Makefile
myfile.name = "myfile.csv"
write.csv(dat, file = myfile.name)

If some files appear to be missing from the Makefile, check for file arguments matched by position instead of by name, or supplied as objects rather than character strings.
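The rule above can be checked with a few lines of base R. The following is a minimal sketch, not datamake's actual implementation: find_file_args and the list of function names are hypothetical and exist only for illustration. It parses code, walks the expression tree, and keeps only string literals passed as a named file argument.

```r
# Hypothetical helper, not part of datamake: collect string literals that are
# passed as a named `file` argument to a given set of I/O functions.
find_file_args <- function(code,
                           funs = c("load", "readRDS", "read.csv",
                                    "save", "saveRDS", "write.csv")) {
  found <- character(0)

  walk <- function(x) {
    if (!is.call(x)) return(invisible(NULL))
    fun <- x[[1]]
    if (is.name(fun) && as.character(fun) %in% funs) {
      # Exact name lookup; NULL when no argument is literally named `file`.
      if (is.character(as.list(x)[["file"]])) {
        found <<- c(found, as.list(x)[["file"]])
      }
    }
    # Recurse into nested calls only.
    for (i in seq_along(x)) {
      if (is.call(x[[i]])) walk(x[[i]])
    }
    invisible(NULL)
  }

  for (expr in parse(text = code)) walk(expr)
  found
}

example_code <- '
dat = readRDS(file = "raw.rds")      # named string literal: detected
write.csv(dat, file = "clean.csv")   # named string literal: detected
other = readRDS("other.rds")         # matched by position: skipped
out = "late.csv"
write.csv(dat, file = out)           # value is an object: skipped
'

detected <- find_file_args(example_code)
# detected is c("raw.rds", "clean.csv")
```

Running a check like this on a script is a quick way to see which calls would be skipped under the naming rule.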

How

datamake parses the code in your working directory and searches for input/output calls. It then extracts the value of each file argument, which typically contains the path to the file being read or written. The resulting file list is split into prerequisites (files read) and targets (files written) for the Makefile.
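To make the last step concrete, here is a rough sketch of how one script's inputs and outputs could be turned into a Make rule. It uses base R string handling only; make_rule_text and its arguments are hypothetical names, and the exact rules datamake writes may differ.

```r
# Hypothetical helper, not datamake's implementation: format one Make rule.
# The script is placed first among the prerequisites so that "$<" in the
# recipe expands to the script path.
make_rule_text <- function(script, inputs, output,
                           recipe = "Rscript \"$<\"") {
  prereqs <- paste(c(script, inputs), collapse = " ")
  paste0(output, ": ", prereqs, "\n\t", recipe)
}

rule <- make_rule_text(
  script = "clean.R",     # assumed script name
  inputs = "raw.rds",     # file the script reads
  output = "clean.csv"    # file the script writes
)

cat(rule, "\n")
```

The recipe line starts with a tab, as Make requires.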

When parsing Rmd files, the output file name is added to the target list through a special call, write.itself. This ensures that the report is re-rendered whenever the source Rmd file (or any of its prerequisites) changes.

You can change which functions datamake considers when searching for input/output calls, as well as the name of the argument (though file seems like a sensible default).

You can also change which file types are scanned for code and how they are run on the command line. Do this by providing a named list that maps each file type to a command string, e.g.:

script = list(R = "Rscript \"$<\"")
# "$<" is Make's automatic variable for the first prerequisite, and the
# script that contains the code is always listed as the first prerequisite.
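As an illustration of how such a list might be used, the sketch below looks up the command for a script by its file extension. The Rmd entry and the command_for helper are assumptions made for this example, not datamake defaults.

```r
# Assumed mapping from file extension to shell command. The Rmd entry is an
# illustration only, not a datamake default.
commands <- list(
  R   = "Rscript \"$<\"",
  Rmd = "Rscript -e \"rmarkdown::render('$<')\""
)

# Hypothetical lookup: return the command for a script based on its
# extension, or NA when the file type is not in the list.
command_for <- function(path, commands) {
  ext <- tools::file_ext(path)
  if (!ext %in% names(commands)) return(NA_character_)
  commands[[ext]]
}

r_cmd   <- command_for("clean.R", commands)
rmd_cmd <- command_for("report.Rmd", commands)
txt_cmd <- command_for("notes.txt", commands)   # not a scanned file type
```

Files whose extension is not in the list simply get no command, which mirrors the idea that only the listed file types are scanned.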

Why

Because life is too short to write Makefiles.



jchrom/datamake documentation built on May 18, 2019, 10:23 p.m.