biocthis developer notes

```r
knitr::opts_chunk$set(
    collapse = TRUE,
    comment = "#>",
    crop = NULL ## Related to https://stat.ethz.ch/pipermail/bioc-devel/2020-April/016656.html
)
## Track time spent on making the vignette
startTime <- Sys.time()

## Bib setup
library("RefManageR")

## Write bibliography information
bib <- c(
    R = citation(),
    BiocStyle = citation("BiocStyle")[1],
    biocthis = citation("biocthis")[1],
    covr = citation("covr")[1],
    devtools = citation("devtools")[1],
    fs = citation("fs")[1],
    glue = citation("glue")[1],
    knitr = citation("knitr")[1],
    pkgdown = citation("pkgdown")[1],
    rlang = citation("rlang")[1],
    RefManageR = citation("RefManageR")[1],
    rmarkdown = citation("rmarkdown")[1],
    sessioninfo = citation("sessioninfo")[1],
    styler = citation("styler")[1],
    testthat = citation("testthat")[1],
    usethis = citation("usethis")[1]
)
```

Note that `r Biocpkg("biocthis")` is not a Bioconductor-core package and as such it is not an official Bioconductor package. It was made by and for Leonardo Collado-Torres so he could more easily maintain and create Bioconductor packages as listed at lcolladotor.github.io/pkgs/. Hopefully `r Biocpkg("biocthis")` will be helpful for you too.

# Basics

For the basics, please check the Introduction to biocthis vignette.

# biocthis developer notes

## Backstory

In 2019, I was able to take the "Building Tidy Tools" workshop taught by Charlotte and Hadley Wickham during rstudio::conf(2019) thanks to a diversity scholarship. During this workshop, I learned about `r CRANpkg("usethis")` `r Citep(bib[["usethis"]])`, `r CRANpkg("devtools")` `r Citep(bib[["devtools"]])`, `r CRANpkg("testthat")` `r Citep(bib[["testthat"]])`, among other R packages, and how to use RStudio Desktop to create R packages more efficiently. I got to revise this material and practice it more for the CDSB Workshop 2019: How to Build and Create Tidy Tools where we re-used the materials (with their permission) and translated them to Spanish. Over the years I have made several Bioconductor R packages that I maintain. Yet I learned a lot thanks to Charlotte and Hadley and have been relying more and more on `r CRANpkg("usethis")` and related packages.

Earlier this year (2020) one of my Bioconductor packages (`r Biocpkg("regionReport")`) was presenting some errors on some operating systems but not on others. I first spent quite a bit of time setting up the corresponding R installation on my non-work Windows computer. I still struggled to reproduce the error, so I finally learned how to use the Bioconductor docker images. That is, run the following code to get an environment with all the system dependencies for Bioconductor packages already installed. In this system you can then install your package dependencies and get very close to the environment of the Linux machine used for testing Bioconductor packages.

```{bash, eval = FALSE}
docker run \
    -e PASSWORD=bioc \
    -p 8787:8787 \
    bioconductor/bioconductor_docker:devel
```

Using this docker image, I was finally [able to reproduce the error](https://stat.ethz.ch/pipermail/bioc-devel/2020-April/016532.html), which involved other Bioconductor packages. However, there was a second hard-to-reproduce error. Using [GitHub Actions](https://github.com/features/actions), which I'll talk about more soon, I was then able to find the [root cause of this second issue](https://stat.ethz.ch/pipermail/bioc-devel/2020-April/016650.html) and resolve it.
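For reference, installing a package's dependencies inside a container started from that docker image could be sketched as follows. This is only a minimal sketch: `install_pkg_deps()` is a hypothetical helper (it is not part of `biocthis`), the path in the final comment is a placeholder, and it assumes you are working from an R session inside the container with the package source code available there.

```r
## Hypothetical helper, not part of biocthis: install every dependency listed
## in the DESCRIPTION file found in `pkg_dir`, using Bioconductor-aware repos.
install_pkg_deps <- function(pkg_dir = ".") {
    ## Install the two helper packages from CRAN if they are missing
    for (helper in c("BiocManager", "remotes")) {
        if (!requireNamespace(helper, quietly = TRUE)) {
            install.packages(helper)
        }
    }
    ## BiocManager::repositories() lists the CRAN and Bioconductor
    ## repositories that match the R version in use
    remotes::install_deps(
        pkg_dir,
        dependencies = TRUE,
        repos = BiocManager::repositories()
    )
}

## Example (not run here; the path is a placeholder):
## install_pkg_deps("path/to/regionReport")
```

Passing `dependencies = TRUE` also installs the suggested packages, which is usually what you want when the goal is to reproduce a full `R CMD check`.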

`r Biocpkg("biocthis")` `r Citep(bib[['biocthis']])` was born from my interest in continuing to use `r CRANpkg("usethis")` and related tools, but in a Bioconductor-friendly way. That is, this is a package that will help me (and maybe others too). It grew out of these 5 issues:

* _Bioconductor-friendly R CMD check action feature suggestion_ [r-lib/actions#84](https://github.com/r-lib/actions/issues/84)
* _Bioc-friendly feature suggestions_ [r-lib/usethis#1108](https://github.com/r-lib/usethis/issues/1108)
* _Bioc-friendly style feature suggestion_ [r-lib/styler#636](https://github.com/r-lib/styler/issues/636)
* _Recommend `styler` over `formatR` suggestion_ [Bioconductor/BiocCheck#57](https://github.com/Bioconductor/BiocCheck/issues/57)
* _GitHub actions and styler suggestions_ [Bioconductor/bioconductor.org#54](https://github.com/Bioconductor/bioconductor.org/issues/54)


## Styling code

`r Biocpkg("BiocCheck")` is run on all new Bioconductor package submissions and by default it checks whether the new package adheres to the [Bioconductor coding style guide](http://bioconductor.org/developers/how-to/coding-style/). For a long time, it has suggested `r CRANpkg("formatR")` as a solution for automatically styling code in an R package. While `r CRANpkg("formatR")` mostly works and I've used it before, I recently discovered `r CRANpkg("styler")` which can be used for styling code to fit the [tidyverse coding style guide](https://style.tidyverse.org/). On my own packages, I have found `r CRANpkg("styler")` to be superior to `r CRANpkg("formatR")` because it: 

* breaks less code,
* can format `r CRANpkg("roxygen2")` example code,
* and can re-format R Markdown files like vignettes.
* Plus it seems to me to be under active maintenance, which is always a good thing.

Several of the issues I made are related to using `r CRANpkg("styler")` to automatically re-format your code to match more closely the Bioconductor coding style guide. That is how `bioc_style()` was born and it was the suggested approach as discussed at _Bioc-friendly style feature suggestion_ [r-lib/styler#636](https://github.com/r-lib/styler/issues/636). The maintainer of `r CRANpkg("styler")`, [Lorenz Walthert](https://twitter.com/lorenzwalthert), has a great reply on that issue linking to a more detailed discussion on how to expand `r CRANpkg("styler")` if the job requires doing so.
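As a minimal sketch of how this looks in practice, the code below writes a deliberately messy throwaway script to a temporary file and re-styles it with `bioc_style()` as the set of `styler` transformers. The file and its contents are made up for illustration, and the styling step only runs when both `styler` and `biocthis` are installed.

```r
## Write a deliberately messy, throwaway script (hypothetical example file)
messy_file <- tempfile(fileext = ".R")
writeLines("f=function(x){\n  if(x>1) x+1 else x\n}", messy_file)

## Only style the file when both packages are available
if (requireNamespace("styler", quietly = TRUE) &&
    requireNamespace("biocthis", quietly = TRUE)) {
    styler::style_file(messy_file, transformers = biocthis::bioc_style())
}

## Inspect the (possibly re-styled) code
cat(readLines(messy_file), sep = "\n")

## For a whole package you would instead run, from the package root:
## styler::style_pkg(transformers = biocthis::bioc_style())
```

Styling a temporary file first is a low-risk way to see what the transformers change before letting them rewrite the files in a real package.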

Currently, `bioc_style()` does not fully replicate the Bioconductor coding style, but it gets close enough. As [Martin Morgan](https://twitter.com/mt_morgan) said at _Recommend `styler` over `formatR` suggestion_ [Bioconductor/BiocCheck#57](https://github.com/Bioconductor/BiocCheck/issues/57), a solution that gets 90% of the way is good enough. `bioc_style()` is a very short function, mostly because the Bioconductor and Tidyverse coding style guides are overall very similar. This function won't solve all the formatting issues detected by `r Biocpkg("BiocCheck")`, but if you really want to, you can disable the formatting checks with:

```r
## Use the following for the latest options
BiocCheck::usage()
## Disable formatting checks
BiocCheck::BiocCheck(`--no-check-formatting` = TRUE)
```

## GitHub Actions

### Motivation

I have been using Travis CI for several years now to help me run R CMD check every time I make a commit and push it to GitHub. Travis CI has mostly worked well for me, though I frequently had to maneuver around the 50 minute limit. I also recently ran into a problem where Hadley Wickham replied "We now recommend using the github actions workflow instead; which avoids all this configuration pain". I also ran into a problem that didn't always happen in Travis CI but that was potentially related to the computational resources provided (memory). I heard the term GitHub Actions at rstudio::conf(2020) but I ended up missing Jim Hester's talk, which you can watch online: I highly recommend it and wish I had started my adventure into GitHub Actions with it. Briefly, GitHub Actions allows you to run checks on Windows, macOS or Linux for up to 6 hours on machines with 7 GB of RAM. That's two more operating systems than what I was using with Travis CI, significantly more time, and a decent chunk of memory.

These 3 operating systems matter to me because Bioconductor runs nightly checks on those 3 platforms. It's a great way to know if your Bioconductor R package will work for most users. However, you only get one report per day. If, like me, you are not the most organized person and have to fix your code before a release, then you don't have as many days to check your R package(s) and need more frequent feedback. So I've been looking for a way to run checks on all three platforms on demand. Bioconductor has a Single Package Builder which does this, but it is restricted to new package submissions.

I know that there's AppVeyor for running checks on Windows, but I never used it. Travis CI does support macOS and Linux. In the past, I have used `r Githubpkg("r-hub/rhub")` and I was able to run tests on a package using a combination of Travis CI and rhub as detailed at r-hub/rhub/issues#52. rhub maintainers have also taken steps to support Bioconductor's release cycle as described at r-hub/rhub/issues#38. Regardless of the platform, it would ultimately be nice to have a single configuration file that you (the package developer) don't need to update for every Bioconductor release cycle.

### Developing a Bioconductor-friendly GHA workflow

I saw on Twitter the announcement about GitHub Actions in `r CRANpkg("usethis")` and that is when I started to look more into `r CRANpkg("usethis")` and `r Githubpkg("r-lib/actions")` by Jim Hester, particularly r-lib/actions/examples. As usual, I tried to just get it to work and then had to look more closely at the documentation and the code. Naively, I thought that I could make r-lib/actions/examples/check-standard.yaml Bioconductor-friendly, which Jim Hester immediately recognized as a complicated task. As you can see at Bioconductor-friendly R CMD check action feature suggestion r-lib/actions#84 this took a while. When working on this, I also looked at several other resources and real world examples:

Most of the development of the Bioconductor-friendly GitHub Actions workflow provided by `r Biocpkg("biocthis")` was done with leekgroup/derfinderPlot/.github/workflows/check-bioc.yml and LieberInstitute/recount3/.github/workflows/check-bioc.yml as detailed at: Bioconductor-friendly R CMD check action feature suggestion r-lib/actions#84. It was then further improved by a pull request with tests carried out at lcolladotor/testmatrix.

This work eventually led to `use_bioc_github_action()` as it is today. The features of this GHA workflow are described in the Introduction to biocthis vignette. While developing this GHA workflow, I ran into several issues and I wouldn't be surprised if we run into more of them later on.

### Potential future additions

### Wrapping up

The resulting Bioconductor-friendly GitHub Actions workflow that you can add to your package with `biocthis::use_bioc_github_action()` has many comments which you might find helpful for understanding why some steps are done the way they are. I have tried to simplify the workflow when possible, but it depends on the latest version of many tools and thus will expose you to issues you might not have dealt with, particularly compilation issues of R packages with R-devel (six months of the year with the current Bioconductor release cycle). If you need help, start by going through the steps listed at r-lib/actions#where-to-find-help. `r Biocpkg("biocthis")` exclusive issues are always welcome, though please include the information that will enable others to help you faster. Thank you!
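Adding the workflow to a package can be sketched as below. This is a hedged example rather than a full recipe: it assumes that the working directory is the root of your package (detected here by the presence of a `DESCRIPTION` file, which is my own simplification) and that `biocthis` is installed. The call is made with its default arguments.

```r
## Assumption: a DESCRIPTION file in the working directory means we are at
## the root of an R package. This check is a simplification for the example.
in_pkg_root <- file.exists("DESCRIPTION")

if (in_pkg_root && requireNamespace("biocthis", quietly = TRUE)) {
    ## Adds the Bioconductor-friendly GitHub Actions workflow file
    ## under .github/workflows/ in the package
    biocthis::use_bioc_github_action()
} else {
    message("Skipping: run this from a package root with biocthis installed.")
}
```

After running it, open the new workflow file and read through its comments before committing and pushing it to GitHub.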

## usethis-like functions

`r Biocpkg("biocthis")` also provides other `r CRANpkg("usethis")`-like functions. To make these functions, I looked at the code inside `r CRANpkg("usethis")` and learned how to make templates, how the data is passed to the templates and some other steps. Some of the functions are really identical to the ones from `r CRANpkg("usethis")` but point to a custom template provided by `r Biocpkg("biocthis")`. These functions have simplified for me the task of having uniform README.Rmd/md and vignette files for instance, as well as having GitHub issue & support templates that include some Bioconductor-specific information and some of my own personal preferences for asking for help. I also included template R scripts through `use_bioc_pkg_templates()`, an idea I first learned about at rstudio::conf(2020) from the `r CRANpkg("golem")` package. Those scripts are useful to keep track of code that you had to run to make the R package or to update it later. These scripts can greatly jump-start your R/Bioconductor package creation process. So maybe you'll see more packages by me and others soon =) In particular, I really hope that we can get more CDSB members to submit R/Bioconductor packages to the world as explained in this story, which is something I care about quite a bit.
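A short sketch of how a few of these functions might be called together is shown below. It assumes that `biocthis` is installed and that the working directory is the root of the package being set up (again detected through a `DESCRIPTION` file, which is a simplification for this example). Only a subset of the functions is shown and all of them are called with their defaults.

```r
## Assumptions: biocthis is installed and the working directory is the
## root of the package being set up.
has_biocthis <- requireNamespace("biocthis", quietly = TRUE)

if (has_biocthis && file.exists("DESCRIPTION")) {
    ## Template R scripts that keep track of how the package was made
    biocthis::use_bioc_pkg_templates()
    ## README.Rmd based on the biocthis template
    biocthis::use_bioc_readme_rmd()
    ## GitHub issue and support templates with Bioconductor-specific details
    biocthis::use_bioc_issue_template()
    biocthis::use_bioc_support()
}
```

Each call creates or updates files in the package, so it is worth reviewing the changes with `git diff` before committing them.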

# Acknowledgments

I just want to thank everyone for helping me understand different pieces of code, for producing the tools I used, for interacting with me across many GitHub issues, as well as answering questions on multiple mailing lists. The names below are in the order they appear in this vignette:

as well as several organizations and members:

Thank you very much! 🙌🏽😊

# Reproducibility

The `r Biocpkg("biocthis")` package `r Citep(bib[['biocthis']])` was made possible thanks to:

This package was developed using `r BiocStyle::Githubpkg('lcolladotor/biocthis')`.

Code for creating the vignette

```r
## Create the vignette
library("rmarkdown")
system.time(render("biocthis_dev_notes.Rmd", "BiocStyle::html_document"))

## Extract the R code
library("knitr")
knit("biocthis_dev_notes.Rmd", tangle = TRUE)
```

Date the vignette was generated.

```r
## Date the vignette was generated
Sys.time()
```

Wallclock time spent generating the vignette.

```r
## Processing time in seconds
totalTime <- diff(c(startTime, Sys.time()))
round(totalTime, digits = 3)
```

R session information.

```r
## Session info
library("sessioninfo")
options(width = 120)
session_info()
```

# Bibliography

This vignette was generated using `r Biocpkg('BiocStyle')` `r Citep(bib[['BiocStyle']])` with `r CRANpkg('knitr')` `r Citep(bib[['knitr']])` and `r CRANpkg('rmarkdown')` `r Citep(bib[['rmarkdown']])` running behind the scenes.

Citations made with `r CRANpkg('RefManageR')` `r Citep(bib[['RefManageR']])`.

```r
## Print bibliography
PrintBibliography(bib, .opts = list(hyperlink = "to.doc", style = "html"))
```



biocthis documentation built on Feb. 28, 2021, 2:02 a.m.