```{r}
#| label: init
#| include: false

knitr::opts_knit$set(root.dir = getwd())
```
# `r pal::desc_get_field_safe("Package")`
```{r}
#| label: pkg-desc
#| child: !expr pkgsnip::snip_path("pkg-desc.Rmd")
```
The basic idea behind the concept this package implements originates from Yihui Xie. See his blog post Write An R Package Using Literate Programming Techniques for more details; it's definitely worth reading. This package's function `pkgpurl::purl_rmd()` is just a less cumbersome alternative to the Makefile approach he outlines.
The R Markdown format provides several advantages over the bare R source format when developing an R package:
It allows the actual code to be freely mixed with explanatory and supplementary information in expressive Markdown format instead of having to rely on `#` comments only. In general, this should encourage you to actually record code-accompanying information because you're able to use the full spectrum of Pandoc's Markdown syntax like inline formatting, lists, tables, quotations or math[^1].
It is especially powerful in combination with the Visual R Markdown feature introduced in RStudio 1.4, which -- in addition to the visual editor -- offers a feature whose utility can hardly be overestimated: Pandoc Markdown canonicalization (on file save[^2]). For example, it allows paragraphs to be wrapped automatically at the desired line width, or lets you write a minimal, sloppy pipe table that is automatically normalized to a nicely formatted and actually readable one.
The relevant editor options which adjust the canonical Markdown generation can either be set per `.Rmd` file, e.g.

```yaml
editor_options:
  markdown:
    wrap: 160
    references:
      location: section
    canonical: true
```

or per project in the usual `PACKAGE_NAME.Rproj` file, e.g.

```ini
MarkdownWrap: Column
MarkdownWrapAtColumn: 160
MarkdownReferences: Section
MarkdownCanonical: Yes
```

(I'd recommend setting them per project, so they apply to the whole package including any `.Rmd` vignettes.)
The traditional recommendation for keeping an overview of your package's R source code is to split it across multiple files. The popular (and very useful) book R Packages gives the following advice:

> If it's very hard to predict which file a function lives in, that suggests it's time to separate your functions into more files or reconsider how you are naming your functions and/or files.

I think this is just ridiculous.
Instead, I encourage you to keep all your code (as far as possible) in a single file `Rmd/PACKAGE_NAME.Rmd` and structure it according to the rules described here, which even allows the pkgdown `Reference:` index to be automatically in sync with the source code structure. As a result, you re-organize (and thus most likely improve) your package's code structure whenever you intend to improve the pkgdown reference -- and vice versa. For a basic example, see this very package's main source file.
Keeping all code in a single file frees you from the traditional hassle of finding a viable (but in the end still unsatisfactory) way to organize your R source code across multiple files. Of course, there are still good reasons to outsource code into separate files in certain situations, which nothing is stopping you from doing. You can also exclude whole `.Rmd` files from purling using the `.nopurl.Rmd` filename suffix.
You can rely on RStudio's code outline to easily navigate through longer `.Rmd` files. IMHO it provides significantly better usability than the code section standard of `.R` files. It makes it easy to find your way around source files that are thousands of lines long.
RStudio's Go to File/Function shortcut works the same for `.Rmd` files as it does for `.R` files.
If you use RStudio or any other editor with proper R Markdown syntax highlighting, you will probably like the gained visual clarity for distinguishing individual functions/code parts (by putting them in separate R code chunks). This also facilitates creating a meaningful document structure (in Markdown) alongside the actual R source code.
You can put development-only code which never lands in the generated R source files (and thus the R package) in separate code chunks with the chunk option `purl = FALSE`. This turns out to be very convenient in certain situations.
For example, this is a good way to reproducibly document the generation of cleaned versions of exported data as well as internal data. This avoids having to outsource the code to separate files under `data-raw/` and adding the directory to `.Rbuildignore`, i.e. no need to use `usethis::use_data_raw()`. Instead, you just set `purl = FALSE` for the relevant code chunk(s). You can (and should) still use `usethis::use_data()` (optionally with `overwrite = TRUE`) to generate the files under `data/` holding external package data as well as the `R/sysdata.rda` file (using `internal = TRUE`) holding internal package data.
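The effect of `purl = FALSE` can be illustrated with a minimal sketch that calls `knitr::purl()` directly on an in-memory document. This is an assumption made for illustration only: `pkgpurl::purl_rmd()` may process files differently, and the chunk contents (`my_fun`, `raw_data`) are hypothetical.

```r
library(knitr)

# Build the chunk fence programmatically so this example nests cleanly in Markdown.
fence <- strrep("`", 3)

# A tiny in-memory R Markdown document with two chunks (hypothetical content).
rmd <- c(
  "# my_fun",
  "",
  paste0(fence, "{r}"),
  "my_fun <- function(x) x + 1",
  fence,
  "",
  paste0(fence, "{r, purl = FALSE}"),
  "# development-only code, e.g. preparing package data",
  "raw_data <- data.frame(x = 1:3)",
  fence
)

# Extract the R code as a string; documentation = 0 drops chunk headers and prose.
code <- knitr::purl(text = rmd, documentation = 0, quiet = TRUE)

# Only the first chunk should show up in the extracted code.
cat(code, sep = "\n")
```

The chunk marked `purl = FALSE` stays in the `.Rmd` source as documentation of how the data was produced, but does not reach the extracted R code.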
If you use styler to auto-format your code globally by setting `knitr::opts_chunk$set(tidy = "styler")`, you can still opt out on a per-chunk basis by setting `tidy = FALSE`. This gives pleasant flexibility.
Unfortunately, there are also a few notable drawbacks of the R Markdown format:
The pkgpurl approach to writing R packages in the R Markdown format introduces one additional step at the very beginning of typical package development workflows: running `pkgpurl::purl_rmd()` to generate the `R/*.gen.R` files from the original `Rmd/*.Rmd` sources before documenting/checking/testing/building the package. Given sufficient user demand, this could probably be integrated into devtools' functions in the future, so that no additional action has to be taken by the user when relying on RStudio's built-in package building infrastructure.
For the time being, it's recommended to set up a custom shortcut[^3] for one or both of `pkgpurl::purl_rmd()` and `pkgpurl::process_pkg()`, which are registered as RStudio add-ins.
Setting up a new project to write an R package in the R Markdown format differs slightly from the classic approach. A suitable convenience function like `create_rmd_package()` to set up all the necessary parts could probably be added to usethis in the future.
For the time being, you can use my ready-to-go R Markdown Package Development Template as a starting point for creating new R packages in the R Markdown format.
Debugging can be a bit more laborious since line numbers in warning and error messages always refer to the generated `R/*.gen.R` file(s), not the underlying `Rmd/*.Rmd` source code file(s). If need be, you first have to look up the line numbers in the `R/*.gen.R` file(s) to understand which function / code parts cause the issue in order to know where to fix it in the `Rmd/*.Rmd` source(s).
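One way to shorten this lookup is a small helper that maps a line number in a generated file to the enclosing top-level definition. The following is a minimal sketch using only base R; `find_top_level_name()` is a hypothetical helper, not part of pkgpurl, and it assumes the generated file parses cleanly.

```r
# Hypothetical helper (not part of pkgpurl): return the name assigned by the
# top-level expression that spans `line` in the R file at `path`.
find_top_level_name <- function(path, line) {
  exprs <- parse(file = path, keep.source = TRUE)
  refs <- attr(exprs, "srcref")

  for (i in seq_along(exprs)) {
    ref <- as.integer(refs[[i]])
    # Elements 1 and 3 of a srcref hold the first and last line of the expression.
    if (line >= ref[1] && line <= ref[3]) {
      expr <- exprs[[i]]
      is_assignment <- is.call(expr) &&
        (identical(expr[[1]], as.name("<-")) || identical(expr[[1]], as.name("=")))
      if (is_assignment && is.name(expr[[2]])) {
        return(as.character(expr[[2]]))
      }
      # The line belongs to a top-level expression that is not a plain assignment.
      return(NA_character_)
    }
  }

  # No top-level expression spans the line (e.g. a blank line or a comment).
  NA_character_
}

# Usage with a stand-in for a generated R/*.gen.R file:
tmp <- tempfile(fileext = ".R")
writeLines(
  c(
    "foo <- function() {",
    "  1",
    "}",
    "",
    "bar <- function() {",
    "  stop('oops')",
    "}"
  ),
  tmp
)

culprit <- find_top_level_name(tmp, 6L)
print(culprit)
```

With the function name in hand, you can jump to the corresponding definition in the `Rmd/*.Rmd` source via the code outline or Go to File/Function.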
Unlike in `.R` files, RStudio currently doesn't support auto-completion of roxygen2 tags in `.Rmd` files, and its Reflow Comment command doesn't work properly on them. These are known issues which will hopefully be resolved in the near future.
```{r}
#| label: pkg-doc
#| eval: !expr '!isTRUE(getOption("pal.build_readme.is_pkgdown"))'
#| results: asis
#| echo: false

pkgsnip::md_snip(id = "pkgdown_site") %>%
  paste0("## Documentation\n\n",
         "[]",
         "(https://app.netlify.com/sites/pkgpurl-rpkg-dev/deploys)\n\n",
         .) %>%
  pal::cat_lines()
```
[^1]: Actually, you could write anything you like in any syntax outside of R code chunks as long as you don't need the file to be knittable (which it doesn't have to be).
[^2]: It basically sends the (R) Markdown file on a "Pandoc round trip" on every file save.
[^3]: I personally recommend using the shortcut Ctrl+Shift+V since it's not occupied yet by any of the predefined RStudio shortcuts.
```{r}
#| label: pkg-instl-dev
#| child: !expr pkgsnip::snip_path("pkg-instl-dev-gitlab.Rmd")
```

```{r}
#| label: pkg-usage
#| eval: !expr isTRUE(getOption("pal.build_readme.is_pkgdown"))
#| results: asis
#| echo: false

pkgsnip::md_snip(id = "pkg_usage") %>%
  paste0("## Usage\n\n", .) %>%
  pal::cat_lines()
```

```{r}
#| label: pkg-config
#| child: !expr pkgsnip::snip_path("funky-config.Rmd")
```

```{r}
#| label: pkgpurl
#| child: !expr pkgsnip::snip_path("pkgpurl.Rmd")
```

```{r}
#| label: pkg-code-style
#| child: !expr pkgsnip::snip_path("pkg-code-style.Rmd")
```