knitr::opts_chunk$set(echo = TRUE, results = 'asis')
r6obj_docstat <- rmddochelper::R6ClassDocuStatus$new()
# r6obj_docstat$setProject(psProject = "rmddochelper")
#r6obj_docstat$setVersion(psVersion = "0.0.901")
#r6obj_docstat$setStatus(psStatus = "Initialisation")
# r6obj_docstat$setVersion(psVersion = "0.0.902")
# r6obj_docstat$setStatus(psStatus = "Extended Introduction")
# r6obj_docstat$setVersion(psVersion = "0.0.903")
# r6obj_docstat$setStatus(psStatus = "Describe replacement function")
# r6obj_docstat$setVersion(psVersion = "0.0.904")
# r6obj_docstat$setStatus(psStatus = "Description of document status")
# r6obj_docstat$setVersion(psVersion = "0.0.905")
# r6obj_docstat$setStatus(psStatus = "Minor improvements on structure and description")
# r6obj_docstat$setVersion(psVersion = "0.0.906")
# r6obj_docstat$setStatus(psStatus = "Table of abbreviation started")
#r6obj_docstat$setVersion(psVersion = "0.0.907")
#r6obj_docstat$setStatus(psStatus = "Including graphics")
r6obj_docstat$set_current_status(psVersion = "0.0.908",
                                 psStatus  = "New version of setting document status",
                                 psProject = "rmddochelper")
r6obj_docstat$set_current_status(psVersion = "0.0.909",
                                 psStatus  = "New version of multi-add document status",
                                 psProject = "rmddochelper")
r6obj_docstat$set_current_status(psVersion = "0.0.910",
                                 psStatus  = "New version of update document status",
                                 psProject = "rmddochelper")
r6obj_docstat$set_current_status(psVersion = "0.0.911",
                                 psStatus  = "New autoincrement of version numbers",
                                 psProject = "rmddochelper")
r6obj_docstat$set_current_status(psVersion = "0.0.912",
                                 psStatus  = "Removed old private fields",
                                 psProject = "rmddochelper")
r6obj_docstat$set_current_status(psVersion = "0.0.913",
                                 psStatus  = "Specify package options",
                                 psProject = "rmddochelper")
r6obj_docstat$set_current_status(psVersion = "0.0.914",
                                 psStatus  = "Active bindings in R6",
                                 psProject = "rmddochelper")
r6obj_docstat$set_current_status(psVersion = "0.0.915",
                                 psStatus  = "Test temporary document cleanup",
                                 psProject = "rmddochelper")


r6obj_docstat$include_doc_stat(psTitle = "## Document Status")

\pagebreak

r6ob_abbrtable <- rmddochelper::R6ClassTableAbbrev$new()
r6ob_abbrtable$include_abbr_table(psAbbrTitle = "## Abbreviations")

Disclaimer

This document is a collection of ideas, plans and specifications that are important for developing the package rmddocuhelper.

Introduction

Creating documents is easy using tools like Rstudio, rmarkdown, knitr and pandoc. The source of such a document is written in a special version of markdown called rmarkdown. In addition to ordinary markdown, rmarkdown allows for direct inclusion of R-statements in the document markup source. The knitr package is used to convert the rmarkdown sources into ordinary markdown source file. During that conversion all R-statements are evaluated and evaluation results are collected and included into the markdown document.

The transformation of rmarkdown sources to markdown by knitr is analogous to what Sweave does with LaTeX source files. The advantage of using knitr and rmarkdown is that we have to use considerably less markup instructions compared to Sweave and LaTeX. On the other hand, if we really need the expressive power of LaTeX, we can include chunks of LaTeX statements which are automatically converted by the latter used conversion tool called pandoc.

The document conversion tool pandoc converts the produced markdown document into many different output formats, such as HTML or PDF. Although, pandoc is a very powerful conversion tool, there are limits when trying to convert LaTeX sources into output formats such as docx. One example of such a restriction is when graphics are included in markdown files using the LaTeX statement \includegraphics, those diagrams will not show up in a docx-formatted document that is produced by pandoc. In that case one has to stick to the markdown syntax for including graphics.

Objective

The above mentioned tool chain consisting of Rstudio, rmarkdown, knitr and pandoc is great for starting documents via the GUI provided by Rstudio. Whenever the number of documents to be created increases above a certain level and the structure of the created documents is very similar, a more scalable and more flexible approach for document creation is needed.

Scalability and flexibility can be increased by replacing the GUI by a function-based UI when creating new documents and by the very heavy usage of templates. That means, we provide simple functions that create documents based on templates. The document creation functions can be called with parameters that are used to replace placeholders in the templates.

Template based document creation

As an example, we assume the following template

docu_template <- "---
title: [REPLACE_WITH_TITLE]
author: [REPLACE_WITH_AUTHOR]
date: [REPLACE_WITH_DATE]
output: rmarkdown::html_vignette
---
## [REPLACE_WITH_SECTION]" 

We want to replace all the [REPLACE_WITH_*] tags with information specific for the document to be created. The replacement information will be provided in a list (repl_info) that has names that match the end of the tags.

repl_info <- list(title   = "My First Document",
                  author  = "Peter von Rohr",
                  date    = format(Sys.Date(), "%Y-%m-%d"),
                  section = "First Section")

#' Convert a tag to a replacement placeholder
#'
#' @param    psTag   tag to be converted to a placeholder
#' @return   replacement placeholder
get_repl_tag <- function(psTag){
  return(paste0("[REPLACE_WITH_", toupper(psTag), "]"))
}

#' Replacement of placeholder by replacement list values
#'
#' @param    ps_docu_template   document template string
#' @param    pl_repl_info       replacement info list
#' @return   s_docu_result      document result string
gsub_templ_pattern <- function(ps_docu_template, pl_repl_info){
  s_docu_result <- ps_docu_template
  for (sTag in names(pl_repl_info)){
    s_docu_result <- gsub(get_repl_tag(psTag = sTag), 
                          pl_repl_info[[sTag]], 
                          s_docu_result, 
                          fixed = TRUE)
  }
  return(s_docu_result)
}

The replacement of the placeholders is done in function gsub_templ_pattern(). That function takes two arguments

  1. the document template as a string and
  2. the association of placeholders to replacement values as a list.

The result corresponds to the initial version of the document where all placeholders are replaced by their corresponding values. Placeholders are replaced by a loop over the names of the replacement list. Each name is converted to a placeholder using the helper function get_repl_tag(). The function gsub() replaces the placeholders in the template string with the appropriate replacement values.

The result after the replacement is shown below.

### # output result of replacement
cat("```\n", gsub_templ_pattern(ps_docu_template = docu_template, pl_repl_info = repl_info), "\n```\n")

Document status

The status of a document is shown in a separate section that contains one table. That table contains information about the version of the document, the date, the author, the status and the project to which this document is associated to. This information is important mostly for internal use. Whenever a document is published or sent to external institutions, that information should most likely be hidden in the final output. Hence a useful feature is the possibility to easily be able to hide the whole section on the document status.

In this package rmddocuhelper, the document status is implemented in a R6 class called R6ClassDocuStatus. This class has fields to store the version, the data, the author, the status and the project. Apart from the getter and setter methods of the fields, there is only one public method that produces the table containing the document status. The resulting output of this method is shown at the beginning of this document.

Abbreviation table

Abbreviations are convenient to use for authors because it saves them writing long terms all the time. For readers abbreviations can be a problem, if their meaning is not explained in the text. One way of solving this problem is to put somewhere in the text a table that explains all abbreviations.

Setting up the table of abbreviations manually is tedious and error-prone. Hence we are looking for a way to generate the table of abbreviations automatically. Before the table of abbreviations can be generated, we have to collect all the abbreviations together with their meanings in the text. This information is used to construct a dataframe which is then shown as table of abbreviations. Because we want to be able to place the table of abbreviations in any position in the text, we have to collect the information about all abbreviations and their meanings in a separate scan over the source text. Since we want to hide implementation details from the user, we choose to use R6 classes and methods to implement this functionality.

In technical documents, there are often many r r6ob_abbrtable$add_abbrev(psAbbrev="TLA", psMeaning="Three Letter Acronyms") included. Most often authors assume that they are known, but not all readers are familiar with them. Hence it is always a good idea to include a r r6ob_abbrtable$add_abbrev(psAbbrev="ToA", psMeaning="Table of Abbreviations").

In this package rmddochelper the table of abbreviation is build automatically, but it requires that the documents are knit twice.

Including Graphics

Graphics can always be included using ordinary markdown or LaTeX syntax. But if graphics are converted from a different format into pdf or png, then it is more convenient to have a tool that does automatic conversion. This can be done using the function insertOdgAsPdf() or alternatively insertOdgAsPng(). These functions are just two wrappers to the same underlying function that takes as input the filename of a graphic in odg format and automatically converts it into a given output format. This conversion is done using LibreOffice which must be installed and available on the search path. For a given graphic file, it can be included using the following function call.

rmddochelper::insertOdgAsPng(psOdgFileStem = "RandomScreenShot2")

When knitting the rmarkdown file to pdf, it makes more sense to use the function insertOdgAsPdf().

In many documents, graphics to be inserted are based on screenshots or other images which are not directly produced by one of the R-graphics systems. Then it would be convenient to have the command to include the graphic directly inserted into the Rmarkdown source file. Screenshots are included in Rmarkdown source documents by pasting the screenshot into an empty LibreOffice graphics file which is opened by the function rmddochelper::create_odg_graphic(). The the file is saved. In the Rmarkdown source file, the graphics file is included using rmddochelper::insertOdgAsPng() or rmddochelper::insertOdgAsPdf().

In the current version, if we start by including a chunk with the label that is the same as the name of the graphics file, the creation function create_odg_graphic() of the odg-graphics includes the statement that inserts the graphic file directly into the Rmarkdown source file.

Package global options

In package knitr you have the possiblities to specify package global options. These options are stored in

knitr::opts_knit

which outputs a list with default options. They include setter- and getter-like methods.

Something similar can be done with RC or R6 classes where methods are called without parenthesis. The concept that allows for this is called active binding. The introductory vignette of the R6 package describes active binding. Active bindings are like fields but each time they are accessed, they call a function. They are always publically visible.

Numbers <- R6::R6Class("Numbers",
  public = list(
    x = 100
  ),
  active = list(
    x2 = function(value) {
      if (missing(value)) return(self$x * 2)
      else self$x <- value/2
    },
    rand = function() rnorm(1)
  )
)

n <- Numbers$new()
n$x

When an active binding is accessed, as if reading out a value, the function is called with the argument value treated as being missing.

n$x2

When a value is assigned, the assigned value is taken as an argument to the function

n$x2 <- 1000
n$x

If the function does not take any argument, the assignment is not possible.

n$rand

We want to explore active bindings of R6 classes to implement options that persist on different levels. A first and an important level of persistence is the level of a given document. That means we want to be able to specify a set of options that are persistent for a given document. This is comparable to what is usually done with knitr::opts_knit$set() at the beginning of each Rmarkdown source document.

Cleanup of temporary document compilation output

The function cleanup_output() was changed such that the data to be cleaned up is determined by the argument psDocuPath. The function no longer has an argument psDocuName that allowed for specification of the source document from which the temporary output was produced. The reason for dropping the argument psDocuName was that it was unclear how psDocuName and psDocuPath interacted. Now the behavior is much clearer. Potential data to be cleaned up is searched in the directory given by psDocuPath and is selected according to the patterns that are specified. The user is presented with a list of files and or directories that are found and asked whether the files should be deleted. If the answer is y then the files are deleted.

r6ob_abbrtable$writeToTsvFile()


charlotte-ngs/rmddochelper documentation built on June 27, 2019, 1:22 a.m.