knitr::opts_chunk$set( collapse = TRUE, comment = "##" )
This vignette is for R package developers who want to understand how data flows between the lesson source and the final website. I assume that the reader is familiar with R packaging and R environments.
One of the design philosophies of The Workbench is that if a lesson author or contributor would like to add any sort of metadata or modify a setting in any given lesson, they can do so by editing the source of the lesson with no need to modify their workflow.
The design of this is very much PDD (panic-driven design). I was okay with racking up technical debt because I knew that I could go back and refactor once I got it released. If I had a chance to go back and refactor, I would gladly do so. The creation of multiple storage objects using function factories was what I knew at the time. I now know that it might be better to use global environments instead. The implementation of this was originally done in a pull request to deduplicate code after the release of version 0.1.0, which was a massive push after finally getting the new website layout.
The flow of data I lay out here could all live in the same package-level environment (or even a package-level R6 object) instead of being implemented as function factories, but that's a refactor for another day and another maintainer.
Metadata is different from content because, while it is not directly related to the content, it has extra information that helps the users of the site navigate the content. For example, each episode page has three pieces of metadata embedded as a YAML list at the top of the source file that define the title along with the estimated time for teaching and exercises.
title: "Introduction"
teaching: 5
exercises: 5
In terms of the lesson itself, metadata that is related to the whole lesson is stored in config.yaml. This YAML file is designed to be as flat as possible to avoid common problems with writing YAML. Metadata here include things like the title of the lesson, the source page of the lesson, the time it was created, and the lesson program it belongs to.
title: "An Example Lesson"
carpentry: "incubator"
created: "2023-11-27"
life-cycle: "pre-alpha"
It also defines optional parameters such as the order of the episodes and other content that can be used to customise how the lesson is built.
episodes:
- introduction.md
- first-example.md
handout: true
The thing to know about this metadata and these variables is that they all get passed to {varnish}, where they control how the lesson website is built.
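As a rough illustration of how flat these settings are once parsed, here is a minimal sketch that reads the example keys above with the {yaml} package. The inline string stands in for a real config.yaml file, and this is only an illustration of the data shape, not the way {sandpaper} itself reads the file.

```r
library("yaml")

# Stand-in for the contents of a lesson's config.yaml (values from the examples above)
config_text <- '
title: "An Example Lesson"
carpentry: "incubator"
created: "2023-11-27"
life-cycle: "pre-alpha"
episodes:
- introduction.md
- first-example.md
handout: true
'

config <- yaml::yaml.load(config_text)

# Every setting is a top-level key, so access never goes more than one level deep
config$title
config[["life-cycle"]]
config$episodes
config$handout
```

Because the file is flat, a contributor only ever needs to add or change a single `key: value` line to change a setting.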
{varnish}
In order to understand how to pass data from config.yaml to {varnish}, it is important to first understand the paradigm of how {sandpaper} and {varnish} work together to produce a lesson website. Behind the scenes, we build markdown with {knitr}, render the raw HTML with Pandoc and then use {pkgdown} to insert the HTML and metadata into a template (written in the logicless templating language, Mustache) that can be updated and modified independently of {sandpaper}. This template is called {varnish}.
In the paradigm of {pkgdown}, the HTML template is split up into different components that are rendered separately and then combined into a single layout template. For example, here is the template that defines the layout:
writeLines("```html")
tplt <- system.file("pkgdown", "templates", "layout.html", package = "varnish")
writeLines(readLines(tplt))
writeLines("```")
You can see that it contains {{{ head }}}, {{{ header }}}, {{{ navbar }}}, {{{ content }}} and {{{ footer }}}. These are all the results of the other rendered templates. For example, here's the head.html template, which defines the content that goes in the HTML <head> tag (defining titles, metadata, stylesheets, and JavaScript):
writeLines("```html")
tplt <- system.file("pkgdown", "templates", "head.html", package = "varnish")
writeLines(readLines(tplt))
writeLines("```")
The templates used by {varnish} will take metadata from one of two sources:

- _pkgdown.yaml file that defines global metadata like language (see the pkgdown metadata section for details).
- pkgdown::render_page() function. These values can be both global and local.

These get inserted into the template via the pkgdown::render_page() function where the contents of the _pkgdown.yaml file are inserted into the data list as $yaml.
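To make the two sources concrete, here is a minimal sketch that uses {whisker} directly. The template string and the values in `data` are made up for illustration; only the idea that the _pkgdown.yaml contents sit under a `yaml` key next to the per-page values comes from the description above.

```r
library("whisker")

# Hypothetical template fragment; the real {varnish} templates are much larger
template <- "<title>{{ yaml.title }}: {{ pagetitle }}</title>\n<main>{{{ content }}}</main>"

# Hypothetical data list: `yaml` mimics values read from _pkgdown.yaml,
# the other entries mimic per-page values supplied at render time
data <- list(
  yaml = list(title = "An Example Lesson"),
  pagetitle = "Introduction",
  content = "<p>Hello</p>"
)

# Double braces escape HTML, triple braces insert it verbatim
html <- whisker::whisker.render(template, data = data)
writeLines(html)
```

The triple braces around `content` matter here: the page body is already HTML, so it must be inserted without escaping.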
The challenge for {sandpaper} is this: we have a list of per-lesson global variables that need to be allocated when build_lesson() is run, and per-file variables that need to be allocated before every call to build_html().
The solution that we've come up with is to store these lists in environments that are encapsulated by function factories, which are detailed in the next section.
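For readers who have not met the pattern before, here is a minimal sketch of a function factory that encapsulates a list in its own environment. The name make_store() and its three methods are hypothetical and deliberately simpler than the stores in {sandpaper}.

```r
# Hypothetical factory; NOT the implementation of .list_store() in {sandpaper}
make_store <- function() {
  # `values` lives in the environment created by this call to make_store()
  values <- list()
  list(
    get = function() values,
    set = function(key, value) {
      # `<<-` modifies `values` in the enclosing environment, not a local copy
      values[[key]] <<- value
      invisible(values)
    },
    clear = function() {
      values <<- list()
      invisible(values)
    }
  )
}

page_store <- make_store()
other_store <- make_store()

page_store$set("title", "Introduction")

page_store$get()  # contains the title
other_store$get() # still empty: each call to make_store() has its own environment
```

Each call to the factory produces an independent store, which is why several of them can coexist in a package namespace without sharing state.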
There are two types of storage function factories in {sandpaper}: Lesson Store (.lesson_store()) and List Store (.list_store()). Both of these return lists of functions that access their calling environments, which are created when the package is loaded:
The list store acts like a persistent list that has get(), set(), clear(), copy() and update() methods.
snd <- asNamespace("sandpaper")
this_list <- snd$.list_store()
names(this_list)
class(this_list)
# to set a list, use a NULL key
this_list$set(key = NULL, list(
  A = list(first = 1),
  B = list(
    C = list(
      D = letters[1:4],
      E = sqrt(2),
      F = TRUE
    ),
    G = head(sleep)
  )
))
# get the list
this_list$get()
# copy a list so that you can preserve the original list
that_list <- this_list$copy()
# update list elements by adding to them (note: vectors will be replaced)
that_list$update(list(A = list(platform = "MY COMPUTER")))
that_list$get()
# set nested list elements
that_list$set(c("A", "platform"), "YOUR COMPUTER")
# note that modified copies do not modify the originals
that_list$get()[["A"]]
this_list$get()[["A"]]
The lesson store creates the object containing the pegboard::Lesson object that we can use to extract episode-specific metadata along with the text of the questions. It is a special object because when a lesson is set with this object, it will additionally set the other global data.
When {sandpaper} is loaded, the List Store and Lesson Store objects are created and live in the {sandpaper} namespace as long as the session is active.
snd <- asNamespace("sandpaper")
some_lesson <- snd$.lesson_store()
names(some_lesson)
All of these metadata are collected and stored in memory before the lesson is built (triggered during validate_lesson()) and are accessible via the objects defined by the internal sandpaper:::set_globals(). Here I will set up an example lesson that I will use to demonstrate these global variables. Please note that I will be using internal functions in this demonstration. These internal functions are not guaranteed to be stable outside of the context of {sandpaper}. Using the asNamespace("sandpaper") function gives me a way to access the internal functions:
library("sandpaper")
snd <- asNamespace("sandpaper")
# create a new lesson
lsn <- create_lesson(tempfile(), name = "An Example Lesson", rstudio = TRUE, open = FALSE, rmd = FALSE)
# add a new episode
create_episode_md(title = "First Example", add = TRUE, path = lsn, open = FALSE)
Within {sandpaper}, there are environments that contain metadata related to the whole lesson called .store, .resources, instructor_globals, learner_globals, and this_metadata. Before the lesson is validated, these values are empty:
snd <- asNamespace("sandpaper")
class(snd$.store)
print(snd$.store$get())
class(snd$this_metadata)
snd$this_metadata$get()
class(snd$.resources)
snd$.resources$get()
class(snd$instructor_globals)
snd$instructor_globals$get()
class(snd$learner_globals)
snd$learner_globals$get()
These will contain the following information:

- .store: a pegboard::Lesson object that contains the XML representation of the parsed Markdown source files
- this_metadata: the real and computed metadata associated with the lesson along with the JSON-LD template for generating the metadata in the footer of the lesson.
- .resources: a list of the available files used to build the lesson in the order specified by config.yaml
- instructor_globals: pre-computed global data that includes the sidebar template, the syllabus, the dropdown menus, and information about the packages used to build the lesson
- learner_globals: same as instructor_globals, but specifically for learner view

When I run validate_lesson() the first time, all the metadata is collected and parsed from config.yaml and the individual episodes, and then cached.
# first run is always longer than the second
system.time(validate_lesson(lsn))
system.time(validate_lesson(lsn))
The validate_lesson() call will pull from the cache in .store if it exists and is valid (which means that nothing in git has changed). If it's not valid or does not exist, then all of the global storage is initialised in this general cascade:
funs <- c(
  vl = "validate_lesson()",
  tl = "this_lesson()",
  sv = ".store$valid()",
  ggd = "gert::git_diff()",
  ggs = "gert::git_status()",
  ggl = "gert::git_log()",
  pl = "pegboard::Lesson$new()",
  ss = ".store$set()",
  sg = "set_globals()",
  srl = "set_resource_list()",
  res = ".resources$set()",
  grl = "get_resource_list()",
  im = "initialise_metadata()",
  sl = "set_language()",
  av = "add_varnish_translations()",
  tr = "tr_()",
  ts = "these$translations",
  gc = "get_config()",
  tm = "template_metadata()",
  tms = "this_metadata$set()",
  cs = "create_sidebar()",
  crd = "create_resources_dropdown()",
  lgs = "learner_globals$set()",
  igs = "instructor_globals$set()"
)
labels <- funs
# highlight the calls that write to or read from the global stores
to_lab <- c("ss", "res", "tms", "lgs", "igs", "ts")
labels[to_lab] <- cli::col_cyan(cli::style_bold(labels[to_lab]))
vltree <- data.frame(
  fn = funs,
  calls = I(list(
    vl = funs["tl"],
    tl = list(funs["sv"], funs["pl"], funs["ss"]),
    sv = list(funs["ggd"], funs["ggs"], funs["ggl"]),
    ggd = character(0),
    ggs = character(0),
    ggl = character(0),
    pl = character(0),
    ss = list(funs["sg"]),
    sg = list(funs["im"], funs["sl"], funs["srl"], funs["cs"], funs["crd"], funs["lgs"], funs["igs"]),
    srl = list(funs["grl"], funs["res"]),
    res = character(0),
    grl = list(funs["gc"]),
    im = list(funs["gc"], funs["tm"], funs["tms"]),
    sl = list(funs["av"]),
    av = list(funs["tr"]),
    tr = list(funs["ts"]),
    ts = character(0),
    gc = character(0),
    tm = character(0),
    tms = character(0),
    cs = character(0),
    crd = character(0),
    lgs = character(0),
    igs = character(0)
  )),
  labels = labels
)
cli::tree(vltree)
Now when these variables are called, you can see the information stored in them.
This function stores the pegboard::Lesson object and is responsible for initialising and resetting the variable cache.
snd <- asNamespace("sandpaper")
class(snd$.store)
print(snd$.store$get())
We can check if the cache needs to be reset by using the $valid() function inside this object, which checks the git log, git status, and git diff as a global check of the lesson contents. When nothing changes, it returns TRUE:
snd$.store$valid(lsn)
However, if we update the lesson contents in some way by setting a config variable, it will be FALSE, indicating that it needs to be reset:
set_config(c(handout = TRUE), path = lsn, write = TRUE, create = TRUE)
snd$.store$valid(lsn)
snd$.store$set(lsn)
snd$.store$valid(lsn)
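The same check-then-reset pattern can be sketched without git. The example below is a simplified stand-in that fingerprints files with MD5 checksums from base R instead of inspecting the git log, status, and diff; make_cache() and its methods are hypothetical names, not part of {sandpaper}.

```r
# Hypothetical cache that is valid only while the files under a path are unchanged
make_cache <- function() {
  fingerprint <- NULL
  # checksum every file under `path` (a stand-in for the git-based check)
  snapshot <- function(path) {
    files <- sort(list.files(path, recursive = TRUE, full.names = TRUE))
    tools::md5sum(files)
  }
  list(
    set = function(path) {
      fingerprint <<- snapshot(path)
      invisible(TRUE)
    },
    valid = function(path) {
      !is.null(fingerprint) && identical(fingerprint, snapshot(path))
    }
  )
}

# a throwaway directory standing in for a lesson
lesson_dir <- tempfile()
dir.create(lesson_dir)
config <- file.path(lesson_dir, "config.yaml")
writeLines('title: "An Example Lesson"', config)

cache <- make_cache()
cache$valid(lesson_dir) # FALSE: nothing has been stored yet
cache$set(lesson_dir)
cache$valid(lesson_dir) # TRUE: contents match the stored fingerprint

# changing a config value invalidates the cache until it is set again
cat("handout: true\n", file = config, append = TRUE)
cache$valid(lesson_dir) # FALSE
```

The design choice is the same in both cases: the validity check has to be much cheaper than rebuilding everything it guards.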
The metadata is used to store the content of config.yaml and to provide computed metadata for a lesson that is included in the footer as a JSON-LD object, which is useful for indexing. Note that this metadata must be duplicated for each page to give the correct URL and identifiers.
snd <- asNamespace("sandpaper")
snd$this_metadata$get()
This metadata is rendered as JSON-LD and passed as a new variable to {varnish} using the internal fill_metadata_template() function:
writeLines(snd$fill_metadata_template(snd$this_metadata))
The next thing that we store globally are the resources we use to build the lesson. This allows us to avoid needing to constantly read the file system:
snd <- asNamespace("sandpaper")
snd$.resources$get()
The rest are global and local variables that are recorded in the instructor_globals and learner_globals. These are copied for each page and updated with local data (e.g. the sidebar needs to include headings for the current page).
snd <- asNamespace("sandpaper")
snd$instructor_globals$get()
snd$learner_globals$get()
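The copy-then-update step for each page can be pictured with base R's utils::modifyList(). This is a minimal sketch that assumes plain named lists; the keys shown are invented for illustration, and the stores described earlier expose their own copy() and update() methods for this purpose.

```r
# Hypothetical global values shared by every page of a lesson
globals <- list(
  lesson_title = "An Example Lesson",
  sidebar = list(episodes = c("introduction", "first-example"))
)

# Hypothetical page-specific values; nested lists are merged key by key
page_data <- list(
  pagetitle = "Introduction",
  sidebar = list(current = "introduction")
)

# modifyList() returns a new list, so `globals` is left untouched
this_page <- utils::modifyList(globals, page_data)

str(this_page)
is.null(globals$sidebar$current)
```

Working on a copy is what keeps one page's local data from leaking into the next page that is built.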
The majority of the {varnish} templates contain keys that need translations, in the format of {{ translate.ThingInPascalCase }}:
writeLines("```html")
writeLines(readLines(system.file("pkgdown/templates/content-overview.html", package = "varnish")))
writeLines("```")
These variables are in {sandpaper} and are expected to exist. Because they are expected to exist, these variables are generated and stored in the internal environment these$translations by the function establish_translation_vars() when the package is loaded. When the lesson is validated, these variables are translated to the correct language with set_language() and placed in the instructor_globals and learner_globals storage. These variables are passed directly to {varnish} templates, which eventually get processed by {whisker}:
snd <- asNamespace("sandpaper")
whisker::whisker.render("Edit this page: {{ translate.EditThisPage }}",
  data = snd$instructor_globals$get()
)
This is key to building lessons in other languages, regardless of your default language. The lesson author sets the lang: config key to the two-letter language code that the lesson is written in. This gets passed to the set_language() function, which modifies the translations inside the global data, but it does not modify the language of the user session:
snd <- asNamespace("sandpaper")
snd$set_config(c(lang = "es"), path = lsn, create = TRUE, write = TRUE)
snd$this_lesson(lsn)
whisker::whisker.render("Edit this page: {{ translate.EditThisPage }}",
  data = snd$instructor_globals$get()
)
Sys.getenv("LANGUAGE")
Switching the language is controlled entirely from within the lesson config:
snd$set_config(c(lang = "en"), path = lsn, create = TRUE, write = TRUE)
snd$this_lesson(lsn)
whisker::whisker.render("Edit this page: {{ translate.EditThisPage }}",
  data = snd$instructor_globals$get()
)
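To show why the session language never needs to change, here is a minimal sketch of a translation lookup that is driven only by a lang value. The lookup table, the translate_vars() helper, and the Spanish string are all invented for illustration; the mechanism behind set_language() and these$translations in {sandpaper} is more involved.

```r
library("whisker")

# Hypothetical lookup table; the real strings live in {sandpaper}
translations <- list(
  en = list(EditThisPage = "Edit this page"),
  es = list(EditThisPage = "Editar esta página")
)

# Hypothetical helper: pick the table for `lang`, falling back to English
translate_vars <- function(lang = "en") {
  if (!lang %in% names(translations)) lang <- "en"
  list(translate = translations[[lang]])
}

session_language <- Sys.getenv("LANGUAGE")

whisker::whisker.render("{{ translate.EditThisPage }}", data = translate_vars("es"))
whisker::whisker.render("{{ translate.EditThisPage }}", data = translate_vars("en"))

# the environment variable is never touched by the lookup
identical(session_language, Sys.getenv("LANGUAGE"))
```

Since the translated strings travel inside the data list, the template itself stays identical for every language.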
Another source of metadata is the one created for {pkgdown} in a file called site/_pkgdown.yaml.
writeLines("```yaml")
writeLines(readLines(fs::path(lsn, "site", "_pkgdown.yaml")))
writeLines("```")
This file is what allows {pkgdown} to recognise that a website needs to be built. These variables are compiled into the pkg variable that is passed to all downstream files from build_site(). The important elements we use are

- $lang stores the language variable
- $src_path stores the source path of the lesson (the site/ directory, or the location of the SANDPAPER_SITE environment variable).
- $dst_path stores the destination path to the lesson (site/docs). NOTE: this particular path could be changed in build_site() to give the path a less confusing name. The docs folder was a mechanism for pkgdown to locally deploy the site without having to affect the package structure (it was also a mechanism to serve GitHub pages in the past).
- $meta a list of items passed on to {varnish}

snd <- asNamespace("sandpaper")
pkg <- pkgdown::as_pkgdown(snd$path_site(lsn))
pkg[c("lang", "src_path", "dst_path", "meta")]
Before being sent to be rendered, it goes through one more transformation, which is where the variable contexts you find in {varnish} come from. The contexts that we use are site for the root path and title, yaml for lesson-wide content, and lang to define the language. Everything else is not used.
dat <- pkgdown::data_template(pkg)
writeLines(yaml::as.yaml(dat[c("lang", "site", "yaml")]))