The new lesson infrastructure has a lot of moving parts, but it is built on a solid foundation that clearly separates the content from the tooling needed. Importantly, in order to use the infrastructure, you do not need to know very much about how it works under the hood.
Importantly: this is not a beginner's guide. If you are looking that resource, then head over to The sandpaper documentation site. This document serves as a kind of "nerd's guide" to the lesson infrastructure. It steps back and takes a landscape-level view of how the infrastructure works and how it differs from the previous infrastructure.
Our lesson maintainers and contributors are all volunteers, which means that they only have a very limited amount of time to work on lessons. The lesson infrastructure should be a simple framework for contributors and maintainers to write lessons in a way that maximizes their attention on the lesson content.
The Jekyll-based lesson template has been showing its age recently and it's been clear for a while that a refresh is sorely needed. Chief among the issues with the Jekyll template is that it currently requires 4 languages (on top of Git) at a minimum to be installed in order to preview locally: Ruby, Python, Bash, and Make. This setup represents a large barrier for people who are coming in to lesson maintenance for the first time and is clearly an issue when we have to include videos for people to use a github-based workaround to preview their lessons.
Beyond the initial setup for contributors, another pain point is the reliance of the styles inside of the lesson itself and the convoluted manner in which a lesson must be created: it must be imported from the carpentries/styles repository so that the history is preserved and then the maintainer can initialize the lesson. When the time comes to update the lesson, we expect that the maintainer has not touched any of the styling elements or machinery so that the merge happens with no conflicts, but this is rarely the case and we have to initiate the merge and fix conflicts.
When we inspect the process of adding or updating content in a lesson, we find
that there are still more places of frustration. The configuration file
contains several pieces of information that are necessary for the site, but
must never be touched. Moreover, the header yaml of the lesson contains three
lists, objectives, keypoints, and questions, all formatted as strings and
included in the body of the rendered lesson. The issue with these is that it's
easy for these to become invalid due to misplaced punctuation. Linking to
images that lived at the top of the repository was difficult and authoring
special block quotes was challenging because writing content always preceded by
>
proved to be error prone.
With the new lesson infrastructure, we strive to clear the template clutter and allow contributors to focus on what is important: lesson content written in clear and easy-to-read markdown. Importantly, the lessons will still be portable and customisable because the engine, validator, and styling will all live in separate independent software packages. Lessons in the official Carpentries GitHub accounts will also benefit from automation that will provide timely updates to the elements needed to build the lessons (in the case for lessons with generated output).
This document will outline the aspects of the new lesson template to describe how it is organized in all aspects including folder structure of a lesson, content specifications for an episode, content specifications for extra-episode files (e.g. references), toolchain landscape, local build requirements, deployment specifications, and internationalization (i18n) specifications.
This document is intended to be extensive and is targeted for both lesson
maintainers and the infrastructure team. Because there are overly-detailed
descriptions of the infrastructure that maintainers are not required to know, I
will label these points with [i]
so that maintainers can skip these sections
if needed.
This section will discuss the structure of the template in terms of file organization and will not go into the aspects of specific tools.
The new lesson template is not designed to mimic aspects of the previous
template. This template is designed primarily for use in The Carpentries, but
should theoretically be extensible to other contexts. Most importantly,
contributors should only be expected to know markdown
and very minimal
yaml
syntax. in order to contribute to lessons.
Thus, there are a few rules that the new template should follow:
With a few exceptions, our lessons have shown to be largely accessible thanks to the reliance on modern frameworks for hosting our lessons. That being said, there are the occasional issues and the WCAG 2.1 accessibility guidelines provides a roadmap to ensure that our lessons are Perceivable, Operable, Understandable, and Robust.
The lesson template will be organized such that it clearly separates the lesson content from the lesson style:
|-- .gitignore # - Ignore everything in the site/ folder |-- .github/ # - Scripts used for continuous integration |-- episodes/ # - PUT YOUR MARKDOWN FILES IN THIS FOLDER | |-- data/ # -- Data for your lesson goes here | |-- figures/ # -- All static figures and diagrams are here | |-- files/ # -- Additional files (e.g. handouts) | `-- 00-introducition.Rmd # -- Lessons start with a two-digit number |-- instructors/ # - Information for Instructors | `-- instructor-notes.md # -- placeholder |-- learners/ # - Information for Learners | `-- setup.md # -- setup instructions (REQUIRED) |-- profiles/ # - Learner and/or Instructor Profiles | `-- learner-profiles.md # -- placeholder |-- renv/ # - Local package cache |-- site/ # - This folder is where the rendered markdown | | files and static site will live | `-- README.md # -- placeholder |-- config.yaml # - Use this to configure commonly used variables |-- index.md # - Front page for the site |-- CONTRIBUTING.md # - Carpentries Rules for Contributions (REQUIRED) |-- CODE_OF_CONDUCT.md # - Carpentries Code of Conduct (REQUIRED) |-- LICENSE.md # - Carpentries Licenses (REQUIRED) `-- README.md # - Introduces folks how to use this lesson and # where they can find more information.
All of the episodes and any content required for the episodes go in the
episodes/
folder (note no underscore prefix) such that if I were to extract
the episodes folder and give it to someone else, they should be able to execute
the RMarkdown documents in their own project using any method they see fit.
The file config.yaml
will be a minimal yaml file that contains metadata about
lesson-wide aspects (Author, Carpentry, Licence, Title, etc.) along with
specifications for episodes and extras to be included in the dropdown menus.
There are two sources of generated files in the template that are explicitly
not tracked by git, which both live in the site/
directory: the static
markdown cache generated from the source files and the local preview of HTML
files, which are generated from the static markdown files.
Separating the content generation from the episode from the assembly of the HTML site gives us a couple of advantages:
This two step process is explained in The Two-Step: Building Locally.
The static markdown cache of files will live in site/built
and will contain
static markdown documents with generated output and a special item in the yaml
header called sandpaper-digest
. The value for this item will be the md5 sum
of the corresponding source file so that the tool chain can determine which
files need to be rebuilt.
The local preview of the generated website will live in the site/docs
folder.
This site is ONLY generated from the static markdown cache in site/built
and
no other source.
The site/docs
folder can be shared anywhere as a fully-functional static
website. Many of the javascript and CSS elements are sourced from CDNs, but
these can also be bundled directly with the site itself (which increases the
size of storage).
The only thing preventing these lessons from being used offline is the fact that they rely on the CSS/JS framework being delivered by a CDN, but this may not be feasible for workshops taught in regions where internet connectivity is limited (though the browser cache takes care of this for the most part). To accomodate this, there will be a procedure that creates a standalone folder of the lesson and writes it outside of the repository so that it can be copied to a flash drive or delivered via a local WiFi router.
All episodes for the Carpentries should be stand-alone markdown documents that can render to valid HTML via pandoc without external dependencies. The structure of an episode is largely free-form, but there are certain elements that should be included to ensure a valid episode in the form of yaml content, required information blocks, instructor notes, and properly closed fenced div tags.
The YAML metadata should contain three elements, title, teaching, and
exercises. The title
is a character string for the title. Markdown can be
used in the title. Both teaching
and exercises
are the number of cumulative
minutes required to teach the lesson and complete the exercises, respectively.
This is used to populate the syllabus at the beginning of the lesson
title: "Creating Examples: Yes, These *Are* Difficult" teaching: 20 exercises: 10
To reduce the amount of friction between what contributors write and what is displayed on the lesson, placement will be enforced during the validation procedure and will be validated by checking the proximity of these blocks to other elements of the document in the XML representation. If the blocks do not comply, a human-readable alert will be generated indicating where the block is and where it should be moved (example error):
Error: File: '/path/to/episode.Rmd' The Questions block should be at the top of the document (lines AA--BB). Instead, they are located at lines XX--YY. Please edit '/path/to/episode.Rmd' and move the block at lines XX--YY to the top of the document (lines AA--BB).
Carpentries lessons have traditionally kept instructor notes as a separate page
from the lessons, but this lead to the notes not being used and some confusion
about HOW to use them. It has been proposed that we keep the instructor notes
inside of the lessons themselves at the risk of creating a longer and even more
complicated structure for maintainers to keep track of. To acommodate this, we
propose a new class of tag called instructor
, which will be hidden by default
and have a toggle that can display the notes alongside the lesson.
The instructor
tag will be the same as the other tags, starting with at least
three colons and ending with at least three colons:
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: instructor ## SLOW DOWN :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
Practically, when there are instructor notes in a lesson, they will be rendered
as separate pages and placed in a file called instructor/
that can be accessed
via /path/to/lesson/instructor/page.html
All open tags must be balanced by a closing tag.
While the specification for fenced divs are fairly loose, for clarity's sake, we have a few rules of thumb. Because the tags will not necessarily be color-coded, it's a good idea to differentiate between these tags with length. It does not need to be any specific length, but it's a good idea to make sure that the different level tags should be clearly visible.
# Lesson 1 This is an example lesson. :::::::::::::::::::::::::::::::::::::: objectives - Write good ::::::::::::::::::::::::::::::::::::::::::::::::: :::::::::::::::::::::::::::::::::::: questions - Is it okay if the number of colons don't match up? - Are the sections visible? :::::::::::::::::::::::::::::::::::::::::::::::: ## Introduction Writing tags looks like a bunch of `:` elements. :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: instructor ### Example This is an example of how you write an instructor tag ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: Let's try a challenge. :::::::::::::::::::::::::::::::::::::: challenge Write a solution ::::::::::::::: solution NaCl + H2O :::::::::::::::::::::::: ::::::::::::::::::::::::::::::::::::::::::::::::
For Carpentries Lessons, contributors need to have ways of specifying links that can access any part of the lesson online or offline. We also need to make sure that these links preserve the integrity of incoming links because people who have linked to lessons in the past will not want to discover that their links are suffering from rot.
The ultimate goal is to make sure that lesson contributors and authors do not have to think much beyond what they see in order to create a link. The simplest way to do this is to have the authors use relative links with an html tag at the end:
[link text](episode-name.html)
![Alt text](fig/figure-name.png)
[link
text](../extras/references.html)
Code Blocks in a lesson are formatted the same way as code blocks formatted in commonmark. They are represented by three backtics followed by the language you want to use for syntax highlighting in that block.
Because we are encouraging the use of RMarkdown, you can have code blocks that actually evaluate code using the RMarkdown chunk syntax of three backtics followed by a pair of curly braces with the name of the language engine you want to use for executing the code blocks.
At the moment, we explicitly support evaluation for R and BASH, but we will be gathering input from maintainers about processes to expand. One promising project is Quarto, which extends the functionality of R Markdown to other languages in a cross-platform, stand-alone binary program.
There is a large list of language engines that RMarkdown supports. Most engines require that you have the language available on your PATH and do not share variables and values between chunks, so lesson authors will need to be explicit about the installation requirements for these lessons. Some languages like Python, Julia, and SQL can share data and variables between chunks with the help of specific packages (e.g. {reticulate} for Python, {JuliaCall} for Julia, and {DBI} for SQL databases).
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.