In zkamvar/up2code: Explore and Manipulate Markdown Curricula from the Carpentries

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "##"
)

Introduction

If you have a list of Episode objects, you can achieve most of what you can with the lesson objects, because much of the Lesson object's function is to provide a methods that map over all of the Episode object methods. The key difference between Lesson objects and a list of Episode objects is that the Lesson object will collapse summaries and to map relations between Episodes and their children or parent documents.

Before you read this vignette, please read the vignette on the Episode object (vignette("intro-episode", package = "pegboard")) so that you can understand the methods that come from the Episode objects. In this vignette, we will talk about the structure of Lesson objects, how to use the basic methods of these objects, inspecting summaries of source vs built episodes, and assessing the lineage of episodes that have parents and/or children documents.

But first, because of a default parameter that influences what methods can be used depending on the lesson context, I need to explain a little bit about Jekyll, the former lesson infrastructure.

A Brief Word About History and Jekyll

Prior to The Workbench, we had the styles repository, which was an all-in-one toolbox that built websites with Jekyll. It was colloquially known as the "Lesson Template." It has two major differences to The Workbench: folder structure and syntax.

Folder Structure

The folder structure of lessons built with Jekyll was one where content and tooling lived side-by-side. This folder structure looked something like this:

writeLines(".
├── Gemfile
├── Makefile
├── _config.yml
├── \033[01;34m_episodes/\033[0m
│   └── 01-intro.md
├── \033[01;34m_episodes_rmd/\033[0m
├── \033[01;34m_extras/\033[0m
│   ├── a.md
│   └── b.md
├── \033[01;34m_includes/\033[0m
├── \033[01;34m_layouts/\033[0m
├── aio.md
├── \033[01;34massets/\033[0m
├── \033[01;34mbin/\033[0m
├── \033[01;34mfig/\033[0m
├── index.md
├── reference.md
├── requirements.txt
└── setup.md
")

When {pegboard} was first written, we initially assumed this folder structure, where R Markdown and regular Markdown episodes lived in different folders (and more often than not, the outputs of the R Markdown files lived inside the _episodes/ folder. The main method of organising episodes was by numbers embedded in the name of the files.

As The Workbench developed, it was clear that this folder structure needed to change, but we needed to keep compatibility with the old lessons because we want to ensure that people can independently convert from the old style lessons to the new style, thus we added the jekyll parameter to the Lesson object initializer method, and set jekyll = TRUE as the default to keep backwards compatibility.

Creating a New Lesson Object

The Lesson object is invoked with the Lesson$new(), method. Here, I will demonstrate a Workbench lesson. This is the folder structure of the workbench lesson:

withr::with_dir(pegboard::lesson_fragment("sandpaper-fragment"), {
  fs::dir_tree(regex = "site/[^R].*", invert = TRUE)
})

To read it in, because we have a Workbench lesson, we need to specify jekyll = FALSE to register all the div tags and ensure that the lesson is being treated like a Workbench lesson.

library("pegboard")
library("glue")
library("yaml")
library("xml2")
library("fs")
wbfragment <- lesson_fragment("sandpaper-fragment")
print(wbfragment) # path to the lesson
wb_lesson <- Lesson$new(wbfragment, jekyll = FALSE)
print(wb_lesson)

The Lesson printing here shows that it has a subset of methods that are named similarly to methods and active bindings from the Episode class. These are not inherited, but rather they are implemented across all the Episode objects. The Episode objects themselves are parsed into one of three elements: "episodes", "children", and "extra" (NOTE: the "extra" slot may be superceded by elements that better match the folder structure of lessons).

lapply(wb_lesson$episodes, class)
lapply(wb_lesson$children, class)
lapply(wb_lesson$extra, class)

Notice here that there is only one episode in the $episodes item, but in the directory tree above, we see two. This is because of the config.yaml file, which defines the order of the episodes:

read_yaml(path(wbfragment, "config.yaml"))$episodes

Because episodes/nope.Rmd is not listed, it is not read in. This is useful to avoid reading in content from files that are incomplete or not correctly formatted.

File Information

The Lesson object contains information about the file information:

# what is the root path for the lesson?
wb_lesson$path
# what episode files exist?
wb_lesson$files
# do any of the files have children (Workbench lessons only)? 
wb_lesson$has_children

Accessors

As mentioned earlier, many of the methods in a Lesson object are wrappers for methods in Episode objects. challenges, solutions are the obvious ones.

wb_lesson$challenges()
wb_lesson$solutions()

For the rest of the elements (or active bindings), you will need to use the $get() method. For example, if you wanted all code blocks from the episodes and the extra content, you would use:

wb_lesson$get("code", collection = c("episodes", "extra"))

Similarly, for links and headings you would use:

wb_lesson$get("links", collection = c("episodes", "extra"))
wb_lesson$get("headings", collection = c("episodes", "extra"))

Methods Summaries and Validation

For summaries, you will get a data frame of the summaries. You can also choose to include other collections in the summary:

wb_lesson$summary() # defaults to episodes
wb_lesson$summary(collection = c("episodes", "extra"))

Validation will auto-check everything and return the results as data frames. You can find more information abou the specific checks by reading vignette("validation", package = "pegboard").

Details of the individual functions can be found via ?validate_links(), ?validate_divs(), and ?validate_headings().

divs <- wb_lesson$validate_divs()
print(divs)
headings <- wb_lesson$validate_headings()
print(headings)
links <- wb_lesson$validate_links()
print(links)

Loading Built Documents

One thing that is very useful is to check the status of the built documents to ensure that everything you expect is there. You can load all of the built markdown documents with the $load_built() method and the built documents will populate the $built field:

wb_lesson$load_built()
lapply(wb_lesson$built, class)

You can use these to inspect how the content is rendered and see that the code blocks render what they should render. In thise case, episodes/intro.Rmd will render one output block and one image.

to_check <- c("page", "code", "output", "images", "warning", "error")
wb_lesson$summary(collection = c("episodes", "built"))[to_check]

Handouts

This is another method wrapped from the Episode method, where it combines the output into a single file and prepends the Episode title before each section:

writeLines(wb_lesson$handout())

Accessing other `Episode` methods

For pegboard::Episode methods that are not listed above, you will need to manually iterate over the Episode objects. For example, if you wanted to extract all of the instructor notes in the lesson, you could use purrr::map()

purrr::map(c(wb_lesson$episodes, wb_lesson$extra), 
  function(ep) ep$get_divs("instructor"))

If you wanted to get a specific thing from the body of the document, then you could use any of the functions from {xml2} such as xml2::xml_find_first() or xml2::xml_find_all(). Here, we are looking first the first text element that is not a fenced-div element:

purrr::map_chr(c(wb_lesson$episodes, wb_lesson$extra), 
  function(ep) {
    xpath <- ".//md:text[not(starts-with(text(), ':::'))]"
    nodes <- xml_find_first(ep$body, xpath, ep$ns)
    return(xml_text(nodes))
  }
)

For more information about constructing XPath queries and working with XML data, you can read vignette("intro-xml", package = "pegboard")

Creating a New Lesson with Child Documents

If you are unfamiliar with the concept of child documents, please read the "Including Child Documents" vignette in the {sandpaper} package (vignette("include-child-documents", package = "sandpaper")).

The pegboard::Lesson object is very useful with lessons that contain child documents because it records the relationships between documents. This is key for workflows determining build order of a Lesson. If a source document is modified, in any build system, that source document will trigger a rebuild of the downstream page, and the same should happen if a child document of that source is modified (if you are interested in the build process used by {sandpaper}, you can read sandpaper::build_status() and sandpaper::hash_children()). This functionality is implemented in the pegboard::Lesson$trace_lineage() method, which returns all documents required to build any given file. We will demonstrate the utility of this later, but first, we will demonstrate how pegboard::Lesson$new() auto-detects the child documents:

Take for example the same lesson, but episodes/intro.Rmd has the child episodes/files/cat.Rmd which in turn has the child episodes/files/session.Rmd:

withr::with_dir(lesson_fragment("sandpaper-fragment-with-child"), {
  fs::dir_tree(regex = "site/[^R].*", invert = TRUE)
})

A valid child document reference requires a code chunk with a child attribute that points to a valid file relative to the parent document, so if I have this code block inside episodes/intro.Rmd, then it will include the child document called episodes/files/cat.Rmd:

```r
```

During initialisation of a Workbench lesson (note: not currently implemented for Jekyll lessons), the Lesson object will detect that at least one Episode references at least one child document (via find_children()) and read them in (see load_children() for details).

wbchild <- lesson_fragment("sandpaper-fragment-with-child")
wb_lesson_child <- Lesson$new(wbchild, jekyll = FALSE)
wb_lesson_child$has_children
lapply(wb_lesson_child$children, class)

The reason it is useful is because if you have a child Episode object, you can determine its parent and its final ancestor. Because these paths are absolute paths, I am going to write a function that will use the {glue} package to print it nicely for us:

show_child_parents <- function(child) {
  parents <- fs::path_rel(child$parents, start = child$lesson)
  build_parents <- fs::path_rel(child$build_parents, start = child$lesson)

  msg <- "Ancestors for {child$name} ---
  Parent(s):         {parents}
  Final ancestor(s): {build_parents}"
  glue::glue(msg)
}

# cat.Rmd's parent is intro.Rmd
show_child_parents(wb_lesson_child$children[[1]])

# session.Rmd's parent is cat.Rmd
show_child_parents(wb_lesson_child$children[[2]])

If you have the name of the final ancestor, then you can determine the full lineage with the $trace_lineage() method, which is useful for determining if a file should be rebuilt:

parent <- wb_lesson_child$children[[2]]$build_parents
print(parent)
lineage <- wb_lesson_child$trace_lineage(parent)

# printing the lineage in a presentable fashion:
rel <- wb_lesson_child$path
pretty_lineage <- path_rel(lineage, start = rel)
pretty_lineage <- glue_collapse(pretty_lineage, sep = ", ", last = ", and ")
glue("The lineage of {path_rel(parent, start = rel)} is:
  {pretty_lineage}")

Jekyll Lessons

This section will talk about the peculiarities with lessons built with the carpentries/styles lesson template. Note that lesson transition methods are not implemented in the Lesson object, if you want to find out about methods for transition, please read the Jekyll section in vignette("intro-episode", package = "pegboard").

Syntax

Methods

There are some methods that are specific to lessons that are built with Jekyll. In particular, the n_problems and show_problems active binding are useful for determining if anything went wrong with parsing the kramdown syntax, the $isolate_blocks() method was used to strip out all non-blockquote content, the $blocks() method returned all block quote with filters for types, and the $rmd field was an indicator that the lesson used R Markdown.

jekyll <- Lesson$new(lesson_fragment("lesson-fragment"), jekyll = TRUE)
jekyll$n_problems
rmd    <- Lesson$new(lesson_fragment("rmd-lesson"), jekyll = TRUE)
rmd$n_problems

As mentioned above, in Jekyll uses special block quotes to format callout blocks. The $challenges() and $solutions() methods recognise this and will return the block quotes:

jekyll$challenges()
jekyll$solutions()

For other blocks, you can use the $blocks() method:

jekyll$blocks(".prereq")
rmd$blocks(".prereq")

zkamvar/up2code documentation built on Feb. 24, 2025, 7:32 a.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

zkamvar/up2code
Explore and Manipulate Markdown Curricula from the Carpentries

In zkamvar/up2code: Explore and Manipulate Markdown Curricula from the Carpentries

Introduction

A Brief Word About History and Jekyll

Folder Structure

Creating a New Lesson Object

File Information

Accessors

Methods Summaries and Validation

Loading Built Documents

Handouts

Accessing other `Episode` methods

Creating a New Lesson with Child Documents

Jekyll Lessons

Syntax

Methods

R Package Documentation

Browse R Packages

We want your feedback!

zkamvar/up2code Explore and Manipulate Markdown Curricula from the Carpentries

In zkamvar/up2code: Explore and Manipulate Markdown Curricula from the Carpentries

Introduction

A Brief Word About History and Jekyll

Folder Structure

Creating a New Lesson Object

File Information

Accessors

Methods Summaries and Validation

Loading Built Documents

Handouts

Accessing other Episode methods

Creating a New Lesson with Child Documents

Jekyll Lessons

Syntax

Methods

R Package Documentation

Browse R Packages

We want your feedback!

zkamvar/up2code
Explore and Manipulate Markdown Curricula from the Carpentries

Accessing other `Episode` methods