knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#|",
  R.options = list(width = 100, str = strOptions(strict.width = "cut"))
)

Introduction

This package is designed to parse the structure of the Markdown and R Markdown files inside a Carpentries-style lesson. To ensure consistent lesson structure, it has built-in validators that will inspect headings, links, and callout sections. This document will help you become familiar with the output of the validators and how they can be used to update a lesson.

Lesson authors and contributors will likely only see the output of these validators and not need to interact with them directly. For example, we have a lesson that contains some mal-formed links and headings:

library(pegboard)
lsn <- Lesson$new(lesson_fragment())
# validation
lsn$validate_divs()
lsn$validate_links()
lsn$validate_headings(verbose = FALSE)

We can see that there is informative and actionable output produced that will allow lesson authors to make targeted changes in their files. The next sections will detail how we can work with the output of these methods.

Link Validation

Each of the validation methods will produce messages (aka stderr) for the user, but they also produce a data frame as output that contains detailed information about each link and what tests it passed or did not pass. You can find out what tests are run on links by accessing the help page for validate_links by typing ?validate_links in your R console.

If we want to inspect this data frame, we can assign it to a new variable:

links <- lsn$validate_links()
str(links, max.level = 1)

This data frame is combined output of three sources:

  1. the output of [xml2::url_parse()]
  2. the source data containing the original source file (filepath), line number (pos), and XML node (node)
  3. the tests performed as logical vectors, which can be extracted programmatically

Here, we can see the nodes that did not pass our tests by using the link_tests vector, which is an internal vector defining the inline messages printed for each failed test, to filter the output:

# get the subset of rows that did not pass all the tests
invalid <- !apply(links[names(link_tests)], MARGIN = 1L, all)
# return the nodes
links$node[invalid]

For context, this is what the document looks like at those positions:

lsn$episodes[["14-looping-data-sets.md"]]$tail(17)

Fenced Div Validation

Validation of fenced divs (aka callout blocks) at the moment checks for whether or not the section divs we encouter in the lessons are the ones we expect, to avoid issues where a div class is mis-typed.

The divs we expect are in the object pegboard::KNOWN_DIVS: r glue::glue_collapse(pegboard::KNOWN_DIVS, sep = ", ", last = ", and ").

Our example lesson is from the old styles repository, so it does not have any fenced divs, but we can use a lesson fragment from {sandpaper}:

snd <- Lesson$new(lesson_fragment(name = "sandpaper-fragment"), jekyll = FALSE)
snd_divs <- snd$validate_divs()

Notice that no message was produced indicating that the divs in our lesson fragment were okay. When we look at the data frame that was produced, we can see that there are six divs:

snd_divs

If there are invalid div names in the lesson, they will be reported. For example, the lesson we have right now, does not have any improper fenced div classes, but if we were to add an invalid fenced div (via the add_md() method in the tinkr::yarn() class), we would be able to find out very quickly:

our_div <- c("::: exercise", "\nthis is an invalid div\n", ":::")
snd$episodes[["intro.Rmd"]]$add_md(our_div)
snd_divs <- snd$validate_divs()

You can see from the table that the is_known column now has a FALSE value:

snd_divs

Heading Validation

NOTE: We are rethinking the exact specifications for heading validation at this time

Validation of headings operate similarly to links as it produces a data frame along with the message output that can be further inspected and manipulated. You can find out more by accessing the help documentation for validate_headings by typing ?validate_headings in your R console.

headings <- lsn$validate_headings(verbose = TRUE)
str(headings, max.level = 1)

This particular data frame has fewer rows because there are fewer moving parts to headings than links, but they are indeed important. The process for getting the subset of invalid headings is similar: use the heading_tests vector from pegboard to subset the rows that failed:

# get the subset of rows that did not pass all the tests
invalid <- !apply(headings[names(heading_tests)], MARGIN = 1L, all)
# return the nodes
headings$node[invalid]

Internal testing

For testing, we have created some documents that have extreme cases of links and headings that failed to show all the features of our examples:

ex  <- here::here("tests/testthat/examples")
lnk <- Episode$new(fs::path(ex, "link-test.md"))
hd  <- Episode$new(fs::path(ex, "validation-headings.md"))

Links

The links example has several mistakes covering our possibilities:

lnk$show()
lnk$validate_links()

Headings

The headings example also has several mistakes, and demonstrates the value of having a visual heading tree displayed on output when verbose = TRUE

hd$show()
hd$validate_headings(verbose = TRUE)


carpentries/pegboard documentation built on Nov. 13, 2024, 8:53 a.m.