validate_links: Validate Links in a markdown document

View source: R/validate_links.R

validate_links    R Documentation

Description

This function validates that links in markdown documents do not throw an error. This includes links to images, and the function respects robots.txt for websites.

Usage

validate_links(yrn)

allowed_uri_protocols

link_known_protocol(VAL)

link_enforce_https(VAL)

link_all_reachable(VAL)

link_img_alt_text(VAL)

link_length(VAL)

link_descriptive(VAL)

link_source_list(lt)

link_internal_anchor(VAL, source_list, headings, body)

link_internal_file(VAL, source_list, root)

link_internal_well_formed(VAL, source_list)

link_tests

link_info

Arguments

yrn

a tinkr::yarn or Episode object.

lt

the output of make_link_table()

source_list

output of link_source_list

headings

an xml_nodeset of headings

body

an xml_document

root

the root path to the folder containing the file OR containing the paths to the ultimate parent files.

Format

  • allowed_uri_protocols a character vector of length 23

  • link_tests a character vector of length 9 containing templates that use the output of validate_links() for formatting.

  • link_info a character vector of length 9 that gives information and informative links for additional context for failures.

Details

Link Validity

All links must resolve to a specific location. If that location does not exist, then the link is invalid. At the moment, we can only validate local links.

External links

These links must start with a valid and secure protocol. Allowed protocols are taken from the allowed protocols in WordPress: http, https, ftp, ftps, mailto, news, irc, irc6, ircs, gopher, nntp, feed, telnet, mms, rtsp, sms, svn, tel, fax, xmpp, webcal, urn

Misspellings and unsupported protocols (e.g. javascript: and bitcoin:) will be flagged.

In addition, we will enforce the use of HTTPS over HTTP.
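
The protocol rules above can be illustrated with a minimal base R sketch. It is not the package's implementation: the `allowed` vector is a hypothetical subset of the list above, and `get_scheme()` is a helper invented for this example.

```r
# Hypothetical subset of the allowed protocols (not the package's object)
allowed <- c("http", "https", "ftp", "ftps", "mailto")

# Extract the scheme (the text before the first ":").
# Links without a scheme, such as relative links, give "".
get_scheme <- function(url) {
  has_scheme <- grepl("^[A-Za-z][A-Za-z0-9+.-]*:", url)
  ifelse(has_scheme, tolower(sub(":.*$", "", url)), "")
}

links <- c("https://example.com", "http://example.com",
           "javascript:alert(1)", "htps://example.com",
           "../episodes/intro.html")
scheme <- get_scheme(links)

# A link passes if it has no scheme or if its scheme is in the allowed list
known_protocol <- scheme == "" | scheme %in% allowed
# A link fails the HTTPS check only when it uses plain http
enforce_https <- scheme != "http"

data.frame(links, scheme, known_protocol, enforce_https)
```

In this sketch the misspelled `htps:` and the unsupported `javascript:` both fail the protocol check, while the plain `http:` link passes it but fails the HTTPS check.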

Cross-lesson links

These links will have no protocol, but should resolve to the HTML version of a page and have the correct capitalisation.

Anchors (aka fragments)

Anchors are located at the end of URLs and start with a # sign. They are used to indicate a section of the documentation or a span id.
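
As a rough sketch of how an anchor might be separated from the rest of a link and compared against known ids, consider the base R example below. The `split_anchor()` helper and the `heading_ids` vector are hypothetical and only stand in for whatever the package derives from the document's headings.

```r
# Split a link into its path and its fragment (the part after the first "#")
split_anchor <- function(url) {
  has_anchor <- grepl("#", url, fixed = TRUE)
  data.frame(
    url = url,
    path = sub("#.*$", "", url),
    fragment = ifelse(has_anchor, sub("^[^#]*#", "", url), ""),
    stringsAsFactors = FALSE
  )
}

# Hypothetical heading ids that might exist in the current document
heading_ids <- c("setup", "key-points")

links <- split_anchor(c("#setup", "#missing", "intro.html#key-points"))

# Only links with an empty path point inside the current document;
# anchors into other files are left as NA in this sketch
internal <- links$path == ""
links$anchor_found <- ifelse(internal, links$fragment %in% heading_ids, NA)
links
```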

Accessibility (a11y)

Accessibility ensures that your links are accurate and descriptive for people who have slow connections or use screen reader technology.

Alt-text (for images)

All images must have associated alt-text. In pandoc, this is achieved by writing the alt attribute in curly braces after the image: ![image caption](link){alt='alt text'}. See https://webaim.org/techniques/alttext/

Descriptive text

All links must have descriptive text associated with them. This helps people using screen readers, who often scan the links on a page and would otherwise hear a list full of "link", "link", "link". See https://webaim.org/techniques/hypertext/link_text#uninformative

Text length

Link text length must be greater than 1: https://webaim.org/techniques/hypertext/link_text#link_length
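
The two text checks above can be sketched in a few lines of base R. The `uninformative` vector is a hypothetical list chosen for illustration; the phrases the package actually flags may differ.

```r
# Hypothetical list of uninformative link texts
uninformative <- c("link", "this link", "here", "click here", "this", "more")

link_text <- c("Carpentries handbook", "here", "x", "Click here")
# Normalise case and surrounding whitespace before comparing
clean <- tolower(trimws(link_text))

# Descriptive: the text is not one of the uninformative phrases
descriptive <- !(clean %in% uninformative)
# Length: the text is longer than one character
long_enough <- nchar(clean) > 1

data.frame(link_text, descriptive, long_enough)
```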

Value

a data frame with parsed information from xml2::url_parse() and columns of logical values indicating the tests that passed.
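
Because the tests are stored as logical columns, failing links can be found with ordinary data frame subsetting. The sketch below uses a hypothetical stand-in for the returned data frame; the column names are assumptions and may not match the real output.

```r
# Hypothetical stand-in for the returned data frame (column names assumed)
v <- data.frame(
  orig = c("https://example.com", "http://example.com", "javascript:void(0)"),
  known_protocol = c(TRUE, TRUE, FALSE),
  enforce_https = c(TRUE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

# Find the logical (test) columns, then keep rows where at least one test failed
test_cols <- names(v)[vapply(v, is.logical, logical(1))]
failed <- v[rowSums(!as.matrix(v[test_cols])) > 0, ]
failed$orig
```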

Note

At the moment, we do not test whether all links are reachable. This is a feature planned for the future.

This function is internal. Please use the methods for the Episode and Lesson classes.

See Also

Episode and Lesson for the methods that will throw warnings

Examples

l <- Lesson$new(lesson_fragment())
e <- l$episodes[[3]]
# Our link validators run a series of tests on links and images and return a 
# data frame with information about the links (via xml2::url_parse), along 
# with the results of the tests
v <- asNamespace('pegboard')$validate_links(e)
names(v)
v
# URL protocols -----------------------------------------------------------
# To avoid potentially malicious situations, we have an explicit list of
# allowed URI protocols, which can be found in the `allowed_uri_protocols`
# character vector:
asNamespace('pegboard')$allowed_uri_protocols
# note that we make an additional check for the http protocol.

# Creating Warnings from the table ----------------------------------------
# The validator does not produce any warnings or messages, but this data
# frame can be passed on to other functions that will throw them for us. We
# have a function that will throw a warning/message for each link that
# fails the tests. These messages are controlled by `link_tests` and 
# `link_info`.
asNamespace('pegboard')$link_tests
asNamespace('pegboard')$link_info
asNamespace('pegboard')$throw_link_warnings(v)

carpentries/pegboard documentation built on Nov. 13, 2024, 8:53 a.m.