w3c_markup_validate: Validate Markup of Web Documents using W3C Markup Validation...

View source: R/w3c.R

w3c_markup_validate    R Documentation

Validate Markup of Web Documents using W3C Markup Validation Services

Description

Check the markup validity of web documents in HTML, XHTML, etc., using a W3C Markup Validation service.

Usage

w3c_markup_validate(baseurl = w3c_markup_validate_baseurl(),
                    uri = NULL, file = NULL, string = NULL,
                    opts = list())

Arguments

baseurl

a character string giving the URL of the W3C Markup Validation service to employ.

uri

a character string giving the URI to validate.

file

a character string giving the path of a file to validate.

string

a character string with the markup to validate.

opts

a named list or curlOptions object with options to use for accessing the validation service via getURL (if uri is given) or postForm (if file or string is given).

Details

Exactly one of uri, file or string must be given.

Validation is then performed by the W3C Markup Validation service at the given URL, using the service's SOAP 1.2 API, which is still declared “experimental” (see https://validator.w3.org/docs/api.html for more information).
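As a rough sketch of the three mutually exclusive input modes, the calls below validate a URI, a local file and an inline string. The wrapper safe_validate() is a hypothetical helper (not part of the package), and the sample markup and temporary file are made up for illustration. The calls are wrapped in tryCatch() because the service may be unreachable.

```r
# Hypothetical wrapper (not part of the package): returns the validation
# result, or a condition object if the package or the service is unavailable.
safe_validate <- function(...) {
    if (!requireNamespace("W3CMarkupValidator", quietly = TRUE))
        return(simpleError("W3CMarkupValidator is not installed"))
    tryCatch(W3CMarkupValidator::w3c_markup_validate(...),
             error = identity)
}

# Illustrative markup, invented for this sketch.
markup <- "<!DOCTYPE html><html><head><title>t</title></head><body><p>Hi</p></body></html>"
path <- tempfile(fileext = ".html")
writeLines(markup, path)

# Supply exactly one of uri, file or string per call.
res_uri    <- safe_validate(uri = "https://CRAN.R-project.org")
res_file   <- safe_validate(file = path)
res_string <- safe_validate(string = markup)
```

Each result is either an object of class "w3c_markup_validate" or the caught condition, so inherits(res_uri, "condition") can be used to tell the two apart.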

If a SOAP validation response could be obtained, w3c_markup_validate() returns the information in the response organized into an object of class "w3c_markup_validate", which is a list with the following elements:

valid

a logical indicating the validity of the web document checked (TRUE iff there were no errors).

errorcount

an integer giving the number of errors found.

errors

a data frame with variables ‘line’, ‘col’, ‘message’, ‘messageid’, ‘explanation’ and ‘source’ with the obvious meanings, or NULL.

warningcount

an integer giving the number of warnings found.

warnings

a data frame with variables as for errors, or NULL.

This class has a print method for compactly summarizing the results, an inspect method for inspecting details, and an as.data.frame method for collapsing the errors and warnings into a “flat” data frame useful for further analyses.
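To illustrate how the elements listed above might be used, the following sketch works on a hand-built stand-in list rather than real validator output. The rows and the helper error_locations() are invented for illustration, and the column types of a real result may differ.

```r
# Hand-built stand-in mirroring the documented elements; the rows are
# made up and are not real validator output.
res <- list(
    valid = FALSE,
    errorcount = 2L,
    errors = data.frame(
        line = c(3L, 7L),
        col = c(10L, 1L),
        message = c("made-up error A", "made-up error B"),
        messageid = c("a", "b"),
        explanation = c("", ""),
        source = c("<p>", "</div>"),
        stringsAsFactors = FALSE),
    warningcount = 0L,
    warnings = NULL)

# Hypothetical helper: format each error as "line:col message".
# errors (like warnings) may be NULL, so guard before indexing.
error_locations <- function(x) {
    if (is.null(x$errors)) return(character(0))
    paste0(x$errors$line, ":", x$errors$col, " ", x$errors$message)
}

locs <- error_locations(res)
```

Guarding against NULL matters because a valid document has no errors data frame to index into.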

Note

The validation service that the W3C provides, which is used by default, is a shared and free resource. The W3C asks (see https://validator.w3.org/docs/api.html) for considerate use, and suggests installing a local instance of the validation service where possible: excessive use of the service will be blocked. In fact, it seems that since May 2015 the W3C has blocked access to the SOAP API, so one needs to use a different (local) validation service.

On Debian-based systems, a local instance can conveniently be installed by running the system command apt-get install w3c-markup-validator and then following the instructions for providing the validator as a web service.

One can use the environment variable W3C_MARKUP_VALIDATOR_BASEURL to specify the service to be employed by default. E.g., one can set this to "http://localhost/w3c-validator/check" for Debian-based systems as discussed above.
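A minimal sketch of pointing R at a local service is given below. It assumes the Debian-style URL from the note above. When exactly the package reads the variable is not stated here, so setting it before the package is loaded (for example in .Renviron) is probably the safest choice.

```r
# URL taken from the note above; adjust it to the local installation.
Sys.setenv(W3C_MARKUP_VALIDATOR_BASEURL = "http://localhost/w3c-validator/check")
local_url <- Sys.getenv("W3C_MARKUP_VALIDATOR_BASEURL")

# Alternatively, pass the service URL explicitly through the documented
# baseurl argument (left commented out, as it needs a running service):
# w3c_markup_validate(baseurl = local_url, uri = "https://CRAN.R-project.org")
```

Passing baseurl explicitly avoids relying on the environment and makes the chosen service visible at the call site.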

See Also

w3c_markup_validate_baseurl for getting and setting the URL of the validation service.

w3c_markup_validate_db for combining and analyzing collections of single validation results.

Examples

## Not much to show with this as it should validate ok
## (provided that the validation service is accessible):
tryCatch(w3c_markup_validate(uri = "https://CRAN.R-project.org"),
         error = identity)

W3CMarkupValidator documentation built on Feb. 16, 2023, 7:09 p.m.