

Transform XML Documents with XSLT Stylesheets in R

The functions used in the example further down are read_xslt, which reads a stylesheet, and xslt_transform, which applies it to a document.

You will need libxml2, libxslt, lxsltwrapp and lxmlwrapp installed. The first two are easy to install with apt-get, brew or yum. The latter two are best installed from http://vslavik.github.io/xmlwrapp/, and will eventually be included with the package as it matures. xmlwrapp is extremely lightweight and compiles well on Linux and Mac OS X. There is also a Cygwin port that may be of use on Windows.

The package has been designed to work nicely with xml2 workflows: it accepts objects from xml2::read_xml and xml2::read_html and returns similar objects (if the XSLT output method is xml or html).
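As a rough illustration of that interoperability, the sketch below runs a transform and then queries the result with ordinary xml2 functions. It assumes that xslt_transform returns an xml2-compatible document when the output method is xml, as described above. The input document, the stylesheet and the element names are invented for the example, and the transform is skipped unless both packages (and the two functions named in this README) are available.

```r
# Illustrative input and stylesheet (both invented for this sketch)
xml_src <- "<people><person>Ada</person><person>Grace</person></people>"
xslt_src <- '<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml"/>
  <xsl:template match="/people">
    <names>
      <xsl:for-each select="person">
        <name><xsl:value-of select="."/></name>
      </xsl:for-each>
    </names>
  </xsl:template>
</xsl:stylesheet>'

# Only run when xml2 and this package's functions can actually be found
have_pkgs <- requireNamespace("xml2", quietly = TRUE) &&
  requireNamespace("xslt", quietly = TRUE) &&
  all(c("read_xslt", "xslt_transform") %in% getNamespaceExports("xslt"))

if (have_pkgs) {
  doc <- xml2::read_xml(xml_src)
  xsl <- xslt::read_xslt(xslt_src)
  res <- xslt::xslt_transform(doc, xsl)

  # Assumption: res behaves like an xml2 document, so xml2 queries apply
  name_nodes <- xml2::xml_find_all(res, "//name")
  print(xml2::xml_text(name_nodes))
}
```

If the assumption holds, the transformed document can be passed straight to the rest of an xml2 pipeline without re-parsing.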

Unlike Sxslt and SXalan (both pretty much defunct with some fairly easy-to-generate memory bugs), xslt provides one real function: the ability to use XSLT stylesheets in XML processing workflows. This was born out of the desire to use XSLT transformations to extract just the salient text from scraped web sites, similar to the way readability works (but more for just clean text extraction versus making the text actually human readable and "pretty").

News

See CHANGELOG

Installation

devtools::install_github("hrbrmstr/xslt")

Usage

library(xslt)

# current version
packageVersion("xslt")

Test Results

library(xslt)
library(testthat)
library(xml2)

date()

xml_src <- "<test/>"
xslt_src <- '<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/">
    <article>
      <title>Hello World</title>
    </article>
  </xsl:template>
</xsl:stylesheet>'

doc <- read_xml(xml_src)
xsl <- read_xslt(xslt_src)

res <- xslt_transform(doc, xsl)
cat(as.character(res))


test_dir("tests/")

Code of Conduct

Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.



hrbrmstr/xslt documentation built on May 17, 2019, 5:54 p.m.