View source: R/unescape_markup.R
unescape_markup | R Documentation |
This is a minor modification of http://stackoverflow.com/questions/5060076/convert-html-character-entity-encoding-in-r, and all credit is due to the author of that answer.
This function will call either xml2::read_xml() or xml2::read_html(), depending on the value passed to the what_ml argument. The default, if not specified, is html.
If called with iconv_encoding == TRUE, x is processed by iconv(), which may or may not change x. Both in the spirit of minimizing surprises and to allow an early return when no unescaping is required, iconv_encoding is FALSE by default. In that case any args that would be passed to iconv() via ... are ignored.
unescape_markup(x, what_ml = c("html", "xml"), iconv_encoding = FALSE, ...)
x | A character vector; the input you wish to unescape. |
what_ml | One of "html" or "xml". |
iconv_encoding | A logical vector of length 1. Should the input be processed via iconv()? |
... | Optional. Additional args to iconv(). |
Useful when dealing with '< >' enclosed parts of strings in a vector
A character vector the same length as x, with <x> unescaped. If no unescaping was required, x is returned as is, by default.
The xml2 functions this relies upon are not vectorized (this is a different use case, so no criticism is implied re: the functions themselves). The function handles vector inputs of length >1 through vapply(), and should maintain a reasonable level of performance by first subsetting only those elements of x where <.+> is present. If only a few elements of x require unescaping, performance should be acceptable: runtime grows with the number of elements that need processing, not solely as a function of length(x).
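The subset-then-vapply() strategy described above can be sketched roughly as follows. This is only an illustration, not the package's actual source: the helper names unescape_one and unescape_subset are hypothetical, and the per-element step assumes the approach from the linked Stack Overflow answer (wrapping the string in a dummy tag and parsing it with xml2).

```r
# Hypothetical per-element helper (an assumption, not the package's code):
# wrap the string in a dummy <x> tag, parse it as HTML, extract the text.
unescape_one <- function(s) {
  xml2::xml_text(xml2::read_html(paste0("<x>", s, "</x>")))
}

# Hypothetical vector wrapper: only process elements that contain "<...>".
unescape_subset <- function(x, f = unescape_one) {
  needs <- grepl("<.+>", x)       # TRUE where markup appears to be present
  if (!any(needs)) {
    return(x)                     # early return: nothing to unescape
  }
  # Apply the non-vectorized helper only to the elements that need it
  x[needs] <- vapply(x[needs], f, character(1), USE.NAMES = FALSE)
  x
}

# Only run the example call if xml2 is installed
if (requireNamespace("xml2", quietly = TRUE)) {
  unescape_subset(c("plain text", "<i>in-situ</i> electron microscopy"))
}
```

Because the expensive parse is applied only to x[needs], a long vector with a handful of marked-up elements costs little more than a single grepl() pass.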
x <- "<i>in-situ</i> electron microscopy"
unescape_markup(x)