xml_parse_data: Convert R parse data to XML
In xmlparsedata: Parse Data of 'R' Code as an 'XML' Tree



In recent R versions the parser can attach source code location information to the parsed expressions. This information is often useful for static analysis, e.g. code linting. It can be accessed via the getParseData function.

xml_parse_data(x, includeText = NA, pretty = FALSE)

`x`	an expression returned from `parse`, or a function or other object with source reference information
`includeText`	logical; whether to include the text of parsed items in the result
`pretty`	logical; whether to pretty-indent the XML output. Pretty-printing adds a small overhead that probably only matters for very large source files.

xml_parse_data converts this information to an XML tree. The R parser's token names are preserved in the XML as much as possible, but some of them are not valid XML tag names and are therefore renamed; see the xml_parse_token_map vector for the mapping.
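
As a small illustration of the renaming, the sketch below looks up two tokens in xml_parse_token_map. It assumes the vector is keyed by the R parser's token names exactly as they appear in the token column of getParseData (including the single quotes) and that its values are the XML tag names. If a lookup returns NA, inspect names(xml_parse_token_map) for the actual keys.

```r
library(xmlparsedata)

# Tokens as they appear in the `token` column of getParseData().
# Assumption: the map is keyed by these quoted token names.
r_tokens <- c("'('", "'+'")

# Look up the XML-safe tag names used by xml_parse_data()
xml_tags <- unname(xml_parse_token_map[r_tokens])
print(xml_tags)
```

Tokens that are already valid XML tag names, such as SYMBOL or NUM_CONST, appear unchanged in the XML output.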

The top XML tag is <exprlist>, a list of expressions, each of which is an <expr> tag. Each tag has attributes that define its location: line1, col1, line2, col2. These names are taken from the column names of the getParseData data frame.

See an example below. See also the README at https://github.com/r-lib/xmlparsedata#readme for examples of how to search the XML tree with the xml2 package and XPath expressions.
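
A minimal sketch of such a search, assuming the xml2 package is installed: it reads the XML string into an xml2 document, selects every SYMBOL node with an XPath expression, and collects each node's text together with its line1 and col1 location attributes. The variable names (doc, symbols, found) are illustrative only.

```r
library(xmlparsedata)
library(xml2)

code <- "function(a = 1, b = 2) {\n  a + b\n}\n"
expr <- parse(text = code, keep.source = TRUE)

# Parse the XML string produced by xml_parse_data() into an xml2 document
doc <- read_xml(xml_parse_data(expr))

# XPath: every SYMBOL node anywhere in the tree
symbols <- xml_find_all(doc, "//SYMBOL")

# Collect the symbol text and its source location attributes
found <- data.frame(
  text  = xml_text(symbols),
  line1 = as.integer(xml_attr(symbols, "line1")),
  col1  = as.integer(xml_attr(symbols, "col1")),
  stringsAsFactors = FALSE
)
print(found)
```

Because the location attributes are ordinary XML attributes, they can also be used inside the XPath expression itself, for example to restrict a search to a given line.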

Note that 'xml_parse_data()' silently drops all control characters (0x01-0x1f) from the input, except horizontal tab (0x09) and newline (0x0a), because they are invalid in XML 1.0.

xml_parse_data returns an XML string representing the parse data. See the details above.

See also xml_parse_token_map for the token names, and https://github.com/r-lib/xmlparsedata#readme for more information and use cases.

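library(xmlparsedata)

code <- "function(a = 1, b = 2) {\n  a + b\n}\n"
# keep.source = TRUE attaches the source references that both functions need
expr <- parse(text = code, keep.source = TRUE)

# The base R way:
getParseData(expr)

# The same information as a pretty-printed XML tree:
cat(xml_parse_data(expr, pretty = TRUE))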

   line1 col1 line2 col2 id parent          token terminal     text
31     1    1     3    1 31      0           expr    FALSE         
1      1    1     1    8  1     31       FUNCTION     TRUE function
2      1    9     1    9  2     31            '('     TRUE        (
3      1   10     1   10  3     31 SYMBOL_FORMALS     TRUE        a
4      1   12     1   12  4     31     EQ_FORMALS     TRUE        =
5      1   14     1   14  5      6      NUM_CONST     TRUE        1
6      1   14     1   14  6     31           expr    FALSE         
7      1   15     1   15  7     31            ','     TRUE        ,
9      1   17     1   17  9     31 SYMBOL_FORMALS     TRUE        b
10     1   19     1   19 10     31     EQ_FORMALS     TRUE        =
11     1   21     1   21 11     12      NUM_CONST     TRUE        2
12     1   21     1   21 12     31           expr    FALSE         
13     1   22     1   22 13     31            ')'     TRUE        )
28     1   24     3    1 28     31           expr    FALSE         
15     1   24     1   24 15     28            '{'     TRUE        {
23     2    3     2    7 23     28           expr    FALSE         
17     2    3     2    3 17     19         SYMBOL     TRUE        a
19     2    3     2    3 19     23           expr    FALSE         
18     2    5     2    5 18     23            '+'     TRUE        +
20     2    7     2    7 20     22         SYMBOL     TRUE        b
22     2    7     2    7 22     23           expr    FALSE         
26     3    1     3    1 26     28            '}'     TRUE        }
<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<exprlist>
  <expr line1="1" col1="1" line2="3" col2="1" start="26" end="76">
    <FUNCTION line1="1" col1="1" line2="1" col2="8" start="26" end="33">function</FUNCTION>
    <OP-LEFT-PAREN line1="1" col1="9" line2="1" col2="9" start="34" end="34">(</OP-LEFT-PAREN>
    <SYMBOL_FORMALS line1="1" col1="10" line2="1" col2="10" start="35" end="35">a</SYMBOL_FORMALS>
    <EQ_FORMALS line1="1" col1="12" line2="1" col2="12" start="37" end="37">=</EQ_FORMALS>
    <expr line1="1" col1="14" line2="1" col2="14" start="39" end="39">
      <NUM_CONST line1="1" col1="14" line2="1" col2="14" start="39" end="39">1</NUM_CONST>
    </expr>
    <OP-COMMA line1="1" col1="15" line2="1" col2="15" start="40" end="40">,</OP-COMMA>
    <SYMBOL_FORMALS line1="1" col1="17" line2="1" col2="17" start="42" end="42">b</SYMBOL_FORMALS>
    <EQ_FORMALS line1="1" col1="19" line2="1" col2="19" start="44" end="44">=</EQ_FORMALS>
    <expr line1="1" col1="21" line2="1" col2="21" start="46" end="46">
      <NUM_CONST line1="1" col1="21" line2="1" col2="21" start="46" end="46">2</NUM_CONST>
    </expr>
    <OP-RIGHT-PAREN line1="1" col1="22" line2="1" col2="22" start="47" end="47">)</OP-RIGHT-PAREN>
    <expr line1="1" col1="24" line2="3" col2="1" start="49" end="76">
      <OP-LEFT-BRACE line1="1" col1="24" line2="1" col2="24" start="49" end="49">{</OP-LEFT-BRACE>
      <expr line1="2" col1="3" line2="2" col2="7" start="53" end="57">
        <expr line1="2" col1="3" line2="2" col2="3" start="53" end="53">
          <SYMBOL line1="2" col1="3" line2="2" col2="3" start="53" end="53">a</SYMBOL>
        </expr>
        <OP-PLUS line1="2" col1="5" line2="2" col2="5" start="55" end="55">+</OP-PLUS>
        <expr line1="2" col1="7" line2="2" col2="7" start="57" end="57">
          <SYMBOL line1="2" col1="7" line2="2" col2="7" start="57" end="57">b</SYMBOL>
        </expr>
      </expr>
      <OP-RIGHT-BRACE line1="3" col1="1" line2="3" col2="1" start="76" end="76">}</OP-RIGHT-BRACE>
    </expr>
  </expr>
</exprlist>