xml_extract: XML Extract.

Description

Extract XML text or an XML attribute via XPath.

Usage

xml_extract(x, xpath, extract_type, extract_value = NULL,
  ret_var_name = NULL)

Arguments

x

XML document: a literal XML document, a URL, or a string.

xpath

A string containing an XPath 1.0 expression.

extract_type

Either "text" or "attr", selecting whether node text or an attribute value is extracted.

extract_value

The attribute whose value is to be extracted. Only needs to be set when extract_type = "attr".

ret_var_name

The variable name for the extracted value. When xml_extract is used as part of an assignment statement, ret_var_name should remain NULL. When xml_extract is used as part of a functional pipeline, it may be necessary to name the returned value.
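The two extraction types can be illustrated with a minimal sketch written directly against xml2. This is not the package's implementation: it assumes that "text" corresponds roughly to xml2::xml_text and "attr" to xml2::xml_attr applied to the result of xml_find_first. The sample document and variable names are hypothetical.

```r
library(xml2)

# Hypothetical sample document, not taken from the package.
doc <- read_xml('<book id="b1"><title lang="en">R Packages</title></book>')

# Roughly what extract_type = "text" is assumed to do:
# take the text content of the first node matching the XPath.
title_text <- xml_text(xml_find_first(doc, "/book/title"))

# Roughly what extract_type = "attr" with extract_value = "lang" is assumed to do:
# take one attribute of the first node matching the XPath.
title_lang <- xml_attr(xml_find_first(doc, "/book/title"), "lang")

print(title_text)
print(title_lang)
```

Only the first matching node is considered, so an XPath that matches several nodes still yields a single value.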

Details

May be used as a UDF within a dplyr pipeline. The simplest use is to call xml_extract inside dplyr::mutate. For details see vignette("chunked-invoke-rows-xml").

A more complex use is as a UDF that parses an arbitrary number of text and attribute values from an XML document. This can be done by holding the parameter values in a data frame, one row per value to extract, and iterating over the rows with purrr::pmap.
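The parameter data frame approach might look like the sketch below. To keep the example self-contained, extract_one is a hypothetical stand-in for xml_extract built on xml2; the XML string and the params data frame are also invented for illustration. The column names of params are chosen to match the function's argument names, which is how purrr::pmap pairs them up.

```r
library(xml2)
library(purrr)

# Hypothetical stand-in for xml_extract, written directly against xml2.
extract_one <- function(x, xpath, extract_type, extract_value = NULL) {
  node <- xml_find_first(read_xml(x), xpath)
  if (extract_type == "attr") xml_attr(node, extract_value) else xml_text(node)
}

# Hypothetical XML document supplied as a string.
xml_string <- '<order id="42"><customer tier="gold">Ada</customer></order>'

# One row per value to extract; extract_value is unused for "text" rows.
params <- data.frame(
  xpath         = c("/order/customer", "/order", "/order/customer"),
  extract_type  = c("text", "attr", "attr"),
  extract_value = c(NA, "id", "tier"),
  stringsAsFactors = FALSE
)

# pmap_chr calls extract_one once per row, passing x as a fixed extra argument.
values <- pmap_chr(params, extract_one, x = xml_string)
print(values)
```

Adding another value to extract then only requires adding a row to params, not another function call.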

Because xml_extract relies on xml_find_first, errors are consumed: an XPath expression that matches no node does not raise an error. This is helpful when iterating over a set of XML documents whose schemas are inconsistent.
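The behavior of xml_find_first on a non-matching XPath can be checked directly in xml2, as in the sketch below. It shows xml2 only, with a hypothetical document; how xml_extract passes the result on is an assumption and is not shown.

```r
library(xml2)

# Hypothetical document that has a <b> element but no <c> element.
doc <- read_xml("<a><b>present</b></a>")

# The XPath matches nothing; xml_find_first returns a missing node
# instead of signalling an error.
missing_node <- xml_find_first(doc, "/a/c")

# Text and attribute lookups on the missing node are expected to give NA.
missing_text <- xml_text(missing_node)
missing_attr <- xml_attr(missing_node, "id")

print(is.na(missing_text))
print(is.na(missing_attr))
```

When iterating over many documents, rows whose document lacks the requested node can then be identified afterwards by filtering on NA.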

Suggested resources for XPath are

Value

The extracted text or attribute value from an XML tag.

See Also

See vignette("chunked-invoke-rows") for usage.


curtisalexander/CRAmisc documentation built on May 14, 2019, 12:52 p.m.