View source: R/scraping_helpers.R
| node_which | R Documentation |
Find the positions of nodes in an xml_nodeset that match a regex pattern
node_which(nodelist, regex, inc = 0)
nodelist |
|
regex |
|
inc |
|
Returns a numeric scalar or vector.
library(rvest)
library(scrapurrr)
# Let's suppose we want to know the owner of "Alfreds Futterkiste":
html = "<table>
<tr>
<th>Company</th>
<th>Contact</th>
<th>Country</th>
</tr>
<tr>
<td>Alfreds Futterkiste</td>
<td>Maria Anders</td>
<td>Germany</td>
</tr>
<tr>
<td>Centro comercial Moctezuma</td>
<td>Francisco Chang</td>
<td>Mexico</td>
</tr>
</table>" %>%
read_html()
# Searching for `td` elements returns an xml_nodeset:
html_elements(x = html, "td")
# Of course we could match by position, but the position may not be fixed if
# we have many tables. Let's use `node_which()`: since the "owner" is always
# two rows behind the "company", we increment by 2:
html_elements(x = html, "td") %>%
node_which("Alfreds Futterkiste", inc = 2)
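To make the role of `inc` concrete, here is a minimal sketch of how a helper like this might work. It is not the package's actual source: `node_which_sketch()` is a hypothetical name, and the assumption is that the function matches `regex` against each node's text and adds `inc` to the matching positions.

```r
library(rvest)

# Hypothetical stand-in for node_which(); the real implementation in
# R/scraping_helpers.R may differ.
node_which_sketch <- function(nodelist, regex, inc = 0) {
  # Extract the text of every node, keep the positions whose text matches
  # the regex, then shift those positions by `inc`.
  which(grepl(regex, html_text(nodelist))) + inc
}

page <- read_html(
  "<table><tr><td>Alfreds Futterkiste</td><td>Maria Anders</td><td>Germany</td></tr></table>"
)
cells <- html_elements(page, "td")

# Position of the matching cell itself (assumed behaviour with inc = 0):
node_which_sketch(cells, "Alfreds Futterkiste")

# The same position shifted by two:
node_which_sketch(cells, "Alfreds Futterkiste", inc = 2)
```

Under these assumptions the result can be used to subset the nodeset, for example `cells[node_which_sketch(cells, "Alfreds Futterkiste", inc = 2)]`, so a cell is located relative to a known label rather than by a hard-coded index.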