View source: R/scraping_helpers.R
node_which | R Documentation |
Find the positions of nodes in an xml_nodeset that match a regex pattern
node_which(nodelist, regex, inc = 0)
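nodelist | An xml_nodeset to search.
regex | A regex pattern to match against the nodes.
inc | Increment added to the matching positions (default 0).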
Returns a numeric scalar or vector.
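To illustrate when the result is a scalar and when it is a vector, here is a minimal base-R sketch. `node_which_sketch()` is a hypothetical stand-in, not the scrapurrr implementation: it assumes that positions come from a regex match on each node's string form and that `inc` is simply added to every matching position. A plain character vector stands in for an xml_nodeset so the snippet runs without rvest.

```r
# Hypothetical stand-in for node_which(); NOT the scrapurrr source.
# Assumption: `inc` is added to every position whose node matches `regex`.
node_which_sketch <- function(nodelist, regex, inc = 0) {
  which(grepl(regex, as.character(nodelist))) + inc
}

# A character vector stands in for an xml_nodeset of <td> cells.
cells <- c("<td>alpha</td>", "<td>beta</td>", "<td>alpha beta</td>")

one_match  <- node_which_sketch(cells, "^<td>beta")                 # one node matches
many_match <- node_which_sketch(cells, "alpha")                     # two nodes match
shifted    <- node_which_sketch(cells, "^<td>alpha</td>$", inc = 1) # match, then offset
no_match   <- node_which_sketch(cells, "gamma")                     # nothing matches

print(one_match)
print(many_match)
print(shifted)
print(length(no_match))
```

Under these assumptions a single match yields a scalar, several matches yield a vector, and no match yields a zero-length vector, so it may be worth checking the length of the result before using it as an index.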
library(rvest)
library(scrapurrr)

# Let's suppose we want to know the owner of "Alfreds Futterkiste":
html = "<table>
  <tr> <th>Company</th> <th>Contact</th> <th>Country</th> </tr>
  <tr> <td>Alfreds Futterkiste</td> <td>Maria Anders</td> <td>Germany</td> </tr>
  <tr> <td>Centro comercial Moctezuma</td> <td>Francisco Chang</td> <td>Mexico</td> </tr>
</table>" %>%
  read_html()

# Searching for `td` elements returns a list:
html_elements(x = html, "td")

# Of course we could match by position, but it may not be fixed if we have
# many tables. Let's use `node_which()`. Since the "owner" is always two rows
# behind the "company", we increment by 2:
html_elements(x = html, "td") %>%
  node_which("Alfreds Futterkiste", inc = 2)