View source: R/pc_extract_from_web.R
Facilitates extracting strings from web pages
url
A character vector of length one. URL or path to a local HTML file.
container
Defaults to NULL. If provided, it must be an HTML element such as "div", "span", etc.
container_class
Defaults to NULL. If provided, 'container' must also be given (and 'container_id' must be NULL). Only text found inside the provided combination of container/class will be extracted.
container_id
Defaults to NULL. If provided, 'container' must also be given (and 'container_class' must be NULL). Only text found inside the provided combination of container/id will be extracted.
container_instance
Defaults to NULL. If given, it must be an integer. If the selected element is found more than once on the same page, only the occurrence indicated by this integer is kept for further extraction.
subelement
Defaults to NULL. If provided, 'container' must also be given. Only text within elements of the given type under the chosen combination of container/container_class will be extracted. When given, it will typically be "p", to extract all p elements inside the selected div.
no_children
Defaults to FALSE, i.e. by default all subelements of the selected combination (e.g. a div with a given class) are extracted. If TRUE, only text found under the given combination (but not its subelements) will be extracted. Corresponds to the XPath string '/node()[not(self::div)]'.
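To make the selection arguments more concrete, the sketch below shows roughly what a container/class/subelement combination and the 'no_children' XPath string select. It is an illustration only, not the implementation of pc_extract_from_web: it assumes the xml2 package is installed, and the inline HTML and the class names "article" and "caption" are hypothetical.

```r
library(xml2)

# Hypothetical page: a div with class "article" holding some direct text,
# a nested div, and two paragraphs.
page <- read_html(paste0(
  '<html><body><div class="article">Lead text ',
  '<div class="caption">Caption</div>',
  '<p>First paragraph.</p><p>Second paragraph.</p>',
  '</div></body></html>'
))

# Roughly what container = "div", container_class = "article",
# subelement = "p" describes: all p elements inside the selected div.
paragraphs <- xml_text(xml_find_all(page, "//div[@class='article']//p"))

# The XPath string quoted for no_children, appended to the same container:
# child nodes of the selected div, skipping nested div elements.
direct_nodes <- xml_find_all(page, "//div[@class='article']/node()[not(self::div)]")
direct_text <- trimws(xml_text(direct_nodes))

print(paragraphs)
print(direct_text)
```

Note that, as written, the XPath expression only filters out nested div elements; other child nodes of the container (such as the p elements in this hypothetical page) are still matched.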
## Not run:
title <- pc_extract_from_web(url = "https://www.europeandatajournalism.eu/eng/News/Data-news/The-price-of-coastal-flood-mitigation-in-Europe", container = "h1")
## End(Not run)