View source: R/get_wiki_content.R
get_wiki_content | R Documentation |
A wrapper around WikipediR::get_page_content with some soft cleaning of the content and automatic handling of pages with a redirect (see the Details section). Furthermore, the function does not stop if the input includes erroneous page names; it simply skips them.
get_wiki_content(page_names, language = "en", project = "wikipedia", rm_bracket_length = 50)
page_names |
The names of the Wiki pages to retrieve content from (e.g., "Main_Page"). |
language |
By default "en". |
project |
By default "wikipedia". |
rm_bracket_length |
Maximum length (number of characters) of bracket content to be removed.
Square brackets and their enclosed content are removed if the content length is equal to or lower than this value.
By default 50. |
The content cleaning includes:
- removal of "non-text" sections (See_also|Notes_and_References|Notes|References|Further_reading|External_links)
- removal of HTML tags, line breaks, references, and other bracket content (e.g., [edit])
- harmonization of blanks
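The tag, bracket, and whitespace steps can be approximated in base R. The following is only a minimal sketch, not the package's actual implementation: `clean_wiki_text` is a hypothetical helper, the regular expressions are assumptions, and the removal of "non-text" sections is left out.

```r
# Hypothetical helper (not part of the package): approximates some cleaning steps.
clean_wiki_text <- function(x, rm_bracket_length = 50) {
  # Replace HTML tags with a blank so adjacent words do not merge
  x <- gsub("<[^>]+>", " ", x, perl = TRUE)
  # Remove square brackets whose content has at most rm_bracket_length characters
  bracket_pattern <- sprintf("\\[[^\\[\\]]{0,%d}\\]", rm_bracket_length)
  x <- gsub(bracket_pattern, " ", x, perl = TRUE)
  # Replace line breaks, then collapse runs of whitespace ("harmonize blanks")
  x <- gsub("[\r\n]+", " ", x, perl = TRUE)
  x <- gsub("\\s+", " ", x, perl = TRUE)
  trimws(x)
}

raw <- "<p>R is a language[1] for statistics.<br>\nIt is   free [edit]</p>"
cleaned <- clean_wiki_text(raw)
print(cleaned)
```

Bracket content longer than `rm_bracket_length` is left untouched in this sketch, which matches the documented meaning of the argument.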
If a requested page is a redirect, the function discards that page (and its associated name) and retrieves the redirect target instead.
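The skip-and-redirect behavior can be illustrated without network access. This is a rough sketch under stated assumptions: `fake_wiki`, `fetch_page`, and `get_content_sketch` are hypothetical stand-ins, the page texts are placeholders, and naming each result after the requested page is an assumption rather than a statement about the package.

```r
# Toy stand-in for the remote API (hypothetical placeholder data, no network)
fake_wiki <- list(
  "Energy star ratings" = list(redirect = "Energy Star"),
  "Energy Star"         = list(text = "text of Energy Star"),
  "Eco-sufficiency"     = list(text = "text of Eco-sufficiency")
)

# Hypothetical fetcher: raises an error for unknown page names
fetch_page <- function(name) {
  page <- fake_wiki[[name]]
  if (is.null(page)) stop("page not found: ", name)
  page
}

get_content_sketch <- function(page_names) {
  out <- character(0)
  for (name in page_names) {
    page <- tryCatch(fetch_page(name), error = function(e) NULL)
    if (is.null(page)) next              # skip erroneous names instead of stopping
    if (!is.null(page$redirect)) {       # follow a single redirect
      page <- tryCatch(fetch_page(page$redirect), error = function(e) NULL)
      if (is.null(page)) next
    }
    out[name] <- page$text               # assumption: keyed by the requested name
  }
  out
}

res <- get_content_sketch(c("Eco-sufficiency", "No_such_page", "Energy star ratings"))
print(res)
```

Wrapping each fetch in `tryCatch` is what lets one bad page name be skipped without aborting the whole call.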
A character vector with Wiki content.
content = get_wiki_content(c("S_(programming_language)", "Eco-sufficiency", "Energy star ratings"))
# [1] "S_(programming_language)"
# [1] "Eco-sufficiency"
# [1] "Energy star ratings"
# [1] "Energy Star"

# notice that "energy star ratings" is actually a page with a redirect
# the function replaces it by the respective redirect page
str(content)
# Named chr [1:3] "SParadigm multi-paradigm: imperative, object orientedDeveloper Rick Becker, Allan Wilks, John ChambersFirst"| __truncated__ ...
# - attr(*, "names")= chr [1:3] "S_(programming_language)" "Eco-sufficiency" "Energy star ratings"