Retrieve the page content of a random MediaWiki page

Description

random_page retrieves the DOM of a randomly selected MediaWiki page, as an HTML blob inside a JSON object.

Usage

random_page(language = NULL, project = NULL, domain = NULL,
  namespaces = NULL, as_wikitext = FALSE, limit = 1,
  clean_response = FALSE, ...)

Arguments

language

The language code of the project you wish to query, if appropriate.

project

The project you wish to query (for example, "wikiquote"), if appropriate. Should be provided in conjunction with language.

domain

As an alternative to a language and project combination, you can provide a domain (for example, "rationalwiki.org") to the URL constructor, which allows non-Wikimedia MediaWiki instances to be queried.

namespaces

The namespaces to consider pages from. By default, pages from any namespace are considered; alternatively, a numeric vector of accepted namespaces (as described in the MediaWiki namespace documentation) can be provided, and only pages within those namespaces will be considered.

as_wikitext

Whether to retrieve the wikimarkup (TRUE) or the HTML (FALSE). Set to FALSE by default.

limit

The number of pages to return. 1 by default.

clean_response

Whether to do some basic sanitising of the resulting data structure. Set to FALSE by default.

...

Further arguments to pass to httr's GET.
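
The arguments above can be combined as in the following sketch. It assumes the WikipediR package (which provides random_page) and httr are installed and that network access is available. The wrapper name fetch_random_wikitext and its parameters are hypothetical and not part of the package. Every formal argument before ... is named explicitly so that the unnamed httr::timeout() object is not matched positionally to language.

```r
# Hypothetical wrapper, not part of the package: fetch `n` random pages as
# wikimarkup from a MediaWiki instance identified by its domain.
fetch_random_wikitext <- function(domain, n = 1, namespaces = NULL,
                                  timeout_secs = 10) {
  WikipediR::random_page(
    language       = NULL,
    project        = NULL,
    domain         = domain,        # e.g. "rationalwiki.org"
    namespaces     = namespaces,    # NULL means any namespace
    as_wikitext    = TRUE,          # wikimarkup rather than HTML
    limit          = n,
    clean_response = FALSE,
    httr::timeout(timeout_secs)     # unnamed, so it is forwarded via `...`
  )
}

# Usage (performs a network request):
# pages <- fetch_random_wikitext("rationalwiki.org", n = 3, namespaces = 0)
```

Naming all seven formals is a design choice rather than a requirement of the function: R fills unmatched formals positionally before collecting the rest into ..., so an unnamed config object would otherwise be taken as language.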

See Also

page_content for retrieving the content of a specific page, revision_diff for retrieving 'diffs' between revisions, revision_content for retrieving the text of specified revisions.

Examples

# Load the package that provides random_page
library(WikipediR)

# A page from Wikipedia
wp_content <- random_page("en", "wikipedia")

# A page from the mainspace on Wikipedia
wp_article_content <- random_page("en", "wikipedia", namespaces = 0)
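
A further sketch, under the same assumptions (WikipediR installed, network access available), requests several pages from more than one namespace and prints a compact overview of the returned object. The helper name peek_random_pages is hypothetical. No particular layout of the result is assumed here, so str() is used to inspect it rather than indexing into named fields.

```r
# Hypothetical helper, not part of the package: request `n` random pages
# restricted to the given namespaces and show the top levels of the result.
peek_random_pages <- function(language = "en", project = "wikipedia",
                              namespaces = c(0, 14), n = 2) {
  pages <- WikipediR::random_page(
    language   = language,
    project    = project,
    namespaces = namespaces,  # 0 = main/article space, 14 = Category
    limit      = n
  )
  str(pages, max.level = 2)   # overview only; deeper levels are elided
  invisible(pages)            # return the full object without printing it
}

# Usage (performs a network request):
# pages <- peek_random_pages()
```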