A package for parsing, chunking and modifying wikimarkup in R.
Author: Oliver Keyes
License: MIT
Status: In development
Wikimarkup is the language used on Wikipedia and similar projects, and as such contains
a lot of valuable data both for scientists studying collaborative systems and people
studying things documented on or in Wikipedia.
mwparser parses wikimarkup, allowing a
user to filter down to specific types of tags such as links or templates, and then extract components of those tags.
    library(mwparser)
    library(magrittr)

    wikitext <- "this is wikitext with \n [[a|link]] [[or|two]]"
    link_paths <- parse_wikitext(wikitext) %>%
      get_wikilinks %>%
      wikilink_paths(text = TRUE)
    link_paths
    # "a" "or"
    # In the terminal
    pip install mwparserfromhell

    # In R
    install.packages("reticulate")
    devtools::install_github("ropenscilabs/mwparser")
With that, you're good to go!
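If loading the package fails, it can help to confirm that R can actually see the Python dependency. The snippet below is a minimal sketch and not part of mwparser: the helper name `python_dep_available` is hypothetical, and it relies only on reticulate's `py_module_available()`.

```r
# Hypothetical helper (not part of mwparser): reports whether the Python
# module that mwparser depends on can be found from this R session.
python_dep_available <- function(module = "mwparserfromhell") {
  # Without reticulate there is no bridge to Python at all.
  if (!requireNamespace("reticulate", quietly = TRUE)) {
    return(FALSE)
  }
  # TRUE only if the module is present in the Python that reticulate uses.
  isTRUE(reticulate::py_module_available(module))
}

if (python_dep_available()) {
  message("mwparserfromhell found; mwparser should be usable.")
} else {
  message("mwparserfromhell not found; try 'pip install mwparserfromhell'.")
}
```

A `FALSE` result often means that pip installed the module into a different Python than the one reticulate is bound to; `reticulate::py_config()` shows which interpreter is in use.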
The library currently has accessors for extracting the most common types of tag, and the components within them. The next step is exposing the rest of
mwparserfromhell's functionality.
Some time after that the goal is to integrate MediaWiki's actual parser, as a replacement for the
mwparserfromhell dependency, using piton.
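Until more of mwparserfromhell is exposed through mwparser, the Python library can be called directly from R via reticulate. The following is a hedged sketch of that approach, not mwparser's own API: the helper `template_names` is hypothetical, while `parse()`, `filter_templates()` and the `name` attribute belong to mwparserfromhell itself.

```r
library(reticulate)

# Hypothetical helper (not part of mwparser): returns the names of all
# templates found in a string of wikitext, by calling mwparserfromhell directly.
template_names <- function(wikitext) {
  if (!py_module_available("mwparserfromhell")) {
    stop("The Python module 'mwparserfromhell' is not installed.")
  }
  mwp <- import("mwparserfromhell")

  # parse() returns a Wikicode object; filter_templates() returns a Python
  # list of Template objects, which reticulate hands back as an R list.
  templates <- mwp$parse(wikitext)$filter_templates()

  # Each template name is itself a Wikicode object, so convert it to text.
  vapply(templates, function(tpl) trimws(py_str(tpl$name)), character(1))
}

if (py_module_available("mwparserfromhell")) {
  print(template_names("Text with {{cite web|url=http://example.org}} and {{stub}}."))
}
```

Working against the Python objects directly skips whatever conveniences mwparser's accessors provide, so this is best treated as a stopgap for functionality the package does not yet cover.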