In jkeast/wikitablr: Simple Reader for Wikipedia's Tables

wikitablr

wikitablr is an R package that provides tools for scraping tables from Wikipedia and cleaning up common formatting issues. The intention is to empower beginners to explore data on practically any subject that interests them (as long as there is a Wikipedia table on it), but anyone can use this package.

wikitablr takes data that looks like this:

library(wikitablr)
head(read_wiki_raw("https://en.wikipedia.org/wiki/List_of_songs_recorded_by_the_Beatles", 2))

and makes it look like this:

head(read_wiki_table("https://en.wikipedia.org/wiki/List_of_songs_recorded_by_the_Beatles", 2))

Installation

You can install the development version from GitHub with:

# install.packages("devtools")
devtools::install_github("jkeast/wikitablr")

Example

wikitablr allows you to either read in and clean your data all at once:

# read in the first table on the page
head(read_wiki_table("https://en.wikipedia.org/wiki/List_of_colleges_and_universities_in_Massachusetts"))

Or use its cleaning functions separately:

# read in the first table from the url without cleaning it
example <- read_wiki_raw("https://en.wikipedia.org/wiki/List_of_colleges_and_universities_in_Massachusetts")
head(example)

# clean the column names of the table
example <- clean_wiki_names(example)

head(example)

# remove footnotes from the table
head(remove_footnotes(example))
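
To give a feel for the kind of cleaning that clean_wiki_names() and remove_footnotes() perform, here is a minimal base-R sketch. It is not the package's implementation: the helpers strip_footnotes() and tidy_names() are hypothetical, the data frame is a made-up placeholder, and the sketch assumes footnotes appear as bracketed markers such as [1] or [a].

```r
# Placeholder standing in for a raw scraped table; all values are made up
raw <- data.frame(
  "School[1]" = c("College A[a]", "College B"),
  "Enrollment (2019)" = c("1,200[2]", "3,400"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Hypothetical helper: drop bracketed markers such as [1] or [a]
strip_footnotes <- function(x) {
  trimws(gsub("\\[[^]]*\\]", "", x))
}

# Hypothetical helper: lower-case the names, replace runs of
# non-alphanumeric characters with "_", and trim leading/trailing "_"
tidy_names <- function(x) {
  x <- tolower(strip_footnotes(x))
  x <- gsub("[^a-z0-9]+", "_", x)
  gsub("^_+|_+$", "", x)
}

cleaned <- raw
names(cleaned) <- tidy_names(names(cleaned))
# Apply the footnote stripper to every (character) column
cleaned[] <- lapply(cleaned, strip_footnotes)

print(cleaned)
```

The real functions may handle more cases than this (for example, other footnote styles or special characters), so treat the sketch only as an illustration of the idea.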

Read all tables

wikitablr also allows you to read in and clean all tables on a page at once, putting them into a list.

This is the first table on the Wikipedia page "List of World Series champions":

example2 <- read_all_tables("https://en.wikipedia.org/wiki/List_of_World_Series_champions")

head(example2[[1]])

and the rest:

head(example2[[2]])

head(example2[[3]])

head(example2[[4]])
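
Because the tables come back as a list, it can be handy to get an overview of all of them before looking at each one with head(). The sketch below uses a made-up placeholder list in place of the result of read_all_tables(), and assumes each element is a data frame; the names tables and overview are illustrative only.

```r
# Placeholder list standing in for the result of read_all_tables();
# the contents are made up
tables <- list(
  data.frame(year = c(1, 2, 3), winner = c("a", "b", "c"),
             stringsAsFactors = FALSE),
  data.frame(team = c("x", "y"), wins = c(2, 1),
             stringsAsFactors = FALSE)
)

# One row per table: its position in the list, row count, and column names
overview <- data.frame(
  table = seq_along(tables),
  rows = vapply(tables, nrow, integer(1)),
  columns = vapply(tables, function(t) paste(names(t), collapse = ", "),
                   character(1)),
  stringsAsFactors = FALSE
)

print(overview)
```

With a real page, replacing tables with example2 should give the same kind of overview, provided every element of the list is a data frame.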

jkeast/wikitablr documentation built on March 7, 2020, 7:48 a.m.
