wikisource_book: Download a book from Wikisource


View source: R/wikisource_book.R

Description

Download a book from Wikisource into a data frame, given the URL of its table of contents page. The table of contents page should link to all the Wikisource pages that make up the book. The text of each page is downloaded with the wikisource_page() function.

Usage

wikisource_book(url, cleaned = TRUE)

Arguments

url

The URL of a Wikisource table of contents page that lists the pages making up the book.

cleaned

A logical value indicating whether the downloaded Wikisource pages should be cleaned. Defaults to TRUE.

Details

The download can fail if the Wikisource paths listed on the content page differ strongly from the URL path of the content page itself.
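
Because a download can fail, it may help to guard the call when fetching several books in one run. The following is a minimal sketch, not part of wikisourcer: safe_wikisource_book() and its downloader argument are hypothetical names introduced here for illustration. It assumes that a failed download signals an ordinary R error.

```r
# Hypothetical helper (not part of wikisourcer): returns NULL instead of
# stopping when the download signals an error.
# `downloader` defaults to wikisourcer::wikisource_book; the argument exists
# only so that the wrapper can be tried without network access.
safe_wikisource_book <- function(url,
                                 downloader = wikisourcer::wikisource_book) {
  tryCatch(
    downloader(url),
    error = function(e) {
      # Report which URL failed and why, then carry on
      message("Download failed for ", url, ": ", conditionMessage(e))
      NULL
    }
  )
}
```

A NULL result can then be filtered out before combining books, so one bad content page does not abort the whole run.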

Value

A five-column tbl_df (a type of data frame; see the tibble or dplyr packages) with one row for each line of the text or texts, and the following columns:

text

A character column

title

A character column with the title of the Wikisource summary page

page

An integer column numbering the text from each downloaded Wikisource page

language

A character column with a two-letter string referring to the language of the text

url

A character column with the url of the Wikisource page of the text
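
Since each row holds one line of text, the page column can be used to group lines back into pages. The sketch below uses base R only and runs on a small stand-in data frame with the five columns described above. Its values are made up for illustration and are not real Wikisource output.

```r
# Stand-in for the result of wikisource_book(); the values are placeholders,
# only the column layout follows the description above.
book <- data.frame(
  text = c("line a", "line b", "line c"),
  title = "Example title",
  page = c(1L, 1L, 2L),
  language = "en",
  url = c("https://example.org/1", "https://example.org/1",
          "https://example.org/2"),
  stringsAsFactors = FALSE
)

# Number of text lines per downloaded page
lines_per_page <- table(book$page)

# Collapse the lines of each page into a single string
page_text <- tapply(book$text, book$page, paste, collapse = "\n")
```

The same grouping could be written with dplyr verbs on the returned tbl_df; base R is used here only to keep the sketch free of dependencies.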

Examples

## Not run: 

# download Voltaire's "Candide"
wikisource_book("https://en.wikisource.org/wiki/Candide")

# download "Candide" in French and Spanish
library(purrr)

fr <- "https://fr.wikisource.org/wiki/Candide,_ou_l%E2%80%99Optimisme/Garnier_1877"
es <- "https://es.wikisource.org/wiki/C%C3%A1ndido,_o_el_optimismo"
books <- map_df(c(fr, es), wikisource_book)

## End(Not run)

lgnbhl/wikisourcer documentation built on Oct. 4, 2020, 11:10 p.m.