gutenberg_strip: Strip header and footer content from a Project Gutenberg book

Description Usage Arguments Examples

Description

Strip header and footer content from a Project Gutenberg book. This is based on some formatting guesses so it may not be perfect. It will also not strip tables of contents, prologues, or other text that appears at the start of a book.

Usage

1

Arguments

text

A character vector with lines of a book

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
library(dplyr)
book <- gutenberg_works(title == "Pride and Prejudice") %>%
  gutenberg_download(strip = FALSE)

head(book$text, 10)
tail(book$text, 10)

text_stripped <- gutenberg_strip(book$text)

head(text_stripped, 10)
tail(text_stripped, 10)


Search within the gutenbergr package
Search all R packages, documentation and source code

Questions? Problems? Suggestions? or email at ian@mutexlabs.com.

Please suggest features or report bugs with the GitHub issue tracker.

All documentation is copyright its authors; we didn't write any of that.