pubcrawl

Convert 'epub' Files to Text

Description

The 'epub' file format is really just a structured 'ZIP' archive with metadata, graphics and (usually) 'HTML' text. Tools are provided to turn an 'epub' file into a tidy data frame.
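Because an 'epub' is a 'ZIP' archive, you can peek inside one with base R before converting it. The sketch below is only an illustration and is not part of the package: `list_epub_contents()` is a hypothetical helper name, and the file-extension check for 'HTML' content is an assumption about how the text files are usually named.

```r
# Hypothetical helper (not part of pubcrawl): list the files packed inside
# an epub by treating it as an ordinary ZIP archive.
list_epub_contents <- function(path) {
  stopifnot(file.exists(path))

  # utils::unzip(list = TRUE) reads the archive index without extracting
  entries <- utils::unzip(path, list = TRUE)

  # Assumption: the book text lives in .html/.htm/.xhtml entries
  entries$is_html <- grepl("\\.x?html?$", entries$Name, ignore.case = TRUE)

  entries[, c("Name", "Length", "is_html")]
}
```

For example, `list_epub_contents(system.file("extdat", "augustine.epub", package="pubcrawl"))` should show the metadata, graphics and text files of the bundled Project Gutenberg epub, assuming the package is installed.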

What's Inside The Tin

The following functions are implemented:

- epub_to_text: Convert an 'epub' file to a tidy data frame of text

NOTE

There are edge cases I've totally not covered yet. Feel free to jump in and make this a real, useful package!

TODO

Installation

devtools::install_github("hrbrmstr/pubcrawl")
options(width=120)

Usage

library(pubcrawl)
library(tidyverse)

# current version
packageVersion("pubcrawl")

An O'Reilly epub

epub_to_text("~/Data/R Packages.epub")

A Project Gutenberg epub that comes with the package

epub_to_text(system.file("extdat", "augustine.epub", package="pubcrawl")) %>% 
  mutate(path = abbreviate(path))

Code of Conduct

Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.



hrbrmstr/pubcrawl documentation built on May 16, 2019, 7:25 a.m.