
capek

An R Data Package with Karel Čapek's Novels

This package provides access to the full texts of six Czech novels by Karel Čapek, the Czech writer best known for his play R.U.R., which introduced the word "robot". It is closely modeled on Julia Silge's janeaustenr package. The package is intended to provide a non-English corpus for experimenting with tidy text analysis.

The plain text of each novel was downloaded from the Municipal Library of Prague.

There is also a function, capek_books(), that returns a tidy data frame of all six novels.

Installation

To install the package from GitHub, use the following:

# Requires devtools; run install.packages("devtools") first if it is missing
library(devtools)
install_github("simecek/capek")
library(capek)

Usage

library(capek)
library(dplyr)

# Count the number of rows (lines of text) in each novel
capek_books() %>%
  group_by(book) %>%
  summarise(total_lines = n())
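
Since the package is meant for experimenting with tidy text analysis, a natural next step is to split the text into words and count them per book. The following is only a sketch: count_words() is a hypothetical helper, not part of capek, and it assumes the data frame has a text column holding one line of a novel per row, as in janeaustenr. Only the book column is confirmed by the usage above. It also relies on the tidytext package, which this README does not otherwise mention.

```r
library(dplyr)
library(tidytext)

# Hypothetical helper (not part of capek): word frequencies per book.
# Assumes `books` has a `book` column and a `text` column (assumption,
# mirroring janeaustenr::austen_books()).
count_words <- function(books) {
  books %>%
    unnest_tokens(word, text) %>%    # one lowercased word per row
    count(book, word, sort = TRUE)   # highest counts first
}
```

With capek loaded, count_words(capek_books()) should then return one row per book and word, provided the column assumption holds.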


simecek/capek documentation built on May 20, 2019, 2:01 p.m.