gutenberg_metadata: Gutenberg metadata about each work

Description Usage Format Details See Also Examples

Description

Selected fields of metadata about each of the Project Gutenberg works. These were collected using the gitenberg Python package, particularly the pg_rdf_to_json function.

Usage

1

Format

A tbl_df (see tibble or dplyr) with one row for each work in Project Gutenberg and the following columns:

gutenberg_id

Numeric ID, used to retrieve works from Project Gutenberg

title

Title

author

Author, if a single one given. Given as last name first (e.g. "Doyle, Arthur Conan")

author_id

Project Gutenberg author ID

language

Language ISO 639 code, separated by / if multiple. Two letter code if one exists, otherwise three letter. See https://en.wikipedia.org/wiki/List_of_ISO_639-2_codes

gutenberg_bookshelf

Which collection or collections this is found in, separated by / if multiple

rights

Generally one of three options: "Public domain in the USA." (the most common by far), "Copyrighted. Read the copyright notice inside this book for details.", or "None"

has_text

Whether there is a file containing digits followed by .txt in Project Gutenberg for this record (as opposed to, for example, audiobooks). If not, cannot be retrieved with gutenberg_download

Details

To find the date on which this metadata was last updated, run attr(gutenberg_metadata, "date_updated").

See Also

gutenberg_works, gutenberg_authors, gutenberg_subjects

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
library(dplyr)
library(stringr)

gutenberg_metadata

gutenberg_metadata %>%
  count(author, sort = TRUE)

# look for Shakespeare, excluding collections (containing "Works") and translations
shakespeare_metadata <- gutenberg_metadata %>%
  filter(author == "Shakespeare, William",
         language == "en",
         !str_detect(title, "Works"),
         has_text,
         !str_detect(rights, "Copyright")) %>%
         distinct(title)

## Not run: 
shakespeare_works <- gutenberg_download(shakespeare_metadata$gutenberg_id)

## End(Not run)

# note that the gutenberg_works() function filters for English
# non-copyrighted works and does de-duplication by default:

shakespeare_metadata2 <- gutenberg_works(author == "Shakespeare, William",
                                         !str_detect(title, "Works"))

# date last updated
attr(gutenberg_metadata, "date_updated")

gutenbergr documentation built on May 19, 2017, 11:38 a.m.

Search within the gutenbergr package
Search all R packages, documentation and source code