A mixed-up collection of words drawn from different sections of two books.
A tibble with 108,657 observations, each a word in a document. This data set is designed to show how LDA can be used to separate a set of mixed documents into two distinct "topics" (or books).
word: Words from a given section within a book.
document: The book section ID that the word came from.
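Topic models such as LDA are usually fit on per-document word counts rather than on one-word-per-row data, so the two columns above typically need to be collapsed first. The following is a minimal, language-neutral sketch of that step, written in Python with only the standard library. The rows, section IDs, and function names are hypothetical stand-ins, not values or functions from this data set or its package.

```python
from collections import Counter, defaultdict

# Hypothetical rows in the same shape as the data set:
# one (word, document) pair per observation.
rows = [
    ("ship", "section_a_1"),
    ("sea", "section_a_1"),
    ("ship", "section_a_1"),
    ("letter", "section_b_1"),
    ("garden", "section_b_1"),
    ("sea", "section_b_1"),
]


def count_words_per_document(rows):
    """Collapse one-word-per-row data into a Counter of words per document."""
    counts = defaultdict(Counter)
    for word, document in rows:
        counts[document][word] += 1
    return counts


def to_document_term_matrix(counts):
    """Build a dense document-term matrix with sorted row and column labels."""
    vocabulary = sorted({word for c in counts.values() for word in c})
    documents = sorted(counts)
    # Counter returns 0 for words that never occur in a document.
    matrix = [[counts[d][w] for w in vocabulary] for d in documents]
    return documents, vocabulary, matrix


counts = count_words_per_document(rows)
documents, vocabulary, matrix = to_document_term_matrix(counts)
```

Each row of `matrix` corresponds to one book section and each column to one vocabulary word. A matrix of this shape is the kind of input an LDA implementation would generally expect before it assigns sections to topics.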
Data taken from two books from Project Gutenberg.