Associated Press data from the First Text Retrieval Conference (TREC-1), 1992, which has been processed by removing stop words, low-frequency words, and short documents.
The data set is an object of class "simple_triplet_matrix" provided by package slam.
It is a word-document matrix which contains the term frequency of 7000 words in 2134 documents.
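As a rough illustration of the storage format, the sketch below builds a tiny matrix of the same class with slam. The object name `toy`, the three words, and the frequencies are invented for the example and are not taken from the data set; it also assumes that rows are words and columns are documents, which the description suggests but does not state outright.

```r
library(slam)

# Hypothetical toy word-document matrix: 3 words (rows) x 4 documents (columns).
# Only the non-zero term frequencies are stored, as (i, j, v) triplets.
toy <- simple_triplet_matrix(
  i = c(1, 1, 2, 3, 3),   # row (word) index of each non-zero entry
  j = c(1, 3, 2, 1, 4),   # column (document) index of each non-zero entry
  v = c(2, 1, 4, 1, 3),   # term frequency of each non-zero entry
  nrow = 3, ncol = 4,
  dimnames = list(c("market", "court", "police"), paste0("doc", 1:4))
)

dim(toy)                        # number of words, number of documents
word_totals <- row_sums(toy)    # total frequency of each word over all documents
doc_lengths <- col_sums(toy)    # total word count of each document
dense <- as.matrix(toy)         # dense copy with explicit zeros
```

The same calls should apply to the full 7000 x 2134 object once it is loaded, although the dense copy from `as.matrix()` takes considerably more memory than the triplet form.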
Harman, D. (1992, November). Overview of the First Text REtrieval Conference (TREC-1). In TREC (Vol. 1992, pp. 1-20).