brown: Brown Corpus
In francojc/langdata: Practice Language Datasets

Description

A dataset containing the 1,155,866 tokenized words of a sample of American English, covering 15 genre categories.

Usage

brown

Format

A data frame with 223,506 rows and 11 variables:

document_id: ID for each corpus document
category: Label code for each of the 15 corpus categories
category_description: Description label for the corpus categories
words: Tokenized words from the corpus
pos: Part of speech label for each word in the corpus
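
The one-token-per-row layout described above lends itself to simple grouped summaries. The sketch below is illustrative only: it uses a small invented data frame, `brown_toy` (a hypothetical name), with the five documented columns, because the actual values of `document_id`, `category`, and `pos` in `brown` are not shown here. With the package loaded, `brown` would take the place of `brown_toy`.

```r
# Toy stand-in for `brown`: same five documented columns, invented rows.
# The IDs, category codes, and tags below are assumptions for illustration.
brown_toy <- data.frame(
  document_id = c("d01", "d01", "d01", "d02", "d02"),
  category = c("news", "news", "news", "fiction", "fiction"),
  category_description = c("News", "News", "News", "Fiction", "Fiction"),
  words = c("The", "jury", "said", "She", "smiled"),
  pos = c("AT", "NN", "VBD", "PPS", "VBD"),
  stringsAsFactors = FALSE
)

# Each row is one token, so counting rows per category gives token counts.
tokens_per_category <- table(brown_toy$category)

# Rebuild each document's running text by pasting its tokens in row order.
doc_text <- tapply(brown_toy$words, brown_toy$document_id,
                   paste, collapse = " ")

# Frequency of each part-of-speech tag, most frequent first.
pos_counts <- sort(table(brown_toy$pos), decreasing = TRUE)

print(tokens_per_category)
print(doc_text)
print(pos_counts)
```

Only base R is used here, so the same calls should carry over to the full data frame unchanged, provided the column names match those listed above.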

Source

http://www.nltk.org/nltk_data/

francojc/langdata documentation built on May 31, 2019, 2:48 p.m.
