README.md
In francojc/langdata: Practice Language Datasets

Language Data

A set of language datasets and the code that creates them. These datasets provide a starting point for data visualization, transformation and analysis.

Install from GitHub with devtools::install_github("francojc/langdata").
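
A minimal installation sketch follows. The only call taken from this README is devtools::install_github("francojc/langdata"). The check for devtools and the final data() listing are added here as assumptions about a typical setup, and no dataset names are assumed.

```r
# Install devtools first if it is not already available
# (assumption: a CRAN mirror is reachable from this session).
if (!requireNamespace("devtools", quietly = TRUE)) {
  install.packages("devtools")
}

# Install langdata from GitHub, as given in this README.
devtools::install_github("francojc/langdata")

# List the datasets the installed package provides,
# without assuming any particular dataset names.
data(package = "langdata")
```

The listing from data(package = "langdata") shows the dataset names and titles exactly as the package defines them.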

Switchboard Dialog Act Corpus

A dataset containing a corpus of spontaneous conversations among 440 speakers of American English, recorded in 1,115 individual conversations. The original corpus files and documentation are available from the Linguistic Data Consortium.

Brown Corpus

A dataset containing 1,155,866 word tokens from a sample of American English, spanning 15 genre categories. The original corpus files and documentation are available from the Natural Language Toolkit data repository.

...

francojc/langdata documentation built on May 31, 2019, 2:48 p.m.
