corpus: Text Corpus Analysis
Version 0.10.0

Text corpus data analysis, with full support for international text (Unicode). Functions for reading data from newline-delimited 'JSON' files, for normalizing and tokenizing text, for searching for term occurrences, and for computing term occurrence frequencies, including n-grams.
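As a rough illustration of the workflow described above, the following sketch reads a small newline-delimited JSON file, tokenizes the text, locates a term, and tabulates unigram and bigram counts. It assumes the corpus package is installed and that read_ndjson, text_tokens, text_locate, and term_stats behave as described in the package's reference manual. The sample records and the temporary file are made up for illustration.

```r
library(corpus)

# Write two made-up records to a temporary newline-delimited JSON file
# (illustrative data, not shipped with the package).
path <- tempfile(fileext = ".json")
writeLines(c('{"id": 1, "text": "The quick brown fox jumps."}',
             '{"id": 2, "text": "A quick brown dog sleeps."}'), path)

# Read the file; each line becomes one row.
data <- read_ndjson(path)
docs <- data$text

# Normalize and tokenize each document (one token vector per document).
tokens <- text_tokens(docs)

# Find occurrences of a term, together with surrounding context.
hits <- text_locate(docs, "quick")

# Count term occurrences, including 2-grams.
stats <- term_stats(docs, ngrams = 1:2)
print(stats)
```

The default token filter is used throughout; a custom text_filter can be passed to these functions if different normalization is wanted.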

Package details

Author: Patrick O. Perry [aut, cph, cre], Finn Årup Nielsen [cph, dtc] (AFINN Sentiment Lexicon), Martin Porter and Richard Boulton [ctb, cph, dtc] (Snowball Stemmer and Stopword Lists), The Regents of the University of California [ctb, cph] (Strtod Library Procedure), Carlo Strapparava and Alessandro Valitutti [cph, dtc] (WordNet-Affect Lexicon), Unicode, Inc. [cph, dtc] (Unicode Character Database)
Date of publication: 2017-12-12 22:10:07 UTC
Maintainer: Patrick O. Perry <[email protected]>
License: Apache License (== 2.0) | file LICENSE
Package repository: CRAN
Installation: Install the latest version of this package by entering install.packages("corpus") in R.

corpus documentation built on Dec. 13, 2017, 1:06 a.m.