cleanNLP: A Tidy Data Model for Natural Language Processing

Provides a set of fast tools for converting a textual corpus into a set of normalized tables. Users can choose among three back ends: 'udpipe', which has no external dependencies; a Python back end built on 'spaCy'; or the Java back end 'CoreNLP'. Exposed annotation tasks include tokenization, part-of-speech tagging, named entity recognition, entity linking, sentiment analysis, dependency parsing, coreference resolution, and word embeddings. Summary statistics on token unigram, part-of-speech tag, and dependency type frequencies are also included to assist with analyses.

Package details

Author: Taylor B. Arnold [aut, cre]
Maintainer: Taylor B. Arnold <[email protected]>
Package repository: View on CRAN
Installation: Install the latest version of this package by entering the following in R:
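
Since the package repository is listed as CRAN, a minimal sketch of the install step is the standard `install.packages()` call. This assumes a configured CRAN mirror and a working internet connection; the `library()` line is only there to confirm that the package loads.

```r
# Install the released version of cleanNLP from CRAN
# (assumes a CRAN mirror is configured for this R session).
install.packages("cleanNLP")

# Load the package to confirm the installation succeeded.
library(cleanNLP)
```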


cleanNLP documentation built on May 2, 2019, 12:11 p.m.