cleanNLP: A Tidy Data Model for Natural Language Processing
Version 1.9.0

Provides a set of fast tools for converting a textual corpus into a set of normalized tables. Users may use a Python back end with 'spaCy' or the Java back end 'CoreNLP'. A minimal back end with no external dependencies is also provided. Exposed annotation tasks include tokenization, part-of-speech tagging, named entity recognition, entity linking, sentiment analysis, dependency parsing, coreference resolution, and word embeddings. Summary statistics on token unigram, part-of-speech tag, and dependency type frequencies are also included to assist with analyses.

Package details

Author: Taylor B. Arnold [aut, cre]
Date of publication: 2017-05-27 15:08:04 UTC
Maintainer: Taylor B. Arnold
Package repository: CRAN
Installation: Install the latest version of this package by entering the following in R:

    install.packages("cleanNLP")

cleanNLP documentation built on May 30, 2017, 6:10 a.m.