cleanNLP-package | R Documentation
Provides a set of fast tools for converting a textual corpus into a set of normalized tables. Multiple NLP backends can be used, with the output standardized into a common normalized format. Options include stringi (very fast, but only provides tokenization), udpipe (fast, supports many languages, includes part-of-speech tags and dependencies), and spacy (Python backend; includes named entity recognition).
Once the package is set up, run one of cnlp_init_stringi, cnlp_init_spacy, or cnlp_init_udpipe to load the desired NLP backend. Once that function has finished running, use cnlp_annotate to run the annotation engine over a corpus of text. The package vignettes provide more detailed set-up information.
## Not run:
library(cleanNLP)

# load the annotation engine
cnlp_init_stringi()

# build the input corpus, one document per row
input <- data.frame(
  text = c(
    "This is a sentence.",
    "Here is something else to parse!"
  ),
  stringsAsFactors = FALSE
)

# annotate your text
anno <- cnlp_annotate(input)

## End(Not run)
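
The description above says the annotation output is standardized into a set of normalized tables. As a rough sketch of what inspecting that output might look like with the stringi backend, the snippet below annotates two short documents and looks at the result. The object name anno, the document ids, and the assumption that the result is a named list containing a token table keyed by doc_id are illustrative and should be checked against the installed version of the package.

```r
library(cleanNLP)

# load the lightweight stringi backend (tokenization only)
cnlp_init_stringi()

# two-document corpus; the doc_id values are made up for illustration
input <- data.frame(
  doc_id = c("doc1", "doc2"),
  text = c(
    "This is a sentence.",
    "Here is something else to parse!"
  ),
  stringsAsFactors = FALSE
)

# run the annotation engine; verbose = FALSE suppresses progress messages
anno <- cnlp_annotate(input, verbose = FALSE)

# assumption: the result is a named list of data frames
names(anno)

# assumption: one of those tables is called "token", with one row per token
head(anno$token)

# count tokens per document, assuming the token table has a doc_id column
table(anno$token$doc_id)
```

Other backends are expected to return the same kind of object with additional columns or tables, so code that only touches the token table should carry over with little change.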