A Tidy Data Model for Natural Language Processing

cleanNLP-package        cleanNLP: A Tidy Data Model for Natural Language Processing
cnlp_annotate           Run the annotation pipeline on a set of documents
cnlp_combine_documents  Combine a set of annotations
cnlp_download_corenlp   Download java files needed for CoreNLP
cnlp_download_udpipe    Download model files needed for udpipe
cnlp_extract_documents  Extract documents from an annotation object
cnlp_get_coreference    Access coreferences from an annotation object
cnlp_get_dependency     Access dependencies from an annotation object
cnlp_get_document       Access document meta data from an annotation object
cnlp_get_entity         Access named entities from an annotation object
cnlp_get_sentence       Access sentence-level annotations
cnlp_get_token          Access tokens from an annotation object
cnlp_get_vector         Access word embedding vector from an annotation object
cnlp_init_corenlp       Interface for initializing the corenlp backend
cnlp_init_spacy         Interface for initializing the spacy backend
cnlp_init_tokenizers    Interface for initializing the tokenizers backend
cnlp_init_udpipe        Interface for initializing the udpipe backend
cnlp_quick              Quickly Compute Data Frame of Annotations
cnlp_read_conll         Reads a CoNLL-U or CoNLL-X File
cnlp_read_csv           Read annotation files from disk
cnlp_utils_pca          Compute Principal Components and store as a Data Frame
cnlp_utils_phrase       Run the annotation pipeline on a set of documents
cnlp_utils_tfidf        Construct the TF-IDF Matrix from Annotation or Data Frame
cnlp_write_conll        Returns a CoNLL-U Document
cnlp_write_csv          Write annotation files to disk
dep_frequency           Universal Dependency Frequencies
obama                   Annotation of Barack Obama's State of the Union Addresses
pos_frequency           Universal Part of Speech Code Frequencies
print.annotation        Print a summary of an annotation object
renamed                 Renamed functions
un                      Universal Declaration of Human Rights
word_frequency          Most frequent English words
