process_document: Tokenize text using spaCy

process_document {spacyr}    R Documentation

Tokenize text using spaCy

Description

Tokenize text using spaCy. The result of tokenization is stored as a Python object; to obtain the tokens in R, use get_tokens(). See https://spacy.io.

Usage

process_document(x, multithread, ...)

Arguments

x

input text to be tokenized. Processing runs spaCy's pipeline, including part-of-speech tagging, named entity recognition, and dependency analysis; this slows down processing but speeds up later parsing. If these steps are disabled, tagging, entity recognition, and dependency analysis are performed only when the relevant functions are called.

multithread

logical; if TRUE, the processing is parallelized using spaCy's pipe functionality (https://spacy.io/api/pipe)

...

arguments passed to specific methods

Value

a result marker object; the tokenized results themselves are stored on the Python side and can be retrieved in R with get_tokens()

Examples

## Not run: 
spacy_initialize()
# spacy_initialize() must report that spaCy is ready before running the following
txt <- c(text1 = "This is the first sentence.\nHere is the second sentence.", 
         text2 = "This is the second document.")
results <- spacy_parse(txt)

## End(Not run)
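
The Description notes that tokens remain on the Python side until retrieved with get_tokens(). A minimal sketch of that round trip, assuming a working spaCy installation; the exact signature and return shape of get_tokens() are assumptions here, not confirmed by this page:

```r
## Not run:
library("spacyr")
spacy_initialize()

txt <- c(doc1 = "Tokenization keeps results in Python.",
         doc2 = "Retrieve them in R afterwards.")

# process_document() tokenizes and returns a result marker;
# the tokens themselves live in a Python object
marker <- process_document(txt, multithread = FALSE)

# get_tokens() retrieves the stored tokens into R
# (argument assumed to be the marker object)
toks <- get_tokens(marker)

spacy_finalize()
## End(Not run)
```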

quanteda/spacyr documentation built on April 13, 2024, 2:27 p.m.