View source: R/spacy_extract_nounphrases.R
| spacy_extract_nounphrases | R Documentation | 
This function extracts noun phrases from documents, based on the
noun_chunks attribute of document objects parsed by spaCy (see
https://spacy.io/usage/linguistic-features#noun-chunks).
spacy_extract_nounphrases(
  x,
  output = c("data.frame", "list"),
  multithread = TRUE,
  ...
)
x | 
 a character object or a TIF-compliant corpus data.frame (see https://github.com/ropenscilabs/tif)  | 
output | 
 type of returned object, either "data.frame" or "list"  | 
multithread | 
 logical; if TRUE, the processing is parallelized  | 
... | 
 unused  | 
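As a rough illustration of the two input forms accepted for x, the sketch below builds the same hypothetical two-document corpus both ways. It assumes the TIF convention that a corpus data.frame has doc_id as its first column and text as its second, both character. The object names corpus_df and txt are made up for this example, and the spacyr calls are left commented out because they need a working spaCy installation.

```r
# Hypothetical two-document corpus as a TIF-style corpus data.frame:
# first column doc_id, second column text, both of type character.
corpus_df <- data.frame(
  doc_id = c("doc1", "doc2"),
  text = c("Natural language processing is a branch of computer science.",
           "Paul earned a postgraduate degree from MIT."),
  stringsAsFactors = FALSE
)

# The same content as a named character vector, the other accepted form.
txt <- setNames(corpus_df$text, corpus_df$doc_id)

# Either object could then be passed as x
# (requires spacyr and a spaCy installation, so not run here):
# spacy_extract_nounphrases(corpus_df)
# spacy_extract_nounphrases(txt, output = "list")
```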
When the option output = "data.frame" is selected, the
function returns a data.frame with the following fields.
text: contents of the noun phrase
root_text: contents of the root token
start_id: serial number ID of the starting token. This number
corresponds to the token number in the data.frame returned by
spacy_tokenize(x) with default options.
root_id: serial number ID of the root token
length: number of words (tokens) included in the noun phrase (e.g.
for the noun phrase "individual car owners", length = 3)
either a list or data.frame of tokens
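To make the relationship between start_id, root_id and length concrete, here is a small base R sketch. The tokens and the nounphrases data.frame are written by hand for illustration and are not real spaCy output. The sketch also assumes that root_id uses the same token numbering as start_id and that all rows belong to a single document.

```r
# Hand-written tokens for one document (illustrative, not produced by spaCy).
tokens <- c("Paul", "earned", "a", "postgraduate", "degree", "from", "MIT", ".")

# Hand-written table with the fields described above.
nounphrases <- data.frame(
  text = c("Paul", "a postgraduate degree", "MIT"),
  root_text = c("Paul", "degree", "MIT"),
  start_id = c(1L, 3L, 7L),
  root_id = c(1L, 5L, 7L),
  length = c(1L, 3L, 1L),
  stringsAsFactors = FALSE
)

# ID of the last token in each phrase: start_id + length - 1.
nounphrases$end_id <- nounphrases$start_id + nounphrases$length - 1L

# Rebuild each phrase from the token span start_id..end_id.
rebuilt <- mapply(
  function(s, e) paste(tokens[s:e], collapse = " "),
  nounphrases$start_id, nounphrases$end_id
)

# Look up the root token of each phrase by its ID.
roots <- tokens[nounphrases$root_id]
```

In this toy data the rebuilt spans match the text column and the looked-up roots match root_text, which is the consistency the field descriptions imply.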
## Not run: 
spacy_initialize()
txt <- c(doc1 = "Natural language processing is a branch of computer science.",
         doc2 = "Paul earned a postgraduate degree from MIT.")
spacy_extract_nounphrases(txt)
spacy_extract_nounphrases(txt, output = "list")
## End(Not run)