parse: Parse CoreNLP output formats.

Description

Parse output from CoreNLP.

Usage

corenlp_parse_json(
  x,
  cols_to_keep = c("sentence", "index", "word", "pos", "ner"),
  output = NULL,
  logfile = NULL,
  progress = TRUE
)

corenlp_parse_xml(x)

Arguments

x

character vector, the JSON string(s) to be parsed

cols_to_keep

character vector, names of the columns to keep in the parsed output

output

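a destination file; if specified, parsed output is appended to this file; if NULL (the default), a data.frame is returned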

logfile

a character string naming the file to write an error log to; if provided, JSON strings are written to this file when parsing them fails

progress

logical, whether to report progress

Details

The JSON results of applying the Stanford CoreNLP annotators can be written to a streaming JSON file (NDJSON format). corenlp_parse_json parses a JSON string into a data.frame. If output is specified, the parsed output is appended to that file; if output is NULL, a data.frame is returned. Strings that cannot be parsed are written to logfile, if it is defined. If filename is present, the function processes one or more files containing Stanford CoreNLP output in NDJSON format. If the argument output has been defined during initialization, results are written or appended to that file; otherwise, a data.frame is returned.
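
The behaviour described above can be illustrated with a minimal sketch. It is not the bignlp implementation: parse_one_sketch and parse_sketch are hypothetical helpers, the JSON line is a hand-written example in the CoreNLP layout (sentences containing tokens), the jsonlite package is assumed to be installed, and the tab-separated format used for output is an assumption made for illustration only.

```r
library(jsonlite)

# Hand-written example lines in NDJSON style: one valid CoreNLP-like document
# and one truncated string that cannot be parsed.
good_line <- '{"sentences":[{"index":0,"tokens":[{"index":1,"word":"Hello","pos":"UH","ner":"O"},{"index":2,"word":"world","pos":"NN","ner":"O"}]}]}'
bad_line <- '{"sentences": ['

# Hypothetical helper: turn one JSON string into a token-level data.frame.
parse_one_sketch <- function(line, cols_to_keep) {
  doc <- fromJSON(line, simplifyVector = FALSE)
  per_sentence <- lapply(doc$sentences, function(s) {
    do.call(rbind, lapply(s$tokens, function(tok) {
      data.frame(
        sentence = s$index, index = tok$index, word = tok$word,
        pos = tok$pos, ner = tok$ner, stringsAsFactors = FALSE
      )
    }))
  })
  do.call(rbind, per_sentence)[, cols_to_keep, drop = FALSE]
}

# Hypothetical wrapper mirroring the documented arguments.
parse_sketch <- function(x,
                         cols_to_keep = c("sentence", "index", "word", "pos", "ner"),
                         output = NULL, logfile = NULL) {
  parsed <- lapply(x, function(line) {
    tryCatch(
      parse_one_sketch(line, cols_to_keep),
      error = function(e) {
        # Strings that fail to parse go to the log file, if one is given.
        if (!is.null(logfile)) cat(line, "\n", file = logfile, append = TRUE, sep = "")
        NULL
      }
    )
  })
  df <- do.call(rbind, parsed) # NULL entries (failed lines) are skipped
  if (is.null(output)) return(df)
  # Assumed output format: tab-separated rows appended to the file.
  write.table(df, file = output, append = TRUE, sep = "\t",
              quote = FALSE, row.names = FALSE, col.names = FALSE)
  invisible(output)
}

log_path <- tempfile(fileext = ".log")
tokens <- parse_sketch(c(good_line, bad_line), logfile = log_path)
print(tokens)
```

In this sketch, tokens holds one row per token of the valid line, and the truncated string ends up in the temporary log file instead of stopping the run.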


PolMine/bignlp documentation built on Jan. 29, 2021, 1:14 a.m.