parser: Parse Sentences


Description

A wrapper for NLP/openNLP's sentence parsing tools.

Usage

parser(text.var, engine = "openNLP",
  parse.annotator = easy_parse_annotator(),
  word.annotator = word_annotator(), java.path = "java",
  element.chunks = floor(2000 * (23.5/mean(sapply(text.var, nchar), na.rm =
  TRUE))))

Arguments

text.var

The text string variable.

engine

The backend part of speech tagger, either "openNLP" or "coreNLP". The default, "openNLP", uses the openNLP package. If the user has the Stanford CoreNLP suite (‘http://stanfordnlp.github.io/CoreNLP/’) installed, it can be used as the tagging backend instead.

parse.annotator

A parse annotator. See ?parse_annotator. Due to Java memory allocation limits, the user must generate the annotator and supply it directly to parser (only used if engine = "openNLP").

word.annotator

A word annotator (only used if engine = "openNLP").

java.path

The path to where Java is located (only used if engine = "coreNLP").

element.chunks

The number of elements to include in a chunk. Chunks are passed through lapply one at a time, and the chunk size is kept within a tolerance because of Java memory allocation during the tagging process.
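
As a rough illustration of the default above, the sketch below evaluates the element.chunks formula from Usage on a small hypothetical input and then groups element indices into chunks. The names txt, chunk_size, demo_size, and chunks are made up for this example, and the split() grouping is only an assumption about how such chunking could be done, not the package's internal code.

```r
# Hypothetical input: four short text elements (16, 23, 16, and 17 characters)
txt <- c(
    "He is my friend.",
    "Robots are rather evil.",
    "It smells great.",
    "I like chocolate."
)

# Default from Usage: the chunk size shrinks as the mean element length grows
mean_chars <- mean(sapply(txt, nchar), na.rm = TRUE)
chunk_size <- floor(2000 * (23.5 / mean_chars))

# Assumed chunking scheme: group element indices into runs of at most
# `demo_size` elements (a tiny size is used here so the split is visible)
demo_size <- 2
chunks <- split(seq_along(txt), ceiling(seq_along(txt) / demo_size))

chunk_size
chunks
```

Longer elements lower the default, so fewer elements are handed to Java in each pass; a smaller value can also be supplied directly through element.chunks.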

Value

Returns a list of character vectors of parsed sentences.

Examples

## Not run: 
txt <- c(
    "Really, I like chocolate because it is good. It smells great.",
    "Robots are rather evil and most are devoid of decency.",
    "He is my friend.",
    "Clifford the big red dog ate my lunch.",
    "Professor Johns can not teach",
    "",
    NA
)

## openNLP parser
if(!exists('parse_ann')) {
    parse_ann <- parse_annotator()
}

(x2 <- parser(txt, engine = 'openNLP',  parse.annotator = parse_ann))
dev.new()
par(
    mfrow = c(3, 2),
    mar = c(0,0,1,1) + 0.1
)
frame(); text(.5, .5, "openNLP", cex=2)
lapply(x2[1:5], plot)

## coreNLP parser
(x <- parser(txt, engine = "coreNLP"))

par(mar = c(0,0,0,.7) + 0.2)
plot(x[[2]])
par(
    mfrow = c(3, 2),
    mar = c(0,0,1,1) + 0.1
)
frame(); text(.5, .5, "coreNLP", cex=2)
lapply(x[1:5], plot)

## End(Not run)

trinker/parsent documentation built on May 31, 2019, 9:41 p.m.