organization_entity: Named Organization Recognition

Description

A wrapper for NLP/openNLP's named organization recognition annotation.

Usage

organization_entity(text.var, entity.annotator = "organization_annotator",
  word.annotator = word_annotator(), element.chunks = floor(2000 *
  (23.5/mean(sapply(text.var, nchar), na.rm = TRUE))))

Arguments

text.var

The text string variable.

entity.annotator

A character vector identifying an entity recognition annotator, one of c("person_annotator", "location_annotator", "organization_annotator", "date_annotator", "money_annotator", "percent_annotator"). Defaults to "organization_annotator". See ?annotators.

word.annotator

A word annotator.

element.chunks

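The number of elements to include in a chunk. Chunks are processed one at a time with lapply, and the chunk size is kept within a tolerance because the Java-based tagging process is memory-intensive. The default scales inversely with the mean number of characters per element of text.var, so longer texts are processed in smaller chunks.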
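The default for element.chunks can be worked through by hand. The sketch below is an illustration only: the input vector is hypothetical, and the index-based split at the end is an assumed way of forming chunks, not the package's internal code.

```r
# Hypothetical input: 5000 strings of 47 characters each.
text.var <- rep(strrep("a", 47), 5000)

# Default chunk size, copied from the Usage signature.
element.chunks <- floor(2000 * (23.5 / mean(sapply(text.var, nchar), na.rm = TRUE)))
element.chunks  # 1000 for this input

# Assumed illustration of chunking (not the package's internal code):
# group elements into consecutive chunks of at most element.chunks.
chunk_id <- ceiling(seq_along(text.var) / element.chunks)
chunks <- split(text.var, chunk_id)
length(chunks)  # 5 chunks of 1000 elements each
```

With a mean length of 23.5 characters the default is 2000 elements per chunk. Doubling the mean length halves the chunk size.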

Value

Returns a list of character vectors of named organizations, one element per element of text.var. Elements in which no organization is found are NULL.

See Also

Other variable functions: date_entity, location_entity, money_entity, percent_entity, person_entity

Examples

## Not run: 
data(presidential_debates_2012)

orgs <- organization_entity(presidential_debates_2012$dialogue)
unlist(orgs)

library(dplyr)
presidential_debates_2012$organizations <- organization_entity(presidential_debates_2012$dialogue)

presidential_debates_2012 %>%
    {.[!sapply(.$organizations, is.null), ]} %>%
    rowwise() %>%
    mutate(organizations = paste(organizations, collapse=", ")) %>%
    select(person, time, organizations)

library(tidyr)
presidential_debates_2012 %>%
    {.[!sapply(.$organizations, is.null), ]} %>%
    unnest(organizations) %>%
    select(person, time, organizations)

## End(Not run)

trinker/entity documentation built on May 31, 2019, 8:43 p.m.