named_entity: Named Entity Recognition


Description

A wrapper for the named entity recognition annotation tools in NLP/openNLP.

Usage

named_entity(text.var, entity.annotator, word.annotator = word_annotator(),
  element.chunks = floor(2000 * (23.5/mean(sapply(text.var, nchar), na.rm =
  TRUE))))

Arguments

text.var

The text string variable.

entity.annotator

A character vector identifying an entity recognition annotator, e.g., one of c("person_annotator", "location_annotator", "organization_annotator", "date_annotator", "money_annotator", "percent_annotator"). See ?annotators.

word.annotator

A word annotator.

element.chunks

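The number of elements of text.var to include in a chunk. Chunks are processed one at a time through lapply, and the chunk size is capped because the Java-based tagging process allocates memory in proportion to the amount of text it receives. By default the size shrinks as the mean number of characters per element grows.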
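The default for element.chunks can be worked through by hand. The sketch below uses base R only and does not call the package; the input vector and the splitting scheme are assumptions made for illustration, and the package's internal chunking may differ.

```r
# Hypothetical input: three text elements of 40, 47, and 54 characters.
text.var <- c(strrep("a", 40), strrep("b", 47), strrep("c", 54))

# Default chunk size, copied from the Usage signature.
default_chunks <- floor(
    2000 * (23.5 / mean(sapply(text.var, nchar), na.rm = TRUE))
)
default_chunks  # mean length is 47 characters, so 2000 * 0.5 = 1000

# Assumed splitting scheme (not taken from the package source):
# assign consecutive elements to groups of at most `element.chunks`.
element.chunks <- 2  # deliberately tiny so the split is visible
chunk_ids <- ceiling(seq_along(text.var) / element.chunks)
chunks <- split(text.var, chunk_ids)

# Stand-in for the tagging step: process one chunk at a time via lapply.
chunk_lengths <- lapply(chunks, nchar)
```

With longer elements the default yields fewer elements per chunk, which keeps the amount of text handed to Java in one call roughly constant.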

Value

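Returns a list with one element per element of text.var, each holding the named entities found in that element (NULL when none are found). Use unlist to flatten the result and plot to display entity frequencies.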
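The Examples call unlist() on the result and filter elements with is.null, which suggests a list-like object with NULL where no entity was found. Assuming that shape, entity frequencies can be tabulated in base R. The `entities` object below is a hand-made stand-in, not real output from named_entity.

```r
# Hypothetical stand-in for a named_entity() result: one element per
# text element, NULL where nothing was recognized.
entities <- list(c("Obama", "Romney"), NULL, "Obama")

# Which text elements contained at least one entity?
has_entity <- !sapply(entities, is.null)

# Flatten and count, most frequent first.
entity_freq <- sort(table(unlist(entities)), decreasing = TRUE)
entity_freq
```

unlist() drops the NULL elements, so only text elements with a match contribute to the counts.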

See Also

Maxent_Entity_Annotator

Examples

## Not run:
library(entity)
data(presidential_debates_2012)

## People mentioned in the dialogue
peoples <- named_entity(presidential_debates_2012$dialogue, 'person_annotator')
unlist(peoples)
plot(peoples)

## Organizations
orgs <- named_entity(presidential_debates_2012$dialogue, 'organization_annotator')
unlist(orgs)

## Dates
dates <- named_entity(presidential_debates_2012$dialogue, 'date_annotator')
unlist(dates)

## Store the organizations as a list column alongside the dialogue
library(dplyr)
presidential_debates_2012$organizations <- named_entity(
    presidential_debates_2012$dialogue,
    'organization_annotator'
)

## Keep rows with at least one organization; collapse each row's
## organizations into a single comma-separated string
presidential_debates_2012 %>%
    {.[!sapply(.$organizations, is.null), ]} %>%
    rowwise() %>%
    mutate(organizations = paste(organizations, collapse = ", ")) %>%
    select(person, time, organizations)

## Alternatively, expand to one row per organization
library(tidyr)
presidential_debates_2012 %>%
    {.[!sapply(.$organizations, is.null), ]} %>%
    unnest(organizations) %>%
    select(person, time, organizations)

## End(Not run)

trinker/entity documentation built on May 31, 2019, 8:43 p.m.