consecutivePoses: Find clusters of consecutive poses

View source: R/consecutivePoses.R

Description

Returns tuples of GeoDeepDive docid, sentid, and sets of consecutive parts of speech (i.e., poses).

Usage

consecutivePoses(Sentence, Pose = "NNP")

Arguments

Sentence

a single record (row) of GeoDeepDive NLP output

Pose

a string giving the part-of-speech tag to search for; defaults to "NNP"

Details

This function finds groups of the same part-of-speech tag that occur consecutively within a sentence, using the poses output of Stanford CoreNLP. It is most useful for finding clusters of proper nouns ("NNP"), which capture entities with multi-word names (e.g., people, places, organizations), but other tags can be searched for as well. The result is a matrix in which each row is a tuple of GeoDeepDive docid, sentid, and the cluster of poses.
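
The grouping step described above can be illustrated with a minimal base-R sketch. This is not the package's implementation: `findTagRuns`, the toy `words` and `poses` vectors, and the `docid`/`sentid` values are all hypothetical, and the sketch assumes the words and tags of a sentence are already available as parallel character vectors. It also keeps runs of length one; whether consecutivePoses does the same is not stated here.

```r
# Hypothetical helper (not part of the package): collapse consecutive
# words that share the requested part-of-speech tag into single strings.
findTagRuns <- function(words, poses, tag = "NNP") {
  # Run-length encode the logical vector marking matching tags
  runs <- rle(poses == tag)
  ends <- cumsum(runs$lengths)
  starts <- ends - runs$lengths + 1
  # Indices of the runs where the tag matched
  keep <- which(runs$values)
  # Paste the words of each matching run together
  vapply(keep, function(i) paste(words[starts[i]:ends[i]], collapse = " "),
         character(1))
}

# Toy sentence with made-up tags (assumed input, not GeoDeepDive data)
words <- c("The", "United", "States", "Geological", "Survey",
           "mapped", "Glacier", "Bay")
poses <- c("DT", "NNP", "NNP", "NNP", "NNP", "VBD", "NNP", "NNP")

clusters <- findTagRuns(words, poses, "NNP")

# Assemble docid/sentid/cluster tuples as a character matrix
# (the docid and sentid values are placeholders)
result <- cbind(docid = "doc1", sentid = "1", cluster = clusters)
print(result)
```

Run-length encoding is used here only because it identifies consecutive runs without an explicit loop; any equivalent approach would serve.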

Value

A character matrix of docid, sentid, and pose cluster tuples

Author(s)

Andrew A. Zaffos & Erika T. Ito

Examples

# Load the example dataset
data(usgs_gdd)

# Extract all clusters of proper nouns across all documents
ProperNouns<-apply(usgs_gdd,1,consecutivePoses,"NNP")
# Collapse back into a character matrix
ProperMatrix<-do.call(rbind,ProperNouns)

# Extract all clusters of consecutive adjectives in a single sentence
Adjectives<-consecutivePoses(usgs_gdd[350,],"JJ")

aazaff/geocarrot documentation built on May 5, 2019, 9:44 p.m.