consecutivePoses: Find clusters of consecutive poses
In aazaff/geocarrot: R Interface for GeoDeepDive Library

View source: R/consecutivePoses.R

Returns tuples of GeoDeepDive docid, sentid, and sets of consecutive parts of speech (i.e., poses).

consecutivePoses(Sentence, Pose = "NNP")

`Sentence`	a single record of GeoDeepDive NLP output
`Pose`	a string giving the part-of-speech tag to search for; defaults to "NNP"

This function finds groupings of a part of speech that occur consecutively within a sentence, using the poses output of Stanford CoreNLP. It is most useful for finding clusters of proper nouns ("NNP"), which extracts entities with multi-word names (e.g., people, places, organizations), but you can search for other part-of-speech tags as well. The result is a matrix of tuples of GeoDeepDive docid, sentid, and the cluster of poses.
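The idea of grouping consecutive tokens that share a tag can be illustrated with a minimal base R sketch. This is not the geocarrot source: `findTagRuns`, its arguments, and the sample sentence are hypothetical, and it assumes the words and their tags are already available as parallel character vectors. It also keeps single-token runs, which may or may not match what `consecutivePoses` does.

```r
# Hypothetical helper (not part of geocarrot): collapse each run of
# consecutive tokens carrying the requested tag into a single string.
findTagRuns <- function(words, poses, pose = "NNP") {
  runs   <- rle(poses == pose)        # run-length encode the TRUE/FALSE match vector
  ends   <- cumsum(runs$lengths)      # index of the last token in each run
  starts <- ends - runs$lengths + 1   # index of the first token in each run
  keep   <- which(runs$values)        # keep only runs where the tag matched
  vapply(
    keep,
    function(i) paste(words[starts[i]:ends[i]], collapse = " "),
    character(1)
  )
}

# Made-up sentence and tags, for illustration only
words <- c("The", "United", "States", "Geological", "Survey", "mapped", "Mount", "Hood")
poses <- c("DT", "NNP", "NNP", "NNP", "NNP", "VBD", "NNP", "NNP")

findTagRuns(words, poses, "NNP")
```

Run-length encoding keeps the sketch vectorized: each run's start and end indices fall out of a cumulative sum, so no explicit loop over tokens is needed.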

A character matrix

Andrew A. Zaffos & Erika T. Ito

# Load the example dataset (data() creates the usgs_gdd object in the workspace)
data(usgs_gdd)

# Extract all clusters of proper nouns across all documents
ProperNouns<-apply(usgs_gdd,1,consecutivePoses,"NNP")
# Collapse back into a character matrix
ProperMatrix<-do.call(rbind,ProperNouns)

# Extract all clusters of consecutive adjectives in a single sentence
Adjectives<-consecutivePoses(usgs_gdd[350,],"JJ")