nlp: 'nlp' dataset from GeoDeepDive

Description

A dataset returned from a query to GeoDeepDive (https://geodeepdive.org/) that includes Natural Language Processing elements from the Stanford NLP tools (https://nlp.stanford.edu/).

Usage

nlp

Format

A data.frame with 87,181 rows and 9 columns.

_gddid

Unique identifier for the article within the GDD database.

sentence

Unique sentence index within the article.

wordIndex

Index of each word within the sentence.

word

The words of the sentence, separated by commas.

partofspeech

Part-of-speech tags from the Stanford tagger, matching the Penn Treebank tags: https://www.ling.upenn.edu/courses/Fall_2003/ling001/penn_treebank_pos.html

specialclass

Special classes (numbers, dates, etc.).

wordsAgain

wordtype

Word types, based on Universal Dependencies (http://universaldependencies.org/introduction.html).

wordmodified

The word (identified by its word index) that is modified by the typed word.
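
Because the word-level columns hold several values per sentence, it can be convenient to expand them to one row per word. The following is a minimal sketch in base R, not part of the package. It assumes the fields are plain comma-separated strings as described above and that no token contains a comma itself. The `toy` data frame, its values, and the `split_field` helper are invented for illustration; only the column names come from this page.

```r
# Toy rows mimicking the documented layout (values are invented for illustration).
toy <- data.frame(
  `_gddid`     = c("doc1", "doc1"),
  sentence     = c(1L, 2L),
  wordIndex    = c("1,2,3", "1,2"),
  word         = c("Ice,sheets,retreated", "Sea,rose"),
  partofspeech = c("NN,NNS,VBD", "NN,VBD"),
  check.names  = FALSE,
  stringsAsFactors = FALSE
)

# Hypothetical helper: split a comma-separated field into a list
# holding one character vector per sentence.
split_field <- function(x) strsplit(x, ",", fixed = TRUE)

words <- split_field(toy$word)
tags  <- split_field(toy$partofspeech)
idx   <- lapply(split_field(toy$wordIndex), as.integer)

# Number of tokens in each sentence.
n <- lengths(words)

# One row per word, repeating the article and sentence identifiers.
long <- data.frame(
  `_gddid`     = rep(toy[["_gddid"]], n),
  sentence     = rep(toy$sentence, n),
  wordIndex    = unlist(idx),
  word         = unlist(words),
  partofspeech = unlist(tags),
  check.names  = FALSE,
  stringsAsFactors = FALSE
)

# Keep only the nouns (Penn Treebank tags beginning with "NN").
nouns <- long[startsWith(long$partofspeech, "NN"), ]
```

With the real dataset, the same steps would be applied to `nlp` in place of `toy`. If the stored strings carry extra delimiters or contain comma tokens, the naive split above would misalign the columns, so it is worth checking that `lengths()` agrees across the split fields before combining them.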

Source

https://geodeepdive.org/


EarthCubeGeochron/geodiveR documentation built on May 25, 2019, 8:29 p.m.