extractPairs: extractPairs

Description

Given text, extract entities that co-occur in a sentence.

Usage

extractPairs(x, nerModel, ...)

## S3 method for class 'character'
extractPairs(x, nerModel, ...)

## S3 method for class 'tbl'
extractPairs(x, nerModel, toSub = NULL, requiredTerms = NULL,
  ignore.case = TRUE, ...)

## S3 method for class 'OrientDB'
extractPairs(x, nerModel, id = NULL, class, ...)

## S3 method for class 'es_conn'
extractPairs(x, nerModel, index = NULL, type = NULL,
  id = NULL, q = NULL, search = c("search", "scroll", "page"),
  scrollHold = "5m", size = 10, ...)

Arguments

x

Either a character vector of file names, or a tbl or database source holding the text. For a tbl or database source, the text is assumed to be in a column named Text and the grouping variable in a column named File.

nerModel

A named entity recognition (NER) model supplied by MITIE

toSub

Named vector where the elements are the patterns to match and the names are the replacement values

requiredTerms

A vector of terms that must be extracted if they exist

ignore.case

Logical indicating whether requiredTerms should be matched without regard to case

id

List of IDs to read; if NULL, every record is pulled

class

The document DB class from which to query

index

Document index

type

Type of document

q

Search query

search

Type of search: search, scroll or page

scrollHold

Time to hold open scroll state

size

Number of entries to return per shard

...

Further arguments
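
The roles of toSub, requiredTerms and ignore.case can be illustrated with a minimal base R sketch. It is not the package's implementation: the sample text, the patterns, and the strategy of applying substitutions one after another with gsub are all assumptions made for illustration.

```r
# Hypothetical input text, for illustration only.
text <- "The USA and the UK signed a treaty."

# toSub: the elements are patterns, the names are the replacement values.
toSub <- c("United States" = "USA|U\\.S\\.", "United Kingdom" = "UK")

# Apply each substitution in turn (an assumed strategy, not the package's code).
normalized <- text
for (i in seq_along(toSub)) {
  normalized <- gsub(toSub[[i]], names(toSub)[i], normalized)
}
normalized

# requiredTerms with ignore.case = TRUE: a case-insensitive presence check.
requiredTerms <- c("treaty", "UNITED KINGDOM")
found <- vapply(requiredTerms, grepl, logical(1),
                x = normalized, ignore.case = TRUE)
found
```

Because the replacement is taken from the name and the pattern from the element, several spellings of one entity can be collapsed into a single canonical form before pairs are extracted.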

Details

Given text, extract entities that co-occur in a sentence. The text can be stored in files, in a column of a tbl, or in a database.
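
To make "co-occur in a sentence" concrete, the following base R sketch builds every unordered pair of distinct entities found in the same sentence. It assumes the entities have already been recognized; the helper pairsForSentence, the sample entities, and the column names File, Sentence, First and Second are hypothetical and may differ from what extractPairs actually returns.

```r
# Hypothetical entities already found in each sentence of one file.
entitiesBySentence <- list(
  c("Barack Obama", "Chicago", "Senate"),
  c("Chicago"),
  c("Senate", "Congress")
)

# Build every unordered pair of distinct entities within one sentence.
pairsForSentence <- function(entities, sentenceNum, file) {
  entities <- unique(entities)
  if (length(entities) < 2) return(NULL)  # a lone entity forms no pair
  combos <- t(combn(entities, 2))         # one row per pair
  data.frame(File = file, Sentence = sentenceNum,
             First = combos[, 1], Second = combos[, 2],
             stringsAsFactors = FALSE)
}

# Combine the per-sentence results; sentences without pairs are dropped.
pairs <- do.call(rbind, Map(pairsForSentence, entitiesBySentence,
                            seq_along(entitiesBySentence), "article1.txt"))
pairs
```

In this sketch the first sentence yields three pairs, the second none, and the third one, so a sentence contributes rows only when it mentions at least two distinct entities.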

Value

A tbl listing entity co-occurrences along with the file name and the sentence number.

Examples

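## Not run: 
library(dplyr)  # provides src_sqlite() and tbl()

# Load a MITIE named entity recognition model from disk
ner_model_path <- "mitie/MITIE-models/english/ner_model.dat"
ner <- NamedEntityExtractor$new(ner_model_path)

# Character method: pass the paths of the text files
textFiles <- list.files('data/NYTimes', full.names = TRUE)
extractPairs(textFiles, ner)

# tbl method: pass a database table with Text and File columns
textDB <- src_sqlite('data/TimesDB/Articles.sqlite3')
textTable <- tbl(textDB, 'Articles')
textTable
extractPairs(textTable, ner)

## End(Not run)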
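
For the tbl method, the expected input layout can be sketched with a small in-memory table. The data below are invented, and it is an assumption that an in-memory tibble dispatches to the 'tbl' method in the same way as a database table; the commented call uses only arguments listed under Usage, with hypothetical values.

```r
library(dplyr)

# Hypothetical data: Text holds the text and File is the grouping variable,
# following the column names given in the Arguments section.
articles <- tibble(
  File = c("a.txt", "a.txt", "b.txt"),
  Text = c("The USA and the UK signed a treaty.",
           "The treaty was ratified in London.",
           "Chicago hosted the summit.")
)

# A tibble carries the class "tbl", which S3 dispatch relies on.
inherits(articles, "tbl")

## Not run:
# extractPairs(articles, ner,
#              toSub = c("United States" = "USA"),
#              requiredTerms = "treaty",
#              ignore.case = TRUE)
## End(Not run)
```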

jaredlander/TextInfo documentation built on May 18, 2019, 3:46 p.m.