Get Relevant Part from Text Document

Description

Get the relevant part of a text document. The function searches for the first and last occurrence of the given items in the text document, determines the enclosing sentence boundaries, and extracts the text between them. It can be used in a tm_map wrapper.

Usage

  getRelevant(td, items, boundaries, matches.only = FALSE,
    fieldname = "matches")

Arguments

td

TextDocument

items

character vector of items to search for, e.g. c("Microsoft", "MSFT")

boundaries

sentence boundaries, defined in Perl-compatible regular expression syntax

matches.only

return the number of matches only, i.e. should only the matches be annotated? Defaults to FALSE

fieldname

name to use for the annotated meta field of the text document
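
The extraction step described above (find the first and last occurrence of the items, widen the span to the surrounding sentence boundaries, return the text in between) can be illustrated with a minimal base R sketch. This is not the package's implementation: extract_relevant is a hypothetical helper that works on a plain character string instead of a TextDocument, the default boundaries pattern "[.!?]" is an assumption, and the matches.only and fieldname arguments are left out.

```r
# Hypothetical helper illustrating the idea; not the code of getRelevant().
extract_relevant <- function(text, items, boundaries = "[.!?]") {
  # Start positions of every literal occurrence of every item
  hits <- unlist(lapply(items, function(item) {
    pos <- gregexpr(item, text, fixed = TRUE)[[1]]
    pos[pos > 0]
  }))
  if (length(hits) == 0) return("")  # no item found

  first <- min(hits)  # first occurrence of any item
  last  <- max(hits)  # last occurrence of any item

  # Start and end positions of all sentence boundaries (Perl-style regex)
  m <- gregexpr(boundaries, text, perl = TRUE)[[1]]
  keep    <- m > 0
  b_start <- as.integer(m)[keep]
  b_end   <- (as.integer(m) + attr(m, "match.length") - 1L)[keep]

  # Begin right after the last boundary that precedes the first match
  before <- b_start < first
  start  <- if (any(before)) max(b_end[before]) + 1L else 1L

  # Finish at the first boundary at or after the last match
  after <- b_start >= last
  end   <- if (any(after)) min(b_end[after]) else nchar(text)

  trimws(substr(text, start, end))
}

doc <- paste("Markets fell today. Microsoft rose 2%.",
             "Analysts like MSFT a lot. Weather was mild.")
extract_relevant(doc, c("Microsoft", "MSFT"))
# [1] "Microsoft rose 2%. Analysts like MSFT a lot."
```

In this sketch the sentences before the first match and after the last match are dropped, while everything between the two enclosing boundaries is kept, including sentences that mention no item.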

Author(s)

Mario Annau

See Also

tm_map