tm_filter: Filter and Index Functions on Corpora

Description Usage Arguments Value Examples

View source: R/filter.R

Description

Interface to apply filter and index functions to corpora.

Usage

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
## S3 method for class 'PCorpus'
tm_filter(x, FUN, ...)
## S3 method for class 'SimpleCorpus'
tm_filter(x, FUN, ...)
## S3 method for class 'VCorpus'
tm_filter(x, FUN, ...)
## S3 method for class 'PCorpus'
tm_index(x, FUN, ...)
## S3 method for class 'SimpleCorpus'
tm_index(x, FUN, ...)
## S3 method for class 'VCorpus'
tm_index(x, FUN, ...)

Arguments

x

A corpus.

FUN

a filter function taking a text document or a string (if x is a SimpleCorpus) as input and returning the logical value TRUE or FALSE.

...

arguments to FUN.

Value

tm_filter returns a corpus containing documents where FUN matches, whereas tm_index only returns the corresponding indices.

Examples

1
2
3
data("crude")
# Full-text search
tm_filter(crude, FUN = function(x) any(grep("co[m]?pany", content(x))))

Example output

Loading required package: NLP
<<VCorpus>>
Metadata:  corpus specific: 0, document level (indexed): 0
Content:  documents: 5

tm documentation built on July 12, 2020, 3 p.m.