tm_map: Transformations on Distributed Corpora

Description

Interface to apply transformation functions to distributed corpora. See tm_map in tm for more information.

Usage

## S3 method for class 'DCorpus'
tm_map(x, FUN, ...)

Arguments

x

A distributed corpus of class DCorpus.

FUN

A transformation function that takes a text document as input and returns a text document. Functions that operate on plain character content can be wrapped with content_transformer, which gets and sets the content of text documents.

...

Arguments passed on to FUN.
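
The following is a minimal sketch of how the arguments fit together, assuming the packages tm and tm.plugin.dc are installed and the crude data set from tm is available. The helper name drop_pattern and the variable names are illustrative only and not part of either package.

```r
library(tm)
library(tm.plugin.dc)

data("crude")
dc <- as.DCorpus(crude)

# removeWords() is a tm transformation that already works on text documents;
# its second argument (the words to drop) is supplied through `...`.
dc_nostop <- tm_map(dc, removeWords, stopwords("english"))

# A function that works on character vectors is wrapped with
# content_transformer(); extra arguments again travel through `...`.
# drop_pattern is a hypothetical helper defined here for illustration.
drop_pattern <- content_transformer(function(x, pattern) gsub(pattern, "", x))
dc_nodigits <- tm_map(dc, drop_pattern, "[0-9]+")
```

Wrapping with content_transformer keeps document metadata intact, because only the content of each document is replaced.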

Value

A DCorpus with FUN applied to each document in x. If revisions are enabled, the original documents in x can be recovered by switching back to the corresponding revision with setRevision().
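
The sketch below illustrates one way the revision mechanism might be used. It relies on the revision helpers in tm.plugin.dc (keepRevisions, getRevisions, setRevision). It also assumes that revisions are kept for the corpus and that the first revision recorded before the transformation identifies the original documents. Both assumptions should be checked against the output of keepRevisions() and getRevisions() in your session.

```r
library(tm.plugin.dc)

data("crude")
dc <- as.DCorpus(crude)

keepRevisions(dc)                # check whether revisions are being kept
revs_before <- getRevisions(dc)  # revisions known before transforming

dc_lower <- tm_map(dc, content_transformer(tolower))
getRevisions(dc_lower)           # inspect the revisions after transforming

# Assumption: the first revision recorded before the transformation
# corresponds to the original, untransformed documents.
dc_orig <- setRevision(dc_lower, revs_before[[1]])
```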

See Also

getTransformations for available transformations in package tm.
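
As a brief illustration, the predefined transformations can be listed from an R session, assuming tm is installed. The variable name transformations is illustrative.

```r
library(tm)

# Names of the transformations shipped with tm; each can be passed as FUN.
transformations <- getTransformations()
print(transformations)
```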

Examples

data("crude")
tm_map(as.DCorpus(crude), content_transformer(tolower))

Example output

Loading required package: DSL
Loading required package: tm
Loading required package: NLP
<<DCorpus>>
Metadata:  corpus specific: 0, document level (indexed): 0
Content:  documents: 20

tm.plugin.dc documentation built on Nov. 29, 2020, 5:07 p.m.