matrix: Term-Document Matrix from Distributed Corpora

Description

Constructs a term-document matrix given a distributed corpus.

Usage

## S3 method for class 'DCorpus'
TermDocumentMatrix(x, control = list())

Arguments

x

A distributed corpus.

control

A named list of control options. The component weighting must be a weighting function capable of handling a TermDocumentMatrix. It defaults to weightTf for term frequency weighting. All other options are delegated internally to a termFreq call.
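As an illustration of how a control list might be put together, the following is a minimal sketch rather than a canonical recipe. It assumes the tm and tm.plugin.dc packages are installed and uses the crude corpus shipped with tm. The variable name ctrl is chosen here for illustration only. The entries removePunctuation, stopwords and wordLengths are termFreq options, while weighting is the component described above.

```r
library("tm")
library("tm.plugin.dc")

data("crude")

# Illustrative control list (the name `ctrl` is an assumption of this sketch).
# `weighting` is the weighting function; the other entries are termFreq options.
ctrl <- list(removePunctuation = TRUE,   # strip punctuation before counting
             stopwords = TRUE,           # drop default English stopwords
             wordLengths = c(3, Inf),    # keep terms of at least 3 characters
             weighting = weightTf)       # plain term frequency (the default)

# Convert the in-memory corpus to a distributed corpus and build the matrix.
tdm <- TermDocumentMatrix(as.DCorpus(crude), control = ctrl)

# Rows are terms, columns are documents.
dim(tdm)
```

Swapping weightTf for another tm weighting function such as weightTfIdf changes only the weighting step; the termFreq options are applied in the same way.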

Value

An object of class TermDocumentMatrix containing a sparse term-document matrix. The attribute Weighting contains the weighting applied to the matrix.

See Also

The documentation of termFreq gives an extensive list of possible options.

TermDocumentMatrix

Examples

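library("tm")            # provides the crude corpus and weightTfIdf
library("tm.plugin.dc")  # provides as.DCorpus and the DCorpus method
data("crude")
# Build a TF-IDF weighted term-document matrix with stopwords removed
tdm <- TermDocumentMatrix(as.DCorpus(crude),
                          control = list(stopwords = TRUE, weighting = weightTfIdf))
# Show terms 149 to 152 for the first five documents
inspect(tdm[149:152, 1:5])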

tm.plugin.dc documentation built on Nov. 29, 2020, 5:07 p.m.