document_term_casters: Casting a data frame to a DocumentTermMatrix,...


Description

These functions turn a "tidy" data frame with one term per document per row into a DocumentTermMatrix or TermDocumentMatrix from the tm package, or a dfm from the quanteda package. They support non-standard evaluation through the tidyeval framework. Any grouping on the input data frame is ignored.

Usage

cast_tdm(data, term, document, value, weighting = tm::weightTf, ...)

cast_dtm(data, document, term, value, weighting = tm::weightTf, ...)

cast_dfm(data, document, term, value, ...)

Arguments

data

Table with one term per document per row

term

Column containing terms as string or symbol

document

Column containing document IDs as string or symbol

value

Column containing values as string or symbol

weighting

The weighting function for the DTM/TDM (default is term-frequency, effectively unweighted)

...

Extra arguments passed on to sparseMatrix (from the Matrix package)
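
As an illustration of the weighting argument, here is a minimal sketch that casts the same table twice, once with the default term-frequency weighting and once with tf-idf. The data frame word_counts and its column names (doc, word, n) are made up for this example; tm::weightTfIdf is one of the weighting functions shipped with tm, and the tm package is assumed to be installed.

```r
library(tidytext)

# Hypothetical toy data: one row per (document, term) pair with a count
word_counts <- data.frame(
  doc  = c("a", "a", "b", "b", "b"),
  word = c("apple", "pear", "apple", "plum", "pear"),
  n    = c(2, 1, 1, 3, 1),
  stringsAsFactors = FALSE
)

# Default weighting (tm::weightTf): cell values are the raw counts in n
dtm_tf <- cast_dtm(word_counts, doc, word, n)

# Same table, re-weighted by tf-idf when the matrix is built
dtm_tfidf <- cast_dtm(word_counts, doc, word, n,
                      weighting = tm::weightTfIdf)

# cast_tdm() takes term before document and returns terms as rows
tdm_tf <- cast_tdm(word_counts, word, doc, n)
```

Both dtm_tf and dtm_tfidf have documents as rows and terms as columns; only the cell values differ.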

Details

The arguments term, document, and value are passed by expression and support quasiquotation; you can unquote strings and symbols.
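
The three ways of naming a column described above can be sketched as follows. This is only an illustration under assumptions: the data frame word_counts, its columns (doc, word, n), and the variable term_col are hypothetical, and the tm package is assumed to be installed.

```r
library(tidytext)

# Hypothetical toy data: one row per (document, term) pair with a count
word_counts <- data.frame(
  doc  = c("a", "a", "b", "b", "b"),
  word = c("apple", "pear", "apple", "plum", "pear"),
  n    = c(2, 1, 1, 3, 1),
  stringsAsFactors = FALSE
)

# 1. Columns given as bare symbols
dtm_sym <- cast_dtm(word_counts, doc, word, n)

# 2. Columns given as strings
dtm_str <- cast_dtm(word_counts, "doc", "word", "n")

# 3. A column name stored in a variable, unquoted with !!
term_col <- "word"
dtm_unq <- cast_dtm(word_counts, doc, !!term_col, n)
```

Unquoting is mainly useful when the column name is only known at run time, for example inside a wrapper function that receives it as an argument.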


tidytext documentation built on Nov. 18, 2017, 9:03 a.m.