document_term_casters: Casting a data frame to a DocumentTermMatrix,...

cast_tdm R Documentation

Casting a data frame to a DocumentTermMatrix, TermDocumentMatrix, or dfm

Description

These functions turn a "tidy" data frame with one row per term per document into a DocumentTermMatrix or TermDocumentMatrix from the tm package, or a dfm from the quanteda package. They support non-standard evaluation through the tidyeval framework. Any grouping on the input data frame is ignored.
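As a minimal sketch of the conversion described above, the following casts a small tidy table of counts into each output type. The table word_counts and its column names (doc, word, n) are hypothetical and chosen only for illustration. The sketch assumes the tm package is installed, and the quanteda step is guarded in case that package is missing.

```r
library(dplyr)
library(tidytext)

# Hypothetical toy data: one row per (document, term) pair
word_counts <- tibble(
  doc  = c("a", "a", "b", "b"),
  word = c("apple", "pear", "apple", "plum"),
  n    = c(2, 1, 1, 3)
)

# Documents as rows, terms as columns (requires the tm package)
dtm <- cast_dtm(word_counts, doc, word, n)

# Terms as rows, documents as columns; note that term comes first here
tdm <- cast_tdm(word_counts, word, doc, n)

# Grouping on the input is ignored, so this gives the same shape as dtm
dtm_grouped <- cast_dtm(group_by(word_counts, doc), doc, word, n)

# A quanteda dfm, only if quanteda is available
if (requireNamespace("quanteda", quietly = TRUE)) {
  dfm <- cast_dfm(word_counts, doc, word, n)
}
```

Note that cast_tdm() takes term before document, while cast_dtm() and cast_dfm() take document first, as shown in the Usage section.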

Usage

cast_tdm(data, term, document, value, weighting = tm::weightTf, ...)

cast_dtm(data, document, term, value, weighting = tm::weightTf, ...)

cast_dfm(data, document, term, value, ...)

Arguments

data

Table with one-term-per-document-per-row

term

Column containing terms as string or symbol

document

Column containing document IDs as string or symbol

value

Column containing values as string or symbol

weighting

The weighting function for the DTM/TDM (default is term-frequency, effectively unweighted)

...

Extra arguments passed on to sparseMatrix()
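To illustrate the weighting argument, this sketch swaps the default tm::weightTf for tm::weightTfIdf, which is one of the weighting functions shipped with tm. The table word_counts and its columns are hypothetical, and the tm package is assumed to be installed.

```r
library(dplyr)
library(tidytext)

# Hypothetical toy data: one row per (document, term) pair
word_counts <- tibble(
  doc  = c("a", "a", "b", "b"),
  word = c("apple", "pear", "apple", "plum"),
  n    = c(2, 1, 1, 3)
)

# Default weighting: raw term frequencies
dtm_tf <- cast_dtm(word_counts, doc, word, n)

# tf-idf weighting, applied by tm when the matrix is built
dtm_tfidf <- cast_dtm(word_counts, doc, word, n,
                      weighting = tm::weightTfIdf)
```

cast_dfm() has no weighting argument, so any reweighting of a dfm would have to happen afterwards with quanteda's own tools.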

Details

The arguments term, document, and value are passed by expression and support quasiquotation; you can unquote strings and symbols.
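The sketch below shows what this means in practice: a column can be named directly, or supplied from a variable holding a string or a symbol and unquoted with !!. The table word_counts, its columns, and the variables doc_col and term_col are hypothetical names used only for illustration. The tm package is assumed to be installed.

```r
library(dplyr)
library(tidytext)

# Hypothetical toy data: one row per (document, term) pair
word_counts <- tibble(
  doc  = c("a", "a", "b", "b"),
  word = c("apple", "pear", "apple", "plum"),
  n    = c(2, 1, 1, 3)
)

# Bare column names, captured by expression
dtm_bare <- cast_dtm(word_counts, doc, word, n)

# Column name held in a string and unquoted with !!
doc_col <- "doc"
dtm_string <- cast_dtm(word_counts, !!doc_col, word, n)

# Column name held in a symbol and unquoted with !!
term_col <- rlang::sym("word")
dtm_symbol <- cast_dtm(word_counts, doc, !!term_col, n)
```

Unquoting is mainly useful when the column names are only known at run time, for example inside a wrapper function that receives them as arguments.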


tidytext documentation built on May 29, 2024, 5:42 a.m.