Casting a data frame to a DocumentTermMatrix, TermDocumentMatrix, or dfm

Description

This turns a "tidy" one-term-per-document-per-row data frame into a DocumentTermMatrix or TermDocumentMatrix from the tm package, or a dfm from the quanteda package. Each caster can be called either with non-standard evaluation (bare column names) or with character vectors naming the columns (the underscore-suffixed variants, such as cast_tdm_ and cast_dtm_). Any grouping on the data frame is ignored.
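As a minimal sketch of the tidy-to-matrix conversion described above, the following builds a small table and casts it with bare column names. The table `word_counts` and its contents are made up for illustration, and the sketch assumes the tidytext and tm packages are installed.

```r
library(tidytext)

# Hypothetical tidy input: one row per (document, term) pair with a count
word_counts <- data.frame(
  document = c("a", "a", "b", "b", "b"),
  term     = c("apple", "pear", "apple", "plum", "pear"),
  n        = c(2, 1, 1, 3, 1),
  stringsAsFactors = FALSE
)

# Bare column names: documents become rows, terms become columns
dtm <- cast_dtm(word_counts, document, term, n)

class(dtm)  # includes "DocumentTermMatrix"
dim(dtm)    # number of documents, number of terms
```

The result is an ordinary tm object, so it can be handed to anything that expects a DocumentTermMatrix.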

Usage

cast_tdm_(data, term_col, document_col, value_col, weighting = tm::weightTf,
  ...)

cast_tdm(data, term, document, value, weighting = tm::weightTf, ...)

cast_dtm_(data, document_col, term_col, value_col, weighting = tm::weightTf,
  ...)

cast_dtm(data, document, term, value, weighting = tm::weightTf, ...)

cast_dfm_(data, document_col, term_col, value_col, ...)

cast_dfm(data, document, term, value, ...)

Arguments

data

Table with one-term-per-document-per-row

weighting

The weighting function for the DTM/TDM (default is term-frequency, effectively unweighted)

...

Extra arguments passed on to sparseMatrix

term, term_col

(Bare) name of a column with terms

document, document_col

(Bare) name of a column with documents

value, value_col

(Bare) name of a column containing values
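To illustrate how the arguments above fit together, here is a short sketch that casts the same table both ways and swaps in a different weighting function. The data frame `word_counts` is hypothetical; `tm::weightBin` is one of the weighting functions shipped with tm and is used here only as an example of a non-default `weighting`. The sketch assumes tidytext and tm are installed.

```r
library(tidytext)

# Hypothetical tidy input: one row per (document, term) pair with a count
word_counts <- data.frame(
  document = c("a", "a", "b", "b", "b"),
  term     = c("apple", "pear", "apple", "plum", "pear"),
  n        = c(2, 1, 1, 3, 1),
  stringsAsFactors = FALSE
)

# Note the argument order: cast_tdm takes the term column first,
# while cast_dtm takes the document column first
tdm <- cast_tdm(word_counts, term, document, n)

# Replace the default term-frequency weighting with binary weighting,
# so each cell records only whether the term occurs in the document
dtm_bin <- cast_dtm(word_counts, document, term, n,
                    weighting = tm::weightBin)

dim(tdm)            # number of terms, number of documents
as.matrix(dtm_bin)  # dense view, suitable only for small matrices
```

Keeping the default `tm::weightTf` leaves the values from the value column unchanged, which is usually what is wanted when the counts were computed beforehand.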