generate_document_term_matrix: A function to generate a document term matrix from a list of...

View source: R/generate_document_term_matrix.R

Description

A function to generate a document term matrix from a list of document term vectors.
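
To make the idea concrete, here is a minimal base R sketch of what a document term matrix built from a list of term vectors looks like. It is not the package's implementation (the description above says nothing about internals); the toy documents and the names `docs`, `vocab`, and `dtm` are assumptions made for illustration.

```r
# Hypothetical toy input: one character vector of terms per document.
docs <- list(
  c("the", "cat", "sat"),
  c("the", "dog", "sat", "the")
)

# Build a sorted vocabulary from all unique terms.
vocab <- sort(unique(unlist(docs)))

# Count each vocabulary term in each document; rows are documents.
dtm <- t(vapply(
  docs,
  function(terms) as.numeric(table(factor(terms, levels = vocab))),
  numeric(length(vocab))
))
colnames(dtm) <- vocab
print(dtm)
```

Each row corresponds to one document and each column to one vocabulary term, so a cell holds the number of times that term occurs in that document.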

Usage

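generate_document_term_matrix(document_term_vector_list, vocabulary = NULL,
  document_term_count_list = NULL, return_sparse_matrix = FALSE)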

Arguments

document_term_vector_list

A list of term vectors, one per document, that we wish to turn into a document term matrix.

vocabulary

An optional vocabulary vector which will be used to form the document term matrix. Defaults to NULL, in which case a vocabulary vector will be generated internally.

document_term_count_list

An optional list of word count vectors, one per document. If provided, the counts are aggregated rather than tallying individual term occurrences. This can be useful if we wish to store documents in a memory-efficient way. Defaults to NULL.

return_sparse_matrix

Defaults to FALSE, in which case a normal dense matrix is returned. If TRUE, then a sparse matrix object generated by the slam package is returned. A sparse matrix representation is also used in the C++ code if this is set to TRUE, which can result in drastic memory savings.
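
The interplay between `vocabulary` and `document_term_count_list` can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's code: the toy data, the variable names, and the choice to drop terms that are missing from the supplied vocabulary are all hypothetical.

```r
# Hypothetical compact storage: terms per document plus matching counts.
term_list  <- list(c("cat", "sat"), c("dog", "the", "dog"))
count_list <- list(c(1, 3), c(2, 5, 1))

# A user-supplied vocabulary fixes the columns and their order.
vocab <- c("cat", "dog", "sat", "the", "unused")

dtm <- matrix(0, nrow = length(term_list), ncol = length(vocab),
              dimnames = list(NULL, vocab))
for (d in seq_along(term_list)) {
  # Sum counts per term so repeated terms within a document are aggregated.
  totals <- tapply(count_list[[d]], term_list[[d]], sum)
  # Assumption of this sketch: terms outside the vocabulary are dropped.
  keep <- names(totals) %in% vocab
  dtm[d, names(totals)[keep]] <- totals[keep]
}
print(dtm)
```

Storing each distinct term once with a count, instead of repeating the term, is what makes this input form compact for long documents.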

Value

A dense document term matrix object with the vocabulary as column names, or a sparse matrix object from the slam package if return_sparse_matrix is TRUE.
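
To show why the sparse option of return_sparse_matrix can save memory, the sketch below converts a toy dense matrix to triplet form, which stores only the non-zero cells. The values are made up, and using slam's `simple_triplet_matrix` class here is an assumption: the documentation only says the sparse object is generated by slam, not which class is returned.

```r
# Toy dense document term matrix (hypothetical values).
dense <- matrix(c(1, 0, 0, 2,
                  0, 0, 3, 0),
                nrow = 2, byrow = TRUE,
                dimnames = list(NULL, c("cat", "dog", "sat", "the")))

# Triplet form: store only the non-zero cells as (row, column, value).
nz <- which(dense != 0, arr.ind = TRUE)
triplets <- data.frame(i = nz[, "row"], j = nz[, "col"], v = dense[nz])
print(triplets)

# If slam is installed, the same data fits its simple_triplet_matrix class.
if (requireNamespace("slam", quietly = TRUE)) {
  sparse <- slam::simple_triplet_matrix(
    i = triplets$i, j = triplets$j, v = triplets$v,
    nrow = nrow(dense), ncol = ncol(dense), dimnames = dimnames(dense)
  )
  # Round-trip check: the sparse object expands back to the dense matrix.
  print(all(as.matrix(sparse) == dense))
}
```

With a realistic vocabulary most cells are zero, so the triplet form holds far fewer values than the full documents-by-terms grid.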


matthewjdenny/SpeedReader documentation built on March 25, 2020, 5:32 p.m.