Dtm2Docs: Convert a DTM to a Character Vector of documents

View source: R/corpus_functions.R

Dtm2Docs    R Documentation

Description

This function takes a sparse document-term matrix (DTM) as input and returns a character vector with one element per document, so its length equals the number of rows of the input DTM.

Usage

Dtm2Docs(dtm, ...)

Arguments

dtm

A sparse matrix from the Matrix package whose row names correspond to documents and whose column names correspond to words.

...

Other arguments to be passed to TmParallelApply. See the Note below.

Value

Returns a character vector. Each entry of this vector corresponds to one row of dtm.
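
The relationship between rows and returned strings can be illustrated with a minimal sketch. It assumes the reconstruction simply repeats each column name according to its count in the row; word order from the original text cannot be recovered from a DTM. The toy matrix and the helper rebuild_docs are hypothetical and are not part of textmineR, and the package's own implementation may differ.

```r
library(Matrix)

# Hypothetical toy DTM: 2 documents (rows) by 3 terms (columns)
toy_dtm <- Matrix(
  c(2, 0, 1,
    0, 1, 3),
  nrow = 2, byrow = TRUE, sparse = TRUE,
  dimnames = list(c("doc_1", "doc_2"), c("gene", "cell", "protein"))
)

# Illustrative helper (an assumption, not the textmineR implementation):
# repeat each term by its count and paste the result into one string per row.
rebuild_docs <- function(dtm) {
  docs <- vapply(seq_len(nrow(dtm)), function(i) {
    counts <- dtm[i, ]  # numeric vector of term counts for row i
    paste(rep(colnames(dtm), times = counts), collapse = " ")
  }, character(1))
  names(docs) <- rownames(dtm)
  docs
}

toy_docs <- rebuild_docs(toy_dtm)
toy_docs
```

The result has one element per row of toy_dtm, named after the row names, which mirrors the shape of the value described above.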

Note

This function performs parallel computation if dtm has more than 3,000 rows. By default it uses all available cores, as reported by detectCores. To change this, pass the cpus argument when calling this function; it is forwarded to TmParallelApply.
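
As a hedged illustration of the cpus argument, the call below limits the number of cores. The value 2 is an arbitrary choice. The bundled nih_sample_dtm is assumed here to be small enough that it may not reach the 3,000-row threshold, in which case the call is expected to run serially and cpus has no visible effect; the same call pattern applies to a larger DTM.

```r
library(textmineR)

# Load the sample DTM shipped with the package
data(nih_sample_dtm)

# cpus is passed through ... to TmParallelApply; 2 is an arbitrary example value
docs <- Dtm2Docs(dtm = nih_sample_dtm, cpus = 2)

# One reconstructed document per row of the DTM
length(docs) == nrow(nih_sample_dtm)
```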

Examples

library(textmineR)

# Load the sample documents and a pre-formatted DTM
data(nih_sample)
data(nih_sample_dtm)

# see the original documents
nih_sample$ABSTRACT_TEXT[ 1:3 ]

# see the new documents re-structured from the DTM
new_docs <- Dtm2Docs(dtm = nih_sample_dtm)

new_docs[ 1:3 ]


TommyJones/textmineR documentation built on July 26, 2023, 9:51 p.m.