DTMFromCorpus: Obtain a document-term matrix from corpus

View source: R/DTMFromCorpus.R

Description

Obtain a matrix, better known as a document-term matrix (DTM), where rows correspond to documents and columns to terms.

Usage

DTMFromCorpus(corpus, rowNames)

Arguments

corpus

a corpus obtained from a bibliographic database.

rowNames

a list of row names for the resulting document-term matrix, so that each row can be traced back to the name of its article in the initial database.

Details

A quick process for obtaining a document-term matrix from a text corpus. The matrix is weighted with the binary method: the entry in row i and column j is 1 if the j-th term appears in the i-th document and 0 otherwise, regardless of how many times the term occurs.
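
As a rough illustration of binary weighting, the base R sketch below builds a small 0/1 document-term matrix by hand. The toy documents and the names docs, tokens, vocab and dtm are made up for this example; it is not the internal implementation of DTMFromCorpus, which may tokenize and preprocess text differently.

```r
# Toy corpus: three short "documents" (hypothetical data, not from KDViz).
docs <- c(doc1 = "text mining with text data",
          doc2 = "mining bibliographic data",
          doc3 = "document term matrix")

# Split each document on whitespace and collect the sorted vocabulary.
tokens <- strsplit(tolower(docs), "\\s+")
vocab  <- sort(unique(unlist(tokens)))

# Binary weighting: entry (i, j) is 1 if term j occurs in document i,
# and 0 otherwise, no matter how often the term is repeated.
dtm <- t(vapply(tokens,
                function(tok) as.integer(vocab %in% tok),
                integer(length(vocab))))
rownames(dtm) <- names(docs)   # documents as rows
colnames(dtm) <- vocab         # terms as columns

print(dtm)
```

Note that "text" occurs twice in doc1 but its entry is still 1, which is what distinguishes binary weighting from raw term counts.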

Value

a matrix object, i.e. a document-term matrix weighted by the binary method.

Note

If the rowNames argument is not provided, the article indexes in the document-term matrix are renumbered.

Author(s)

Andres Palacios anfpalacioscl@unal.edu.co

Examples

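library(KDViz)

# Load the example datasets used below
data("KDVizData")
data("KDCorpus")

# Build the binary DTM, reusing the row names of KDVizData for traceability
myDTM <- DTMFromCorpus(corpus = KDCorpus, rowNames = row.names(KDVizData))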

KDViz documentation built on May 1, 2019, 6:34 p.m.