Cluster2TopicModel: Represent a document clustering as a topic model


View source: R/topic_modeling_core.R

Description

Represents a document clustering as a topic model consisting of two matrices:

phi: P(term | cluster)

theta: P(cluster | document)

Usage

Cluster2TopicModel(dtm, clustering, ...)

Arguments

dtm

A document-term matrix of class dgCMatrix, or of another class that inherits from the Matrix package. Columns must index terms and rows must index documents.

clustering

A vector of length nrow(dtm) whose entries form a partitional clustering of the documents.

...

Other arguments to be passed to TmParallelApply.

Value

Returns a list with two elements, phi and theta. 'phi' is a matrix whose j-th row represents P(terms | cluster_j). 'theta' is a matrix whose j-th row represents P(clusters | document_j). Because the clustering is partitional, each row of theta should have exactly one non-zero element.
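To make the shape of the two matrices concrete, the following is a minimal sketch in base R of how phi and theta could be built from a clustering. It is an illustration only, not textmineR's implementation: the toy dtm, the cluster labels "a" and "b", and the use of a dense matrix in place of a dgCMatrix are all assumptions made for the example.

```r
# Illustrative sketch only; not textmineR's actual implementation.
# A small dense matrix stands in for the sparse dgCMatrix the function expects.
dtm <- matrix(c(2, 0, 1,
                0, 3, 1,
                1, 1, 0,
                0, 2, 2),
              nrow = 4, byrow = TRUE,
              dimnames = list(paste0("doc", 1:4),
                              c("apple", "banana", "cherry")))

# One (hypothetical) cluster label per document, so length equals nrow(dtm)
clustering <- c("a", "b", "a", "b")
clusters <- sort(unique(clustering))

# theta: documents x clusters, with a 1 in the column of each document's cluster
theta <- outer(clustering, clusters, FUN = "==") * 1
dimnames(theta) <- list(rownames(dtm), clusters)

# phi: clusters x terms; sum term counts within each cluster,
# then divide each row by its total so that it sums to 1
term_counts <- rowsum(dtm, group = clustering)
phi <- term_counts / rowSums(term_counts)

result <- list(phi = phi, theta = theta)
```

In this sketch, each row of phi is a probability distribution over terms, and each row of theta places all of its mass on a single cluster, matching the description above.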

Examples

## Not run: 
# Load pre-formatted data for use
data(nih_sample_dtm)
data(nih_sample) 

result <- Cluster2TopicModel(dtm = nih_sample_dtm, 
                             clustering = nih_sample$IC_NAME)

## End(Not run)

Example output

Loading required package: Matrix

Attaching package: 'textmineR'

The following object is masked from 'package:Matrix':

    update

The following object is masked from 'package:stats':

    update

textmineR documentation built on June 28, 2021, 9:08 a.m.