LabelTopics: Get some topic labels using a "more probable" method of terms


View source: R/topic_modeling_utilities.R

Description

Calls GetProbableTerms, with some added rules, to get topic labels. This function is in "super-ultra-mega alpha"; use at your own risk/discretion.
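To give a feel for what a "more probable" labeling rule can look like, here is a minimal base R sketch. It is an illustration under assumptions, not the code inside LabelTopics or GetProbableTerms: the helper `label_by_lift`, the toy `dtm`, and the toy `assignments` are all hypothetical. The idea shown is to rank each n-gram by how much more probable it is within a topic's documents than in the corpus as a whole.

```r
# Toy inputs (hypothetical): 4 documents, 2 topics, 3 bigram terms.
dtm <- matrix(
  c(3, 0, 1,
    2, 1, 0,
    0, 4, 1,
    0, 3, 2),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("doc", 1:4),
                  c("gene_expression", "clinical_trial", "risk_factor"))
)
assignments <- matrix(
  c(1, 0,
    1, 0,
    0, 1,
    0, 1),
  nrow = 4, byrow = TRUE,
  dimnames = list(rownames(dtm), c("t_1", "t_2"))
)

# Hypothetical helper, not part of textmineR: rank terms by how much more
# probable they are in a topic's documents than in the corpus overall.
label_by_lift <- function(assignments, dtm, M = 2) {
  p_all <- colSums(dtm) / sum(dtm)  # corpus-wide term probabilities
  rows <- lapply(seq_len(ncol(assignments)), function(k) {
    in_topic <- assignments[, k] > 0  # documents assigned to topic k
    counts <- colSums(dtm[in_topic, , drop = FALSE])
    p_topic <- counts / sum(counts)   # term probabilities within topic k
    names(sort(p_topic - p_all, decreasing = TRUE))[seq_len(M)]
  })
  out <- do.call(rbind, rows)  # rows = topics, columns = ranked labels
  rownames(out) <- colnames(assignments)
  out
}

toy_labels <- label_by_lift(assignments, dtm, M = 2)
```

In this toy corpus, `toy_labels` has one row per topic and M columns, matching the shape described under Value.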

Usage

LabelTopics(assignments, dtm, M = 2)

Arguments

assignments

A documents by topics matrix similar to theta. This will work best if this matrix is sparse, with only a few non-zero topics per document.

dtm

A document term matrix of class matrix or dgCMatrix. The columns of dtm should be n-grams whose colnames have a "_" where spaces would be between the words.

M

The number of n-gram labels you want to return. Defaults to 2.
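The argument descriptions above can be made concrete with a small base R sketch of inputs in the expected shape. Everything here is toy data: `theta`, `sparsify`, the 0.05 cutoff default, and the n-gram names are assumptions chosen for illustration, and the all-zero `dtm` only demonstrates the column naming convention.

```r
# Hypothetical toy theta: 3 documents by 4 topics, rows sum to 1.
theta <- matrix(
  c(0.90, 0.04, 0.03, 0.03,
    0.48, 0.48, 0.02, 0.02,
    0.01, 0.02, 0.03, 0.94),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("doc", 1:3), paste0("t_", 1:4))
)

# Hypothetical helper: zero out weak topics, then renormalize each row
# so that it sums to 1 again.
sparsify <- function(theta, cutoff = 0.05) {
  theta[theta < cutoff] <- 0
  out <- theta / rowSums(theta)
  out[is.na(out)] <- 0  # rows that were entirely below the cutoff
  out
}
assignments <- sparsify(theta)

# n-gram column names written with spaces, converted to the "_" convention.
ngrams <- c("gene expression", "clinical trial", "cancer")
dtm <- matrix(0, nrow = 3, ncol = 3,
              dimnames = list(rownames(theta), gsub(" ", "_", ngrams)))
```

After this step each document keeps only its dominant topics, which is the sparse shape the `assignments` argument is described as working best with.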

Value

Returns a matrix whose rows correspond to topics and whose j-th column corresponds to the j-th "best" label assignment.
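As a rough sketch of working with a result of this shape, the base R snippet below uses a hand-written stand-in matrix rather than real LabelTopics output. The label strings and the row names `t_1` to `t_3` are hypothetical; only the layout (topics in rows, j-th best label in column j) comes from the description above.

```r
# Hypothetical stand-in for a LabelTopics() result: 3 topics, M = 2 labels.
labels <- matrix(
  c("gene_expression", "cell_line",
    "clinical_trial",  "risk_factor",
    "health_care",     "public_health"),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("t_", 1:3), NULL)
)

# Top label per topic, with "_" turned back into spaces for display.
best <- gsub("_", " ", labels[, 1])

# All M labels per topic joined into a single string.
combined <- apply(labels, 1, paste, collapse = " / ")
```

Column 1 holds the best label for each topic, so `labels[, 1]` is usually the natural choice when a single short name per topic is needed.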

Examples

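library(textmineR)

# load a pre-fit sample topic model that ships with textmineR
data(nih_sample_topic_model)

m <- nih_sample_topic_model

# zero out topics below 0.05 in each document, then renormalize the row
assignments <- t(apply(m$theta, 1, function(x){
  x[ x < 0.05 ] <- 0
  x / sum(x)
}))

# rows with no topic at or above 0.05 become NaN after dividing by zero
assignments[is.na(assignments)] <- 0

# get the two best n-gram labels for each topic
labels <- LabelTopics(assignments = assignments, dtm = m$data, M = 2)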

textmineR documentation built on June 28, 2021, 9:08 a.m.