mallet.topic.labels: Get strings containing the most probable words for each topic

mallet.topic.labels R Documentation

Description

This function returns a vector of strings, one for each topic, with the most probable words in that topic separated by spaces.

Usage

mallet.topic.labels(topic.model, topic.words = NULL, num.top.words = 3, ...)

Arguments

topic.model

A cc.mallet.topics.RTopicModel object created by MalletLDA.

topic.words

The matrix of topic-word weights returned by mallet.topic.words. If NULL (the default), the weights are extracted from topic.model.

num.top.words

The number of words to include for each topic. Defaults to 3.

...

Further arguments supplied to mallet.topic.words.

Value

A character vector with one element per topic. Each element holds that topic's num.top.words most probable words, separated by spaces.
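
To make the shape of the return value concrete, the following is a minimal base-R sketch of how such labels could be derived from a topic-word weight matrix. It is not the package's implementation: the helper sketch_topic_labels, the toy vocabulary, and the weights are all hypothetical and chosen only for illustration. It needs neither mallet nor a trained model.

```r
# Toy topic-word weight matrix: 2 topics (rows) x 5 vocabulary words (columns).
# The words and weights are invented for illustration only.
vocabulary <- c("tax", "war", "peace", "budget", "treaty")
topic_words <- matrix(
  c(0.40, 0.06, 0.04, 0.45, 0.05,
    0.05, 0.35, 0.30, 0.04, 0.26),
  nrow = 2, byrow = TRUE
)

# Hypothetical helper, NOT part of the mallet package: for each topic (row),
# rank the words by weight and join the top few with spaces.
sketch_topic_labels <- function(weights, vocab, num_top_words = 3) {
  apply(weights, 1, function(w) {
    top <- head(order(w, decreasing = TRUE), num_top_words)
    paste(vocab[top], collapse = " ")
  })
}

labels <- sketch_topic_labels(topic_words, vocabulary)
# labels is c("budget tax war", "war peace treaty")
```

The result has one string per row of the matrix, which mirrors the "one element per topic" contract described above. With a real model, the vocabulary and weights would come from the trained topic.model rather than from hand-written values.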

See Also

mallet.topic.words produces topic-word weights. mallet.top.words produces a data frame for a single topic.

Examples

## Not run: 
# Read in sotu example data
data(sotu)
sotu.instances <-
   mallet.import(id.array = row.names(sotu),
                 text.array = sotu[["text"]],
                 stoplist = mallet_stoplist_file_path("en"),
                 token.regexp = "\\p{L}[\\p{L}\\p{P}]+\\p{L}")

# Create topic model
topic.model <- MalletLDA(num.topics=10, alpha.sum = 1, beta = 0.1)
topic.model$loadDocuments(sotu.instances)

# Train topic model
topic.model$train(200)

# Create hierarchical clusters of topics
doc_topics <- mallet.doc.topics(topic.model, smoothed=TRUE, normalized=TRUE)
topic_words <- mallet.topic.words(topic.model, smoothed=TRUE, normalized=TRUE)
topic_labels <- mallet.topic.labels(topic.model)
plot(mallet.topic.hclust(doc_topics, topic_words, balance = 0.3), labels=topic_labels)

## End(Not run)


mallet documentation built on July 20, 2022, 5:08 p.m.