mallet.topic.words: Retrieve a matrix of words weights for topics

View source: R/mallet.R

mallet.topic.wordsR Documentation

Retrieve a matrix of words weights for topics

Description

This function returns a matrix with one row for every topic and one column for every word in the vocabulary.

Usage

mallet.topic.words(topic.model, normalized = FALSE, smoothed = FALSE)

Arguments

topic.model

A cc.mallet.topics.RTopicModel object created by MalletLDA.

normalized

If TRUE, normalize the rows so that each topic sums to one. If FALSE, values will be integers (possibly plus the smoothing constant) representing the actual number of words of each type in the topics.

smoothed

If TRUE, add the smoothing parameter for the model (initial value specified as beta in MalletLDA). If FALSE, many values will be zero.

Value

a number of topics by vocabulary size matrix.

Examples

## Not run: 
# Read in sotu example data
data(sotu)
sotu.instances <-
   mallet.import(id.array = row.names(sotu),
                 text.array = sotu[["text"]],
                 stoplist = mallet_stoplist_file_path("en"),
                 token.regexp = "\\p{L}[\\p{L}\\p{P}]+\\p{L}")

# Create topic model
topic.model <- MalletLDA(num.topics=10, alpha.sum = 1, beta = 0.1)
topic.model$loadDocuments(sotu.instances)

# Train topic model
topic.model$train(200)

# Extract results
doc_topics <- mallet.doc.topics(topic.model, smoothed=TRUE, normalized=TRUE)
topic_words <- mallet.topic.words(topic.model, smoothed=TRUE, normalized=TRUE)
top_words <- mallet.top.words(topic.model, word.weights = topic_words[2,], num.top.words = 5)

## End(Not run)


mallet documentation built on July 20, 2022, 5:08 p.m.