In mallet: An R Wrapper for the Java Mallet Topic Modeling Toolkit

mallet.top.words

R Documentation

Get the most probable words and their probabilities for one topic

Description

This function returns a data frame with two columns: the first contains the most probable words as character values, and the second contains the weight assigned to each of those words in the word weights vector you supplied.

Usage

mallet.top.words(topic.model, word.weights, num.top.words = 10)

Arguments

`topic.model`	A `cc.mallet.topics.RTopicModel` object created by `MalletLDA`.
`word.weights`	A vector of word weights for one topic, usually a row from the `topic.words` matrix from `mallet.topic.words`.
`num.top.words`	The number of most probable words to return. If not specified, defaults to 10.

Value

A data.frame with the top terms (term) and their weights or probabilities (weight), one row per word.
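
The shape of the returned value can be illustrated without a trained model. The following is a minimal base R sketch of the ranking step only, not the package's implementation: `vocabulary` and the weight values are hypothetical stand-ins, whereas the real function presumably takes the vocabulary from `topic.model`.

```r
# Hypothetical vocabulary and one topic's word weights (illustrative values only)
vocabulary <- c("congress", "nation", "war", "peace", "tax")
word.weights <- c(0.10, 0.40, 0.05, 0.25, 0.20)

# Rank word indices by descending weight and keep the first num.top.words
num.top.words <- 3
top.idx <- order(word.weights, decreasing = TRUE)[seq_len(num.top.words)]

# Assemble a two-column data frame: term (character) and weight (numeric)
top_words <- data.frame(term = vocabulary[top.idx],
                        weight = word.weights[top.idx],
                        stringsAsFactors = FALSE)
print(top_words)
```

If the weights come from a normalized row of the topic-words matrix, the `weight` column can be read as a probability; with unnormalized weights it is only a relative score.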

Examples

## Not run: 
# Read in sotu example data
data(sotu)
sotu.instances <-
   mallet.import(id.array = row.names(sotu),
                 text.array = sotu[["text"]],
                 stoplist = mallet_stoplist_file_path("en"),
                 token.regexp = "\\p{L}[\\p{L}\\p{P}]+\\p{L}")

# Create topic model
topic.model <- MalletLDA(num.topics=10, alpha.sum = 1, beta = 0.1)
topic.model$loadDocuments(sotu.instances)

# Train topic model
topic.model$train(200)

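# Extract the topic-word weights, then the top words for topic 2
topic_words <- mallet.topic.words(topic.model, smoothed = TRUE, normalized = TRUE)
top_words <- mallet.top.words(topic.model, word.weights = topic_words[2,], num.top.words = 5)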

## End(Not run)

mallet documentation built on July 20, 2022, 5:08 p.m.