mallet.top.words: Get the most probable words and their probabilities for one...

View source: R/mallet.R

mallet.top.wordsR Documentation

Get the most probable words and their probabilities for one topic

Description

This function returns a data frame with two columns, one containing the most probable words as character values, the second containing the weight assigned to that word in the word weights vector you supplied.

Usage

mallet.top.words(topic.model, word.weights, num.top.words = 10)

Arguments

topic.model

A cc.mallet.topics.RTopicModel object created by MalletLDA.

word.weights

A vector of word weights for one topic, usually a row from the topic.words matrix from mallet.topic.words.

num.top.words

The number of most probable words to return. If not specified, defaults to 10.

Value

a data.frame with the top terms (term) and their weights/probability (weight).

Examples

## Not run: 
# Read in sotu example data
data(sotu)
sotu.instances <-
   mallet.import(id.array = row.names(sotu),
                 text.array = sotu[["text"]],
                 stoplist = mallet_stoplist_file_path("en"),
                 token.regexp = "\\p{L}[\\p{L}\\p{P}]+\\p{L}")

# Create topic model
topic.model <- MalletLDA(num.topics=10, alpha.sum = 1, beta = 0.1)
topic.model$loadDocuments(sotu.instances)

# Train topic model
topic.model$train(200)

# Extract top words
top_words <- mallet.top.words(topic.model, word.weights = topic_words[2,], num.top.words = 5)

## End(Not run)


mallet documentation built on July 20, 2022, 5:08 p.m.