top_words: Topic key words

top_words R Documentation

Topic key words

Description

The most common way to summarize topics is to list their top-weighted words, together with the weights those words have in each topic. Though every topic assigns some probability to every word in the vocabulary, we often disregard all but the most heavily weighted words in each topic.

Usage

top_words(m, ...)

Arguments

m

a mallet_model object

n

number of top words per topic to return (omit for all available)

weighting

a function to transform the full topic-word matrix before calculating top-ranked words. If NULL, the identity is used, so the matrix is ranked as is. Other possibilities include tw_blei_lafferty and tw_sievert_shirley.
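
As a rough illustration of what a weighting transformation can look like, the base R sketch below applies the Blei and Lafferty term score to a small made-up topic-word matrix. The function name term_score and the toy matrix are hypothetical, and this is not the package's tw_blei_lafferty implementation, whose calling convention may differ.

```r
# Hypothetical toy topic-word matrix: 2 topics (rows) x 3 words (columns).
# Each row holds topic-word probabilities and sums to 1.
tw <- matrix(c(0.7, 0.2, 0.1,
               0.1, 0.2, 0.7),
             nrow = 2, byrow = TRUE,
             dimnames = list(NULL, c("whale", "the", "ship")))

# Term score: p * (log p - mean over topics of log p).
# Assumes every entry is strictly positive (for example, smoothed weights).
term_score <- function(tw) {
  log_tw <- log(tw)
  # colMeans(log_tw) is the log of each word's geometric mean across topics;
  # sweep() subtracts it from every row.
  tw * sweep(log_tw, 2, colMeans(log_tw))
}

scored <- term_score(tw)
```

A word that is equally probable in every topic, like "the" in this toy matrix, receives a score of zero, so ranking by the transformed matrix favors words that distinguish a topic over words that are merely frequent.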

Details

The data frame returned by this function supplies no information that is not already present in the topic-word matrix; it is in effect an aggressively sparse representation of that matrix. But it is so commonly used that it makes sense to store it on its own. Indeed, when analyzing model outputs, one will often prefer to load just this data frame and the doc-topics matrix into memory, rather than the full topic-word matrix.
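
To make the "sparse representation" point concrete, here is a minimal base R sketch that reduces a made-up topic-word matrix to its heaviest entries in long form. The helper top_n_words, the matrix, and the vocabulary are all hypothetical; the package's own method may differ in details such as tie handling.

```r
# Hypothetical toy topic-word matrix: 2 topics (rows) x 4 words (columns).
tw <- matrix(c(5, 1, 3, 0,
               0, 4, 2, 6),
             nrow = 2, byrow = TRUE)
vocab <- c("whale", "sea", "ship", "captain")

# Keep only the n heaviest words per topic, as (topic, word, weight) rows.
top_n_words <- function(tw, vocab, n) {
  rows <- lapply(seq_len(nrow(tw)), function(k) {
    # Column indices of topic k's weights, largest first, truncated to n
    idx <- head(order(tw[k, ], decreasing = TRUE), n)
    data.frame(topic = k, word = vocab[idx], weight = tw[k, idx],
               stringsAsFactors = FALSE)
  })
  do.call(rbind, rows)
}

tops <- top_n_words(tw, vocab, n = 2)
```

Every value in the result is read directly from the matrix, which is why the data frame adds nothing new; it simply discards the long tail of low-weight entries.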

Value

a data frame with three columns, topic (indexed from 1), word (character), and weight
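
A data frame of this shape is easy to turn into short topic labels. The sketch below builds a small stand-in by hand (the values are invented, not model output) and pastes each topic's words together in order of descending weight.

```r
# Hypothetical stand-in with the documented columns: topic, word, weight.
tops <- data.frame(
  topic  = c(1, 1, 2, 2),
  word   = c("whale", "ship", "captain", "sea"),
  weight = c(5, 3, 6, 4),
  stringsAsFactors = FALSE
)

# Sort by topic, then by descending weight, in case rows arrive unsorted.
tops <- tops[order(tops$topic, -tops$weight), ]

# One space-separated label per topic, named by topic number.
labels <- tapply(tops$word, tops$topic, paste, collapse = " ")
```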

See Also

tw_blei_lafferty, tw_sievert_shirley


agoldst/dfrtopics documentation built on July 15, 2022, 4:13 p.m.