calclift: Calculate Lift Words


View source: R/STMfunctions.R

Description

A primarily internal function for calculating words according to the lift metric. We expect most users will use labelTopics instead.

Usage

calclift(logbeta, wordcounts)

Arguments

logbeta

a K by V matrix containing the log probabilities of seeing word v conditional on topic k

wordcounts

a vector of length V indicating the number of times each word appears in the corpus.

Details

Lift is calculated by dividing the topic-word distribution by the empirical word count probability distribution. In other words, the lift for word v in topic k can be calculated as:

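Lift_{k,v} = β_{k,v} / (w_v / Σ_{v'} w_{v'})

where β_{k,v} = exp(logbeta[k, v]) is the probability of word v in topic k, and w_v is the count of word v in wordcounts, so the denominator is the empirical probability of word v in the corpus.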

We include this after seeing it used effectively in Matt Taddy's work, including his excellent maptpx package. Definitions are given in Taddy (2012).
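
As a minimal sketch of the formula above, the following base R code computes lift on a toy example. The vocabulary, the beta values, and the counts are invented for illustration, and the variable names (emp_prob, log_lift, top_words) are assumptions of this sketch rather than part of the package. It is not necessarily how calclift is implemented internally.

```r
# Toy inputs: K = 2 topics, V = 4 words (all values are made up).
vocab <- c("the", "model", "topic", "lift")
beta <- rbind(c(0.70, 0.15, 0.10, 0.05),
              c(0.55, 0.05, 0.10, 0.30))
logbeta <- log(beta)                 # K by V matrix of log probabilities
wordcounts <- c(650, 75, 75, 200)    # V-length vector of corpus counts

# Empirical word probabilities: w_v / sum(w).
emp_prob <- wordcounts / sum(wordcounts)

# Lift on the log scale: log(beta[k, v]) - log(emp_prob[v]).
# sweep() subtracts the v-th entry of log(emp_prob) from column v.
log_lift <- sweep(logbeta, 2, log(emp_prob), FUN = "-")
lift <- exp(log_lift)

# Order the vocabulary within each topic from highest to lowest lift.
# The result is a V by K character matrix, one column per topic.
top_words <- apply(lift, 1, function(x) vocab[order(x, decreasing = TRUE)])
```

Working on the log scale avoids exponentiating very small probabilities before dividing. Because the ordering of words is the same for lift and log lift, the final exp() is only needed if the lift values themselves are of interest.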

References

Taddy, Matthew. 2012. "On Estimation and Selection for Topic Models." AISTATS, JMLR W&CP 22.

See Also

labelTopics


stm documentation built on Jan. 13, 2021, 10:45 a.m.