CalcGamma: Calculate a matrix whose rows represent P(topic_i|tokens)


View source: R/topic_modeling_core.R

Description

This function takes a phi matrix (P(token|topic)) and a theta matrix (P(topic|document)) and returns the gamma matrix (P(topic|token)), also referred to as phi prime. This matrix can be used for classifying new documents and for alternative topic labels.
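
The relationship between the three matrices follows from Bayes' rule: P(topic|token) = P(token|topic) P(topic) / P(token). The base R sketch below works through that arithmetic on toy matrices. It is an illustration under stated assumptions, not necessarily how CalcGamma is implemented: the values and the names `p_d`, `p_t`, `joint`, `p_w`, and `gamma_sketch` are made up for the example, and documents are assumed to be equally likely.

```r
# Toy inputs (hypothetical values, for illustration only)
phi <- matrix(c(0.5, 0.3, 0.2,
                0.1, 0.2, 0.7),
              nrow = 2, byrow = TRUE,
              dimnames = list(c("t_1", "t_2"), c("apple", "bank", "cell")))
theta <- matrix(c(0.9, 0.1,
                  0.4, 0.6,
                  0.2, 0.8),
                nrow = 3, byrow = TRUE,
                dimnames = list(c("d_1", "d_2", "d_3"), c("t_1", "t_2")))

# Assumption: every document is equally likely
p_d <- rep(1 / nrow(theta), nrow(theta))

# P(topic) = sum over documents of P(topic | document) * P(document)
p_t <- as.vector(p_d %*% theta)

# Joint P(topic, token): scale row i of phi by P(topic_i)
joint <- phi * p_t

# P(token) = sum over topics of the joint
p_w <- colSums(joint)

# Bayes' rule: divide each column of the joint by P(token)
gamma_sketch <- sweep(joint, 2, p_w, "/")
```

Each column of `gamma_sketch` is a probability distribution over topics for one token, so the columns sum to 1.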

Usage

CalcGamma(phi, theta, p_docs = NULL, correct = TRUE)

Arguments

phi

The phi matrix whose rows index topics and columns index words. The i, j entries are P(word_j | topic_i).

theta

The theta matrix whose rows index documents and columns index topics. The i, j entries are P(topic_j | document_i).

p_docs

A numeric vector of length nrow(theta) that is proportional to the number of terms in each document. This argument is optional and defaults to NULL.
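
One way to build such a vector is from the row sums of a document-term matrix. The sketch below uses a small dense toy matrix named `dtm` (hypothetical values); for a sparse matrix from the Matrix package, `Matrix::rowSums` would be the analogous call. Normalizing to sum to 1 is optional here, since the argument only needs to be proportional to document length.

```r
# Toy document-term matrix: rows are documents, columns are tokens
# (hypothetical counts, for illustration only)
dtm <- matrix(c(2, 0, 1,
                0, 3, 3,
                1, 1, 0),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("d_1", "d_2", "d_3"), c("apple", "bank", "cell")))

# Number of terms in each document
doc_lengths <- rowSums(dtm)

# Optional normalization so the weights sum to 1
p_docs <- doc_lengths / sum(doc_lengths)
```

The resulting vector has one entry per document, in the same order as the rows of theta, provided theta was fitted on the same `dtm`.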

correct

Logical. Do you want to set NAs or NaNs in the final result to zero? Useful when hitting computational underflow. Defaults to TRUE. Set to FALSE for troubleshooting or diagnostics.
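
To see why such values can appear, consider a token whose probability under every topic has underflowed to zero: normalizing its column divides 0 by 0 and yields NaN. The sketch below reproduces that situation with toy numbers and then applies the kind of replacement this argument describes. The names `joint`, `raw`, and `cleaned` are illustrative only, and the exact replacement code inside CalcGamma may differ.

```r
# Toy joint P(topic, token); the second token has zero mass under every topic
joint <- matrix(c(0.2, 0.1,
                  0.0, 0.0),
                nrow = 2,
                dimnames = list(c("t_1", "t_2"), c("apple", "rare")))

# Normalizing by column sums gives 0 / 0 = NaN for the second token
raw <- sweep(joint, 2, colSums(joint), "/")

# Replace NA and NaN entries with zero; is.na() is TRUE for both
cleaned <- raw
cleaned[is.na(cleaned)] <- 0
```

Inspecting `raw` before the replacement is the equivalent of running with correct = FALSE: the NaN entries show which tokens lost all of their probability mass.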

Value

Returns a matrix whose rows correspond to topics and whose columns correspond to tokens. The i, j entry corresponds to P(topic_i|token_j).
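
As a rough illustration of the document-classification use mentioned in the description, a matrix of this shape can be combined with token frequencies from unseen documents: P(topic|document) is approximated by summing P(topic|token) P(token|document) over tokens. The sketch below uses hypothetical toy values; `gamma_toy`, `new_dtm`, and `new_theta` are names invented for the example, and this is one possible approach rather than a description of the package's prediction method.

```r
# Toy P(topic | token) matrix: rows are topics, columns are tokens
gamma_toy <- matrix(c(0.8, 0.5, 0.2,
                      0.2, 0.5, 0.8),
                    nrow = 2, byrow = TRUE,
                    dimnames = list(c("t_1", "t_2"), c("apple", "bank", "cell")))

# Token counts for two unseen documents (same token order as gamma_toy)
new_dtm <- matrix(c(3, 1, 0,
                    0, 1, 3),
                  nrow = 2, byrow = TRUE,
                  dimnames = list(c("new_1", "new_2"), c("apple", "bank", "cell")))

# P(token | document): divide each row by its total count
p_w_d <- new_dtm / rowSums(new_dtm)

# P(topic | document) = sum over tokens of P(topic | token) * P(token | document)
new_theta <- p_w_d %*% t(gamma_toy)
```

Each row of `new_theta` sums to 1 because every column of `gamma_toy` does, so the rows can be read as topic distributions for the new documents.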

Examples

# Load a pre-formatted dtm and topic model
data(nih_sample_topic_model) 

# Make a gamma matrix, P(topic|words)
gamma <- CalcGamma(phi = nih_sample_topic_model$phi, 
                   theta = nih_sample_topic_model$theta)

textmineR documentation built on June 28, 2021, 9:08 a.m.