View source: R/topic_coherence.R
A function to calculate topic coherence for a given topic using the formulation in "Optimizing Semantic Coherence in Topic Models" available here: <http://dirichlet.net/pdf/mimno11optimizing.pdf>
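For reference, the coherence metric defined in the cited paper can be written as below. The notation follows the paper rather than this function's arguments: $V^{(t)} = (v_1^{(t)}, \ldots, v_M^{(t)})$ is the list of the $M$ most probable words in topic $t$, $D(v)$ is the number of documents containing word $v$, and $D(v, v')$ is the number of documents containing both $v$ and $v'$. How closely the implementation in R/topic_coherence.R follows this expression is not stated on this page.

```latex
C\bigl(t; V^{(t)}\bigr) = \sum_{m=2}^{M} \sum_{l=1}^{m-1}
  \log \frac{D\bigl(v_m^{(t)}, v_l^{(t)}\bigr) + 1}{D\bigl(v_l^{(t)}\bigr)}
```

Scores are typically negative, and values closer to zero indicate top words that co-occur in documents more often.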
topic_coherence(top_words, document_term_matrix, vocabulary = NULL,
                numeric_top_words = FALSE, K = length(top_words))
top_words
    A string vector of top words associated with a topic. If numeric_top_words == TRUE, a numeric vector of word indices instead.
document_term_matrix
    A numeric matrix or data.frame with dimensions (number of documents) x (vocabulary length), where each entry is the count of word j in document i.
vocabulary
    A string vector containing all words in the vocabulary. The vocabulary vector must have the same number of entries as the number of columns in document_term_matrix, and the word counted in the i'th column of document_term_matrix must correspond to the i'th entry in vocabulary. If numeric_top_words == TRUE, this argument does not need to be supplied.
numeric_top_words
    Defaults to FALSE. If TRUE, the function expects a vector of word indices instead of a string vector of actual words.
K
    The number of top words to use in calculating the topic coherence. Defaults to the length of top_words. Common values are in the range 10-20.
The coherence score for the given topic.
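To make the inputs and the returned value concrete, here is a minimal from-scratch sketch of the coherence calculation from the cited paper. The helper name coherence_sketch and the toy data are hypothetical and are not part of the documented package; the sketch handles only string top words, and the documented topic_coherence() may differ in details such as its handling of K or of numeric indices.

```r
# Hypothetical helper (not the package's implementation): a sketch of the
# coherence score from "Optimizing Semantic Coherence in Topic Models".
coherence_sketch <- function(top_words, document_term_matrix, vocabulary) {
  dtm <- as.matrix(document_term_matrix)
  # Map each top word to its column in the document-term matrix.
  idx <- match(top_words, vocabulary)
  if (anyNA(idx)) stop("all top_words must appear in vocabulary")
  # 0/1 presence matrix: does document i contain top word m?
  present <- (dtm[, idx, drop = FALSE] > 0) * 1
  # Document frequency of each top word.
  doc_freq <- colSums(present)
  # Co-document frequency for every pair of top words.
  co_freq <- crossprod(present)
  score <- 0
  # Sum over ordered pairs (m, l) with l ranked above m.
  for (m in seq_along(idx)[-1]) {
    for (l in seq_len(m - 1)) {
      score <- score + log((co_freq[m, l] + 1) / doc_freq[l])
    }
  }
  score
}

# Toy corpus: 4 documents, 3 vocabulary words (assumed data for illustration).
vocabulary <- c("apple", "banana", "cherry")
dtm <- matrix(c(1, 1, 0,
                2, 0, 1,
                1, 3, 1,
                1, 0, 1), nrow = 4, byrow = TRUE)

# "apple" appears in 4 documents and co-occurs with "banana" in 2,
# so the score is log((2 + 1) / 4) = log(0.75).
result <- coherence_sketch(c("apple", "banana"), dtm, vocabulary)
print(result)
```

The sketch assumes every top word occurs in at least one document; a top word with zero document frequency would make the ratio undefined.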