Iteratively produces models and then compares the harmonic mean of the log likelihoods in a graphical output.
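The harmonic mean of a set of likelihoods can be computed directly from their logs without leaving log space. The sketch below is a base-R illustration of that calculation only; the helper name `log_harmonic_mean` is hypothetical, and this is not necessarily how `optimal_k` computes it internally.

```r
# Hypothetical helper (not part of the package): log of the harmonic mean
# of likelihoods, given a vector of log likelihoods.
# HM = 1 / mean(1 / L_i), so log(HM) = -log(mean(exp(-log_lik))).
log_harmonic_mean <- function(log_lik) {
    neg <- -log_lik
    m <- max(neg)
    # Subtract the maximum before exponentiating so exp() cannot overflow
    log_mean_inverse <- m + log(mean(exp(neg - m)))
    -log_mean_inverse
}

# Likelihoods 1 and 3 have harmonic mean 2 / (1 + 1/3) = 1.5
small <- log_harmonic_mean(log(c(1, 3)))

# Log likelihoods far too negative for a naive exp() stay finite
large <- log_harmonic_mean(c(-10000, -10010))
```

Shifting by the maximum is the usual log-sum-exp device; a naive `exp(-log_lik)` would overflow for log likelihoods of the size topic models typically produce.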
x | A
max.k | Maximum number of topics to fit (start small [i.e., default of 30] and add as necessary).
harmonic.mean | Logical. If
method | The method to be used for fitting; currently
drop.seed | Logical. If
burnin | Object of class
iter | Object of class
keep | Object of class
... | Other arguments passed to
Returns a data.frame of k (number of topics) and
the associated log likelihood.
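One way to use a table of this shape is to pick the k with the largest log likelihood. The sketch below works on a toy stand-in, not on real output: the column names `k` and `log_lik` and every value in it are made up for illustration, so check the names of the object you actually get back.

```r
# Toy stand-in for the returned data.frame; column names and values are
# hypothetical and only illustrate the selection step.
fits <- data.frame(
    k = 2:6,
    log_lik = c(-5210.4, -5150.9, -5122.3, -5131.7, -5160.2)
)

# Row with the largest (least negative) log likelihood
best <- fits[which.max(fits$log_lik), ]
best_k <- best$k
best_k
```

A single maximum can be noisy, so it is worth looking at the shape of the whole curve before settling on one value.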
Ben Marwick and Tyler Rinker <tyler.rinker@gmail.com>.
http://stackoverflow.com/a/21394092/1000343
http://stats.stackexchange.com/a/25128/7482
Ponweiser, M. (2012). Latent Dirichlet Allocation in R (Diploma Thesis).
Vienna University of Economics and Business, Vienna.
http://epub.wu.ac.at/3558/1/main.pdf
Griffiths, T.L., and Steyvers, M. (2004). Finding scientific topics.
Proceedings of the National Academy of Sciences of the United States of America
101(Suppl 1), 5228-5235. http://www.pnas.org/content/101/suppl_1/5228.full.pdf
## Install/Load Tools & Data
if (!require("pacman")) install.packages("pacman")
pacman::p_load_gh("trinker/gofastr")
pacman::p_load(tm, topicmodels, dplyr, tidyr, devtools, LDAvis, ggplot2)
## Source topicmodels2LDAvis function
devtools::source_url("https://gist.githubusercontent.com/trinker/477d7ae65ff6ca73cace/raw/79dbc9d64b17c3c8befde2436fdeb8ec2124b07b/topicmodels2LDAvis")
data(presidential_debates_2012)
## Generate Stopwords
stops <- c(
    tm::stopwords("english"),
    "governor", "president", "mister", "obama", "romney"
) %>%
    gofastr::prep_stopwords()
## Create the DocumentTermMatrix
doc_term_mat <- presidential_debates_2012 %>%
    with(gofastr::q_dtm_stem(dialogue, paste(person, time, sep = "_"))) %>%
    gofastr::remove_stopwords(stops) %>%
    gofastr::filter_tf_idf() %>%
    gofastr::filter_documents()
opti_k1 <- optimal_k(doc_term_mat)
opti_k1
opti_k2 <- optimal_k(doc_term_mat, harmonic.mean = FALSE)
opti_k2