dtm_as_bow | R Documentation
Generate Gensim input from R.
dtm_as_bow(dtm)
dtm_as_dictionary(dtm)
dtm: A 'DocumentTermMatrix'.
Gensim's LDA modelling methods expect a corpus in a bag-of-words data format, referred to as "BOW". The utility function 'dtm_as_bow()' converts a sparse matrix (class 'simple_triplet_matrix') into the BOW input format required by gensim.
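In gensim's bag-of-words format, each document is a list of (token id, count) pairs. The base R sketch below illustrates the idea on a toy triplet representation. It is not the implementation behind 'dtm_as_bow()' or 'dtm_as_dictionary()': the helper 'triplet_to_bow()', the toy data and the shift to zero-based token ids are assumptions made for illustration only.

```r
# Hypothetical triplet representation of a 2 x 3 document-term matrix:
# i = document (row) index, j = term (column) index, v = count.
i <- c(1L, 1L, 2L)
j <- c(1L, 3L, 2L)
v <- c(2, 1, 4)
terms <- c("oil", "price", "trade")

# Group (token id, count) pairs by document. Token ids are shifted to
# zero-based values on the assumption that a Python consumer expects this.
triplet_to_bow <- function(i, j, v) {
  pairs <- Map(function(id, n) list(id - 1L, n), j, v)
  unname(split(pairs, i))
}

bow_sketch <- triplet_to_bow(i, j, v)

# A lookup from token id to token, similar in spirit to a dictionary.
id2token <- setNames(terms, seq_along(terms) - 1L)
```

Here 'bow_sketch' holds one list element per document, and 'id2token' maps each zero-based id back to its term.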
Andreas Blaette
if (requireNamespace("reticulate") && reticulate::py_module_available("gensim")){
  library(polmineR)
  use("RcppCWB", corpus = "REUTERS")

  dtm <- corpus("REUTERS") %>%
    split(s_attribute = "id") %>%
    as.DocumentTermMatrix(p_attribute = "word", verbose = FALSE)

  bow <- dtm_as_bow(dtm)
  dict <- dtm_as_dictionary(dtm)
}