This package provides three categories of functions: frequency analysis of word tokens, creation of a document-term matrix, and topic modelling using LDA.
Frequency analysis of word tokens: returns a data frame of words and their frequencies after initial preprocessing, sparsity control, and TF-IDF analysis. Words from the high-frequency list can then be picked as custom stop words.
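The frequency step can be illustrated with a minimal, language-neutral sketch. It is written in plain Python rather than with the package's own functions, which are not named here. The stop-word list, the `min_doc_share` sparsity threshold, and the particular TF-IDF formula (total count times log inverse document frequency) are all assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {"the", "a", "is", "of", "and", "on"}

def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, drop stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]

def word_frequencies(docs, min_doc_share=0.0):
    """Return (word, count, tfidf) rows sorted by descending count."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    counts = Counter(t for doc in tokenized for t in doc)
    doc_freq = Counter(t for doc in tokenized for t in set(doc))
    rows = []
    for word, count in counts.items():
        # Sparsity control: skip words that occur in too few documents.
        if doc_freq[word] / n_docs < min_doc_share:
            continue
        tfidf = count * math.log(n_docs / doc_freq[word])
        rows.append((word, count, round(tfidf, 3)))
    return sorted(rows, key=lambda r: (-r[1], r[0]))

docs = ["The cat sat on the mat", "The dog chased the cat", "A dog is a loyal pet"]
freq = word_frequencies(docs)
dense = word_frequencies(docs, min_doc_share=0.5)
```

Words at the top of `freq` that carry little meaning for the corpus are the natural candidates for the custom stop-word list.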
Creation of a document-term matrix: repeats the first step, this time also removing the custom stop words, drops any empty documents, and returns a document-term matrix (DTM). This DTM is used to find the optimal number of topics for LDA modelling with 'FindTopicsNumber' from the 'ldatuning' package.
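As a rough sketch of what this step produces, the following plain-Python example builds a small dense DTM, removes custom stop words, and drops documents that end up empty. The function name `build_dtm` and its return shape are hypothetical and are not the package's API; a real DTM would normally be stored as a sparse matrix.

```python
import re

def build_dtm(docs, custom_stop_words=()):
    """Return (vocab, matrix, kept_ids) with one count row per non-empty document."""
    stop = set(custom_stop_words)
    tokenized = [
        [t for t in re.findall(r"[a-z]+", d.lower()) if t not in stop]
        for d in docs
    ]
    # Drop documents that have no tokens left after preprocessing.
    kept = [(i, toks) for i, toks in enumerate(tokenized) if toks]
    vocab = sorted({t for _, toks in kept for t in toks})
    col = {t: j for j, t in enumerate(vocab)}
    matrix = []
    for _, toks in kept:
        row = [0] * len(vocab)
        for t in toks:
            row[col[t]] += 1
        matrix.append(row)
    return vocab, matrix, [i for i, _ in kept]

docs = ["cat sat mat", "the the", "dog chased cat cat"]
vocab, dtm, kept_ids = build_dtm(docs, custom_stop_words={"the"})
```

Here the second document consists only of a custom stop word, so it is removed and `kept_ids` records which original documents the remaining rows correspond to.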
Topic modelling: performs preprocessing along with removal of the custom stop words, uses the number of topics selected with 'ldatuning', and builds a unigram topic model with or without stemming. Returns:
A list of zero-length documents after preprocessing.
A data frame with the top 20 terms in each topic discovered by LDA.
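To make the two return values concrete, here is a small from-scratch sketch of LDA fitted by collapsed Gibbs sampling in plain Python. It is not the package's implementation, and which LDA backend or estimation method the package uses is not stated here. The hyperparameters `alpha` and `beta`, the iteration count, and the toy documents are assumptions chosen for illustration.

```python
import random

def lda_gibbs(docs, k, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA over tokenized docs.

    Returns (topic-word count matrix, vocabulary).
    """
    rng = random.Random(seed)
    vocab = sorted({w for d in docs for w in d})
    wid = {w: i for i, w in enumerate(vocab)}
    v = len(vocab)
    ndk = [[0] * k for _ in docs]      # document-topic counts
    nkw = [[0] * v for _ in range(k)]  # topic-word counts
    nk = [0] * k                       # total tokens per topic
    z = []                             # topic assignment of every token
    for d, doc in enumerate(docs):
        zd = []
        for w in doc:
            t = rng.randrange(k)
            zd.append(t)
            ndk[d][t] += 1
            nkw[t][wid[w]] += 1
            nk[t] += 1
        z.append(zd)
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                t, j = z[d][i], wid[w]
                # Remove the token's current assignment, then resample it.
                ndk[d][t] -= 1
                nkw[t][j] -= 1
                nk[t] -= 1
                weights = [
                    (ndk[d][s] + alpha) * (nkw[s][j] + beta) / (nk[s] + v * beta)
                    for s in range(k)
                ]
                t = rng.choices(range(k), weights=weights)[0]
                z[d][i] = t
                ndk[d][t] += 1
                nkw[t][j] += 1
                nk[t] += 1
    return nkw, vocab

def top_terms(nkw, vocab, n=20):
    """Return the n highest-count terms for each topic."""
    order = range(len(vocab))
    return [
        [vocab[j] for j in sorted(order, key=lambda j: (-row[j], vocab[j]))[:n]]
        for row in nkw
    ]

raw = [["cat", "sat", "mat"], [], ["dog", "chased", "cat"],
       ["stock", "market", "fell"], ["market", "stock", "rose"]]
empty_ids = [i for i, d in enumerate(raw) if not d]  # zero-length documents
docs = [d for d in raw if d]
nkw, vocab = lda_gibbs(docs, k=2)
terms = top_terms(nkw, vocab, n=3)
```

`empty_ids` corresponds to the list of zero-length documents, and `terms` corresponds to the per-topic table of top terms (limited to 3 here instead of 20 because the toy vocabulary is tiny).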