Supreme: Make it easier to apply LDA topic models to a corpus of...

This package provides tools that make it easier to build a corpus of documents starting from the original XML files. It also provides a set of functions for reducing the dimensionality (number of columns) of the resulting document-term matrix, in both the supervised and the unsupervised case. Finally, it implements a new strategy for selecting the number of topics, based on logistic classification. This strategy can be considered an alternative to the commonly used perplexity criterion.
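To make the idea of reducing the number of columns concrete, here is a minimal sketch of one common unsupervised approach: dropping terms that appear in too few documents. It is written in plain Python for illustration only and does not reproduce the package's own functions. The names build_dtm, drop_rare_terms and min_docs are hypothetical, and the document-frequency rule is an assumption, not necessarily the rule the package applies.

```python
from collections import Counter


def build_dtm(docs):
    """Return (vocab, matrix); matrix[i][j] counts term j in document i."""
    counts = [Counter(doc.lower().split()) for doc in docs]
    vocab = sorted(set().union(*counts))
    matrix = [[c[term] for term in vocab] for c in counts]
    return vocab, matrix


def drop_rare_terms(vocab, matrix, min_docs=2):
    """Keep only the columns whose term occurs in at least min_docs documents."""
    keep = [
        j for j in range(len(vocab))
        if sum(1 for row in matrix if row[j] > 0) >= min_docs
    ]
    reduced_vocab = [vocab[j] for j in keep]
    reduced_matrix = [[row[j] for j in keep] for row in matrix]
    return reduced_vocab, reduced_matrix


# Toy corpus (hypothetical data, whitespace tokenization only).
docs = [
    "topic models describe documents",
    "topic models need a document term matrix",
    "perplexity is one criterion for topic models",
]

vocab, dtm = build_dtm(docs)
small_vocab, small_dtm = drop_rare_terms(vocab, dtm, min_docs=2)
print(len(vocab), "->", len(small_vocab), small_vocab)
```

On this toy corpus only the terms shared by at least two documents survive, so the matrix shrinks from 14 columns to 2. A supervised variant would instead use the class labels to decide which columns to keep.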


