This package provides tools that make it easier to build a corpus of documents starting from the original XML files. It also provides a set of functions for reducing the dimensionality (number of columns) of the resulting document-term matrix in both the supervised and unsupervised cases, and it implements a new strategy for selecting the number of topics based on logistic classification. This strategy can be considered an alternative to the general perplexity criterion.
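To make the idea of reducing the number of columns of a document-term matrix concrete, here is a minimal sketch in base R. It does not use this package's functions, whose names are not given here. The toy corpus, the whitespace tokenisation, and the rule "keep terms that occur in at least two documents" are all assumptions chosen for illustration of the unsupervised case.

```r
# Toy corpus: three short documents (illustrative data, not from the package).
docs <- c(d1 = "topic models find topics",
          d2 = "logistic models classify documents",
          d3 = "topics summarise documents")

# Tokenise on whitespace and collect the sorted vocabulary.
tokens <- strsplit(tolower(docs), "\\s+")
vocab <- sort(unique(unlist(tokens)))

# Document-term matrix: one row per document, one column per term,
# each cell holding the count of that term in that document.
dtm <- t(vapply(tokens,
                function(tk) tabulate(match(tk, vocab), nbins = length(vocab)),
                integer(length(vocab))))
colnames(dtm) <- vocab

# Assumed unsupervised rule: keep only terms present in at least 2 documents.
min_docs <- 2
doc_freq <- colSums(dtm > 0)
dtm_reduced <- dtm[, doc_freq >= min_docs, drop = FALSE]

dim(dtm)
dim(dtm_reduced)
```

On this toy corpus the filter keeps only the terms shared by at least two documents, so the reduced matrix has far fewer columns than the original. A supervised variant would instead score each term against a class label before filtering.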
Package details | |
---|---|
Author | Paolo Fantini <paolo.fantini@uniroma1.com> |
Maintainer | Paolo Fantini <paolo.fantini@uniroma1.com> |
License | GPL-2 |
Version | 0.1.0 |
Package repository | View on GitHub |