textmining: Integration of Text Mining and Topic Modeling Packages
Version 0.0.1

A framework for text mining and topic modeling. It provides an easy interface to different topic modeling methods in R by integrating existing packages. Full functionality requires a local installation of 'TreeTagger'.

Author: Jan Idziak [cre], Maciej Eder [aut], Tomasz Melcer [aut], Andrew Goldstone [cph]
Date of publication: 2016-09-26 00:56:23
Maintainer: Jan Idziak <JanIdziak@gmail.com>
License: GPL-3
Version: 0.0.1
Package repository: CRAN
Installation: Install the latest version of this package by entering the following in R:
install.packages("textmining")
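
Once the package is installed, the man pages indexed on this page can also be browsed from within R. The following is a minimal sketch using only base R help utilities; it assumes the package installed successfully and that its dependencies load on your system. The variable name `installed` is illustrative only.

```r
# Check whether the package can be loaded before trying to browse it.
installed <- requireNamespace("textmining", quietly = TRUE)

if (installed) {
  # List the objects exported by the package namespace.
  print(sort(getNamespaceExports("textmining")))
  # Show the package help index, then the man page for tmCorpus.
  print(help(package = "textmining"))
  print(help("tmCorpus", package = "textmining"))
} else {
  message("textmining is not installed; run install.packages(\"textmining\") first.")
}
```

The guard around the help calls means the snippet reports a missing installation with a message rather than stopping with an error.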

Man pages

as.tmCorpus: Create textmining Corpus
filter_documents: Function to filter tagged text
getDoc: Function to access documents for textmining objects
getMeta: Function to access meta data for textmining objects
make_tabled: Function to create tmWordCountsTable object from tmParsed
mallet_prepare: Helper function to use mallet topic modelling with tmCorpus
ngram: Function to create ngram docs
parse: Function to parse tmCorpus. As an output we have tmParsed...
predict: predict for 'tmTopicModel' object
setDoc: Function to change documents for textmining objects
setMeta: Function to access meta data for textmining objects
tabler: Helper function for tabularising documents
terms: Function to return the most frequent terms of tmTopicModels
tmCorpus: Function to create tmCorpus
tmMetaData: Function to create tmMetaData
tmParsed: Function to create tmParsed
tmTaggedCorpus: Function to create tmTaggedCorpus
tmTextDocument: Function to create single tmTextDocument with meta data. The...
tmWordCountsTable: Function to create tmWordCountsTable
topic_network: Function to plot topic network
topic_table: Function to calculate topics and words arrays from the mallet...
topic_wordcloud: Simple wordcloud visualization of the topics.
train: train for 'tmCorpus' object

Functions

TermDocumentMatrix Source code
TermDocumentMatrix.tmCorpus Source code
as.tmCorpus Man page Source code
as.tmCorpus.VCorpus Source code
as.tmCorpus.default Source code
as.tmCorpus.stylo.corpus Source code
as.tmCorpus.tmTaggedCorpus Source code
c.tmCorpus Source code
compatible_instances Source code
content.character Source code
content.tmCorpus Source code
content.tmParsed Source code
content.tmTaggedCorpus Source code
content.tmTextDocument Source code
content.tmWordCountsTable Source code
filter_documents Man page Source code
format.tmCorpus Source code
getDoc Man page Source code
getDoc.default Source code
getDoc.tmTextDocument Source code
getMeta Man page Source code
getMeta.default Source code
getMeta.tmMetaData Source code
getMeta.tmTextDocument Source code
infer_topics Source code
inferencer Source code
instances_lengths Source code
make_tabled Man page Source code
mallet_prepare Man page Source code
meta.tmCorpus Source code
meta.tmParsed Source code
meta.tmTaggedCorpus Source code
meta.tmTextDocument Source code
ngram Man page Source code
parse Man page Source code
predict Man page
predict.LDA Man page Source code
predict.jobjRef Man page Source code
predict.tmTopicModel Man page Source code
predict_mallet_helper Source code
print.tmCorpus Source code
read_inferencer Source code
setDoc Man page Source code
setDoc.default Source code
setDoc.tmTextDocument Source code
setMeta Man page Source code
setMeta.default Source code
setMeta.tmMetaData Source code
setMeta.tmTextDocument Source code
sorted_topic_words Source code
stopwords_temp Source code
stopwords_temp_mallet Source code
tabler Man page Source code
tagdocument Source code
tagtmCorpus_helper Source code
termFreq_tm Source code
terms Man page Source code
terms.jobjRef Man page Source code
terms.tmTopicModel Man page Source code
tmCorpus Man page Source code
tmExternalCoprus Source code
tmExternalCoprus.stylo Source code
tmExternalCoprus.tm Source code
tmExternalParsedCoprus Source code
tmMetaData Man page Source code
tmParsed Man page Source code
tmReadDirCorpus Source code
tmTaggedCorpus Man page Source code
tmTaggedCorpus.list Source code
tmTaggedCorpus.tmCorpus Source code
tmTextDocument Man page Source code
tmTopicModel Source code
tmWordCountsTable Man page Source code
tm_filter.tmCorpus Source code
tm_filter.tmTaggedCorpus Source code
tm_index.tmCorpus Source code
tm_index.tmTaggedCorpus Source code
tm_map.tmCorpus Source code
topic_network Man page Source code
topic_table Man page Source code
topic_wordcloud Man page Source code
train Man page
train.DocumentTermMatrix Man page Source code
train.tmCorpus Man page Source code
train_mallet_helper Source code
train_topicmodels_helper Source code
words.PlainTextDocument Source code
write_inferencer Source code

Files

tests
tests/testthat.R
tests/testthat
tests/testthat/testExterior.R
tests/testthat/testTransformationFunctions.R
tests/testthat/testClasses.R
NAMESPACE
R
R/classes.R
R/get_set_functions.R
R/transformations.R
R/content.R
R/exterior.R
R/helper_predict.R
R/import.R
README.md
MD5
DESCRIPTION
man
man/setMeta.Rd
man/parse.Rd
man/tmMetaData.Rd
man/tmParsed.Rd
man/mallet_prepare.Rd
man/as.tmCorpus.Rd
man/topic_network.Rd
man/tmTextDocument.Rd
man/ngram.Rd
man/topic_wordcloud.Rd
man/setDoc.Rd
man/tmTaggedCorpus.Rd
man/predict.Rd
man/train.Rd
man/filter_documents.Rd
man/tabler.Rd
man/tmCorpus.Rd
man/getDoc.Rd
man/tmWordCountsTable.Rd
man/make_tabled.Rd
man/terms.Rd
man/topic_table.Rd
man/getMeta.Rd
textmining documentation built on May 19, 2017, 6:43 p.m.
