textmineR: Functions for Text Mining and Topic Modeling

An aid for text mining in R, with a syntax that should be familiar to experienced R users. It wraps several topic models so that they take similarly formatted input and give similarly formatted output, and it provides additional functions for analyzing and diagnosing topic models.

Author: Thomas W. Jones <jones.thos.w@gmail.com>
Date of publication: 2016-11-03 11:11:46
Maintainer: Thomas W. Jones <jones.thos.w@gmail.com>
License: GPL (>= 3)
Version: 2.0.4
URL: https://github.com/TommyJones/textmineR

Man pages

CalcHellingerDist: Calculate Hellinger Distance

CalcJSDivergence: Calculate Jensen-Shannon Divergence

CalcLikelihood: Calculate the log likelihood of a document term matrix given...

CalcPhiPrime: Calculate a matrix whose rows represent P(topic_i|tokens)

CalcProbCoherence: Probabilistic coherence of topics

CalcTopicModelR2: Calculate the R-squared of a topic model.

Cluster2TopicModel: Represent a document clustering as a topic model

CorrectS: Function to remove some forms of pluralization.

CreateDtm: Convert a character vector to a document term matrix.

CreateTcm: Convert a character vector to a term co-occurrence matrix.

DepluralizeDtm: Run the CorrectS function on columns of a document term...

Dtm2Docs: Convert a DTM to a Character Vector of documents

Dtm2Tcm: Turn a document term matrix into a term co-occurrence matrix

Files2Vec: Function for reading text files into R

FitCtmModel: Fit a Correlated Topic Model

FitLdaModel: Fit a topic model using Latent Dirichlet Allocation

FitLsaModel: Fit a topic model using Latent Semantic Analysis

FormatRawLdaOutput: Format Raw Output from 'lda.collapsed.gibbs.sampler'

GetPhiPrime: Calculate a matrix whose rows represent P(topic_i|tokens)

GetProbableTerms: Get cluster labels using a "more probable" method of terms

GetTopTerms: Get Top Terms for each topic from a topic model

HellDist: Hellinger Distance

InternalFunctions: Internal helper functions for 'textmineR'

JSD: Jensen-Shannon Divergence

LabelTopics: Get some topic labels using a "more probable" method of terms

nih: Abstracts and metadata from NIH research grants awarded in...

RecursiveRbind: Recursively call rBind from the Matrix package.

TermDocFreq: Get term frequencies and document frequencies from a document...

TmParallelApply: An OS-independent parallel version of 'lapply'

Vec2Dtm: Convert a character vector to a document term matrix of class...

Files in this package

textmineR
textmineR/src
textmineR/src/CalcSumSquares.cpp
textmineR/src/JSD_cpp.cpp
textmineR/src/HellingerMat.cpp
textmineR/src/Dtm2DocsC.cpp
textmineR/src/JSDmat.cpp
textmineR/src/CalcLikelihoodC.cpp
textmineR/src/RcppExports.cpp
textmineR/src/Hellinger_cpp.cpp
textmineR/NAMESPACE
textmineR/data
textmineR/data/nih_sample_topic_model.rda
textmineR/data/nih_sample.rda
textmineR/data/nih_sample_dtm.rda
textmineR/R
textmineR/R/DepluralizeDtm.R
textmineR/R/CalcHellingerDist.R
textmineR/R/CreateTcm.R
textmineR/R/CalcTopicModelR2.R
textmineR/R/Dtm2Tcm.R
textmineR/R/LabelTopics.R
textmineR/R/Files2Vec.R
textmineR/R/GetTopTerms.R
textmineR/R/CalcJSDivergence.R
textmineR/R/JSD.R
textmineR/R/FitLdaModel.R
textmineR/R/HellDist.R
textmineR/R/GetPhiPrime.R
textmineR/R/FitCtmModel.R
textmineR/R/FitLsaModel.R
textmineR/R/FormatRawLdaOutput.R
textmineR/R/CalcProbCoherence.R
textmineR/R/Vec2Dtm.R
textmineR/R/CorrectS.R
textmineR/R/RcppExports.R
textmineR/R/Dtm2Docs.R
textmineR/R/Cluster2TopicModel.R
textmineR/R/CalcLikelihood.R
textmineR/R/RecursiveRbind.R
textmineR/R/TermDocFreq.R
textmineR/R/GetProbableTerms.R
textmineR/R/CreateDtm.R
textmineR/R/CalcPhiPrime.R
textmineR/R/TmParallelApply.R
textmineR/README.md
textmineR/MD5
textmineR/DESCRIPTION
textmineR/man
textmineR/man/nih.Rd
textmineR/man/JSD.Rd
textmineR/man/InternalFunctions.Rd
textmineR/man/CalcProbCoherence.Rd
textmineR/man/DepluralizeDtm.Rd
textmineR/man/CreateDtm.Rd
textmineR/man/TermDocFreq.Rd
textmineR/man/Cluster2TopicModel.Rd
textmineR/man/CalcPhiPrime.Rd
textmineR/man/GetProbableTerms.Rd
textmineR/man/CalcTopicModelR2.Rd
textmineR/man/CorrectS.Rd
textmineR/man/Dtm2Tcm.Rd
textmineR/man/FitLsaModel.Rd
textmineR/man/GetTopTerms.Rd
textmineR/man/TmParallelApply.Rd
textmineR/man/Files2Vec.Rd
textmineR/man/CalcJSDivergence.Rd
textmineR/man/HellDist.Rd
textmineR/man/LabelTopics.Rd
textmineR/man/FitLdaModel.Rd
textmineR/man/CalcHellingerDist.Rd
textmineR/man/RecursiveRbind.Rd
textmineR/man/CalcLikelihood.Rd
textmineR/man/CreateTcm.Rd
textmineR/man/Dtm2Docs.Rd
textmineR/man/FormatRawLdaOutput.Rd
textmineR/man/Vec2Dtm.Rd
textmineR/man/FitCtmModel.Rd
textmineR/man/GetPhiPrime.Rd
