This version is a patch. In this version I have
- Fixed a bug in CalcHellingerDist() and CalcJSDivergence() that sometimes caused inputs to be overwritten.
- Updated the documentation of FitCtmModel() to better explain how to pass control arguments to CTM's underlying function.
- Allowed for tibble or data.frame (instead of only data.frame) in the following functions: SummarizeTopics, GetTopTerms, TermDocFreq (Thanks to Mattias for the PR)
This version is a patch. In this version I have
FitLdaModel
This version is a patch. In this version I have
- update.lda_topic_model method.
- posterior.lda_topic_model to sample from the posterior of an LDA topic model.
This version is a patch. In this version I have
- tidytext alongside textmineR
This version is a patch in response to issues revealed by automatic checks upon submission to CRAN plus an additional issue I encountered along the way.
I have
- Used the CRAN template for my MIT LICENSE file
- Modified the example of the LabelTopics function to speed up run time for that example
- Modified vignettes to run in less time
- Added a Makevars file to keep compiled code small on Ubuntu
Please read below for major updates between v2.x.x and v3.x.x
This version significantly changes textmineR.
CalcPhiPrime
FitLdaModel has changed significantly.
R-squared is optionally calculated at the time of model fit.
Supported topic models (LDA, LSA, CTM) are now object-oriented, creating their own S3 classes. These classes have their own predict methods, meaning you do not have to do your own math to make predictions for new documents.
A new function SummarizeTopics has been added.
tm is no longer a dependency for stopwords. We now use the stopwords package. The extended result of this is that there is no longer any Java dependency.
Several packages have been moved from "Imports" to "Suggests". The result is a faster install and lower likelihood of install failure based on packages with system dependencies. (Looking at you, topicmodels!)
Finally, I have changed the textmineR license to the MIT license. Note, however, that some dependencies may have more restrictive licenses. So if you're looking to use textmineR in a commercial project, you may want to dig deeper into what is/isn't permissible.
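As a rough illustration of the object-oriented interface described in the v3 notes above, here is a minimal sketch of fitting an LDA model and predicting on new documents. It assumes textmineR (>= 3.0) is installed and uses the nih_sample_dtm example data that ships with the package; the values for k, iterations, and burnin are arbitrary choices for a quick run, not recommendations.

```r
# Minimal sketch, assuming textmineR (>= 3.0) is installed.
library(textmineR)

# Example document term matrix bundled with textmineR (100 documents).
data(nih_sample_dtm)

# Fit on the first 20 documents. k, iterations and burnin are
# illustrative values only. calc_r2 = TRUE requests R-squared at fit time.
model <- FitLdaModel(
  dtm = nih_sample_dtm[1:20, ],
  k = 5,
  iterations = 200,
  burnin = 175,
  calc_r2 = TRUE
)

# The fitted model carries its own S3 class.
class(model)

# R-squared computed during the fit.
model$r2

# Predict topic distributions for the remaining documents with the
# class's predict method, so no manual math is needed.
pred <- predict(model, newdata = nih_sample_dtm[21:100, ], method = "dot")

# One row per new document, one column per topic.
dim(pred)
```

The "dot" method is used here because it is fast; predict also accepts method = "gibbs", which takes its own iterations and burnin arguments.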
CalcProbCoherence
- CreateTcm text2vec API. Functionality is unchanged.
- verbose option to CreateDtm and CreateTcm to suppress status messages.
- GetVocabFromDtm to get a text2vec vocabulary object from a dgCMatrix document term matrix.
- CreateDtm and CreateTcm in response to updates to text2vec.
- text2vec's latest optimizations to follow.
- CreateDtm and CreateTcm. remove_punctuation now supports non-English characters.
- TmParallelApply. Added an option to declare the environment to search for your export list. The default for that argument just searches the local environment. The default should cover ~95% of use cases. (And avoids a crash on Windows OS)
- FitLdaModel. Use of the ... argument now allows you to control TmParallelApply, lda::lda.collapsed.gibbs.sampler, and topicmodels::LDA without error.
- FitCtmModel where the ... argument now goes to topicmodels::CTM's control argument.
- CreateTcm to return objects of class dgCMatrix. This allows you to run functions like FitLdaModel on a TCM.