lda_svi: Fit a Latent Dirichlet Allocation model to a text corpus

In lda.svi: Fit Latent Dirichlet Allocation Models using Stochastic Variational Inference


View source: R/lda_svi.R

Fit a Latent Dirichlet Allocation model to a text corpus

lda_svi(dtm, passes = 10, batchsize = 256, maxiter = 100, K,
  eta = 1/K, alpha = 1/K, kappa = 0.7, tau_0 = 1024,
  tidy_output = TRUE)

`dtm`	A DocumentTermMatrix (with term frequency weighting) from the tm package.
`passes`	The number of passes over the whole corpus, i.e. how many times the local variables for each document are updated.
`batchsize`	The number of documents in each minibatch.
`maxiter`	The maximum number of iterations of the "E step" for each document (the updating of the per-document parameters within each minibatch). The default of 100 follows the authors' reference implementation in Python.
`K`	The number of topics.
`eta`	Dirichlet prior hyperparameter for the document-specific topic proportions.
`alpha`	Dirichlet prior hyperparameter for the topic-specific term proportions.
`kappa`	Learning rate parameter. Lower values give greater weight to later iterations. For guaranteed convergence to a local optimum, kappa must lie in the interval (0.5, 1].
`tau_0`	Learning rate parameter. Higher values reduce the influence of early iterations.
`tidy_output`	If TRUE, the parameter estimates are returned as 'long' data frames; otherwise they are returned as matrices.
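
The roles of `kappa` and `tau_0` are easiest to see from the step-size schedule given in Hoffman et al. (2010), rho_t = (tau_0 + t)^(-kappa). The sketch below assumes that lda_svi follows this schedule and that every minibatch, including a final partial one, counts as one update. The helper `step_size` and the corpus size are hypothetical and are not part of the package.

```r
# Step-size schedule from Hoffman et al. (2010): rho_t = (tau_0 + t)^(-kappa).
# step_size() is a hypothetical helper, not a function exported by lda.svi.
step_size <- function(t, kappa = 0.7, tau_0 = 1024) {
  stopifnot(kappa > 0.5, kappa <= 1, tau_0 >= 0)
  (tau_0 + t)^(-kappa)
}

# Illustrative corpus size (an assumption), with the default passes and batchsize
n_docs    <- 10000
passes    <- 10
batchsize <- 256

# Assumed number of minibatch updates over all passes
n_updates <- passes * ceiling(n_docs / batchsize)
t <- seq_len(n_updates)

rho_default   <- step_size(t)                # kappa = 0.7, tau_0 = 1024
rho_low_kappa <- step_size(t, kappa = 0.51)  # slower decay of the step size
rho_small_tau <- step_size(t, tau_0 = 1)     # much larger early steps

# Ratio of the last to the first step size: the closer to 1,
# the more weight later minibatches retain relative to early ones
ratio_default   <- rho_default[n_updates] / rho_default[1]
ratio_low_kappa <- rho_low_kappa[n_updates] / rho_low_kappa[1]
```

Under these assumptions, lowering `kappa` keeps the ratio closer to 1, while raising `tau_0` shrinks the first few step sizes so that early minibatches move the global parameters less.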

The implementation here is based on the Python implementation by Matthew D. Hoffman accompanying the paper by Hoffman, Bach, and Blei (2010) listed in the references.

lda_svi returns a named list of length two. The element named 'beta' gives the proportions of the terms within each topic, while the element named 'theta' gives the proportions of the topics within each document. If the tidy_output argument is TRUE these are data frames in 'long' format; otherwise they are matrices.
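
To illustrate the difference between the matrix and 'long' forms, the following sketch reshapes a small hand-made topic-term matrix in base R. It does not call lda_svi: the matrix orientation (topics in rows, terms in columns) and the column names `topic`, `term` and `value` are assumptions made for illustration and may not match what the package returns.

```r
# Toy topic-term matrix: 2 topics (rows) x 3 terms (columns); each row sums to 1.
# Orientation, values and names are assumptions for illustration only.
beta_wide <- matrix(
  c(0.7, 0.2, 0.1,
    0.1, 0.3, 0.6),
  nrow = 2, byrow = TRUE,
  dimnames = list(topic = c("1", "2"), term = c("oil", "court", "vote"))
)

# 'Long' format: one row per (topic, term) pair
beta_long <- as.data.frame(as.table(beta_wide),
                           responseName = "value",
                           stringsAsFactors = FALSE)

# Highest-probability term within each topic
top_terms <- do.call(rbind, lapply(
  split(beta_long, beta_long$topic),
  function(d) d[which.max(d$value), ]
))
print(top_terms)
```

The long form is convenient for grouping and plotting, whereas the matrix form is more compact and suits linear-algebra operations.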

Hoffman, M., Bach, FM., and Blei, DM. (2010) 'Online Learning for Latent Dirichlet Allocation', _Conference and Workshop on Neural Information Processing Systems_.

Hoffman, M., Blei, DM., Wang, C., and Paisley, J. (2013) 'Stochastic Variational Inference', _Journal of Machine Learning Research_. Preprint: https://arxiv.org/abs/1206.7051

library(lda.svi)      # provides lda_svi()
library(topicmodels)  # provides the AssociatedPress document-term matrix
data(AssociatedPress)
ap_lda_fit <- lda_svi(AssociatedPress, passes = 1, K = 50)
# I use a single pass because CRAN requires examples to run quickly;
# generally one would use more. 20 often seems to be sufficient as a rule of thumb,
# but it might be worth experimenting with more or fewer.

lda.svi documentation built on July 12, 2019, 5:03 p.m.
