fit_lda_c: Main C++ Gibbs sampler for Latent Dirichlet Allocation
In tidylda: Latent Dirichlet Allocation Using 'tidyverse' Conventions

fit_lda_c

R Documentation

Main C++ Gibbs sampler for Latent Dirichlet Allocation

Description

This is the C++ Gibbs sampler for LDA. "Abandon all hope, ye who enter here."

Usage

fit_lda_c(
  Docs,
  Zd_in,
  Cd_in,
  Cv_in,
  Ck_in,
  alpha_in,
  eta_in,
  iterations,
  burnin,
  optimize_alpha,
  calc_likelihood,
  Beta_in,
  freeze_topics,
  threads = 1L,
  verbose = TRUE
)

Arguments

`Docs`	List of token indices with one element for each document and one entry for each token, as formatted by `initialize_topic_counts`
`Zd_in`	List of topic assignments with one element for each document and one entry for each token, as formatted by `initialize_topic_counts`
`Cd_in`	IntegerMatrix denoting counts of topics in documents
`Cv_in`	IntegerMatrix denoting counts of tokens in topics
`Ck_in`	IntegerVector denoting counts of topics across all tokens
`alpha_in`	NumericVector prior for topics over documents
`eta_in`	NumericMatrix for prior of tokens over topics
`iterations`	int number of Gibbs iterations to run in total
`burnin`	int number of burn-in iterations
`optimize_alpha`	bool; whether to optimize alpha at each iteration
`calc_likelihood`	bool; whether to calculate the log likelihood at each iteration
`Beta_in`	NumericMatrix denoting probability of tokens in topics
`freeze_topics`	bool; set to `TRUE` when making predictions
`threads`	unsigned integer number of parallel threads; for now, nothing is actually parallel
`verbose`	bool; whether to print a progress bar
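
The count arguments are tallies over per-token topic assignments. The sketch below shows one way such tallies could be built. It is not the package's code: `TopicCounts` and `tally_counts` are hypothetical names, and the 0-based indexing, the reading of `Docs` as token indices and `Zd_in` as topic assignments, and the matrix orientations (documents by topics for `Cd`, topics by vocabulary for `Cv`) are all assumptions.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical container for the three count structures.
struct TopicCounts {
  std::vector<std::vector<int>> Cd;  // documents x topics (assumed orientation)
  std::vector<std::vector<int>> Cv;  // topics x vocabulary (assumed orientation)
  std::vector<int> Ck;               // number of tokens assigned to each topic
};

// Tally counts from token indices (docs) and topic assignments (zd).
// Both lists are assumed to be 0-indexed and to have matching shapes.
TopicCounts tally_counts(const std::vector<std::vector<int>>& docs,
                         const std::vector<std::vector<int>>& zd,
                         std::size_t n_topics, std::size_t n_vocab) {
  TopicCounts out;
  out.Cd.assign(docs.size(), std::vector<int>(n_topics, 0));
  out.Cv.assign(n_topics, std::vector<int>(n_vocab, 0));
  out.Ck.assign(n_topics, 0);
  for (std::size_t d = 0; d < docs.size(); ++d) {
    for (std::size_t n = 0; n < docs[d].size(); ++n) {
      const int v = docs[d][n];  // token index of the n-th token in document d
      const int k = zd[d][n];    // topic currently assigned to that token
      ++out.Cd[d][k];
      ++out.Cv[k][v];
      ++out.Ck[k];
    }
  }
  return out;
}
```

Under these assumptions every token contributes exactly one count to each structure, so the entries of `Ck` sum to the total number of tokens in the corpus.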

Details

Arguments ending in `_in` are copied, and this function modifies the copies rather than the originals. `eta_in` and `Beta_in` are only converted from matrices to nested `std::vector` for speed, reliability, and thread safety. All other `_in` arguments may be explicitly modified during training.
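
To illustrate the matrix-to-nested-vector conversion, here is a minimal sketch in plain C++. It is not the package's implementation: `to_nested` is a hypothetical name, and the input is assumed to be a flat column-major buffer, which is how R stores matrices.

```cpp
#include <cstddef>
#include <vector>

// Copy a column-major flat buffer into a row-indexed nested vector.
// The result owns its data, so later writes do not touch the input.
std::vector<std::vector<double>> to_nested(const std::vector<double>& flat,
                                           std::size_t nrow,
                                           std::size_t ncol) {
  std::vector<std::vector<double>> out(nrow, std::vector<double>(ncol));
  for (std::size_t j = 0; j < ncol; ++j) {
    for (std::size_t i = 0; i < nrow; ++i) {
      out[i][j] = flat[j * nrow + i];  // element (i, j) in column-major order
    }
  }
  return out;
}
```

Because the nested vector is an independent copy, a sampler working on it cannot alter the matrix held by the caller.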

Value

Returns a list with the following entries.

`Cd` is a matrix counting the number of times each topic is sampled per document.

`Cv` is a matrix counting the number of times each topic is sampled per token.

`Cd_mean` is the same as `Cd`, but with values averaged across the iterations after the `burnin` iterations.

`Cv_mean` is the same as `Cv`, but with values averaged across the iterations after the `burnin` iterations.

`Cd_sum` is the same as `Cd`, but with values summed across the iterations after the `burnin` iterations.

`Cv_sum` is the same as `Cv`, but with values summed across the iterations after the `burnin` iterations.

`log_likelihood` is a matrix with one row indexing iterations and one row holding the log likelihood for each iteration.

`alpha` is a vector of the document-topic prior.

`eta` is a matrix of the topic-token prior.
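
The `_mean` and `_sum` entries differ only by a scaling factor. The sketch below shows that relationship under one assumption: that exactly `iterations - burnin` iterations are kept after burn-in. The function name `mean_from_sum` is hypothetical, and the sampler's own bookkeeping may count the kept iterations differently.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Divide post-burn-in sums by the number of kept iterations to get means.
// Assumes the number of kept iterations is iterations - burnin.
std::vector<std::vector<double>> mean_from_sum(
    const std::vector<std::vector<long long>>& sum,
    int iterations, int burnin) {
  const int kept = iterations - burnin;
  if (kept <= 0) {
    throw std::invalid_argument("burnin must be smaller than iterations");
  }
  std::vector<std::vector<double>> mean(sum.size());
  for (std::size_t i = 0; i < sum.size(); ++i) {
    mean[i].reserve(sum[i].size());
    for (long long s : sum[i]) {
      mean[i].push_back(static_cast<double>(s) / kept);
    }
  }
  return mean;
}
```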

tidylda documentation built on May 29, 2024, 11:03 a.m.
