sentopicmodel: Create a sentopic model

Description

The functions LDA(), JST(), rJST() and sentopicmodel() are all wrappers around a unified C++ routine, and each attempts to replicate its corresponding model. This function is the lower-level wrapper to the C++ routine.

Usage

sentopicmodel(
  x,
  lexicon = NULL,
  L1 = 5,
  L2 = 3,
  L1prior = 1,
  L2prior = 5,
  beta = 0.01,
  L1cycle = 0,
  L2cycle = 0,
  initLDA = 0,
  smooth = 0,
  reversed = TRUE
)

Arguments

x

tokens object containing the texts. Coercion will be attempted if x is not a tokens object.

lexicon

a quanteda dictionary with positive and negative categories

L1

the number of labels in the first document mixture layer

L2

the number of labels in the second document mixture layer

L1prior

the first layer hyperparameter of document mixtures

L2prior

the second layer hyperparameter of document mixtures

beta

the hyperparameter of the vocabulary distribution

L1cycle

integer specifying the cycle size between two updates of the hyperparameter L1prior

L2cycle

integer specifying the cycle size between two updates of the hyperparameter L2prior

initLDA

integer specifying the number of iterations of the LDA burn-in

smooth

integer specifying the number of iterations of the smoothed burn-in
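As a rough illustration of the arguments above, the call below spells out several of the documented defaults explicitly. It is only a sketch: it assumes the sentopics package is installed and reuses the ECB_press_conferences_tokens and LoughranMcDonald objects from the Examples section. The variable name model is arbitrary.

```r
library(sentopics)

# Initialise a model. Every value below is simply the default shown in Usage,
# written out so that each argument is visible.
model <- sentopicmodel(
  ECB_press_conferences_tokens,   # tokens object (x)
  lexicon = LoughranMcDonald,     # dictionary with positive/negative categories
  L1 = 5, L2 = 3,                 # number of labels in each mixture layer
  L1prior = 1, L2prior = 5,       # document mixture hyperparameters
  beta = 0.01                     # vocabulary distribution hyperparameter
)

# The returned object has not been iterated yet.
model$it
```

Leaving the remaining arguments (L1cycle, L2cycle, initLDA, smooth) at 0 keeps the sketch close to the plain call sentopicmodel(x, lexicon = ...).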

Value

An S3 list containing the model parameters and the estimated mixture. This object corresponds to a Gibbs sampler estimator with zero iterations. The MCMC can be iterated using the grow() function.

  • tokens is the tokens object used to create the model

  • vocabulary contains the set of words of the corpus

  • it tracks the number of Gibbs sampling iterations

  • za is the list of topic assignments, aligned to the tokens object with padding removed

  • logLikelihood contains the measured log-likelihood at each iteration, with a breakdown of the likelihood into hierarchical components as an attribute
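The elements listed above can be inspected directly on the returned list. The sketch below assumes the sentopics package is installed, uses the bundled objects from the Examples section, and assumes that the second argument of grow() is the number of iterations to run. The seed value is arbitrary.

```r
library(sentopics)

set.seed(123)  # the sampler is stochastic; a seed keeps runs comparable

model <- sentopicmodel(ECB_press_conferences_tokens, lexicon = LoughranMcDonald)
model$it                  # iteration counter of the freshly created model

# Run a few Gibbs sampling iterations
# (second argument assumed to be the iteration count).
model <- grow(model, 10)

model$it                  # iteration counter after sampling
head(model$vocabulary)    # words of the corpus
model$logLikelihood       # log-likelihood trace, components stored as attribute
```

Ten iterations is far too few for a usable estimate; the small number only keeps the sketch quick to run.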

The topWords() function extracts the most probable words of each topic/sentiment.
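A minimal sketch of that workflow follows. It assumes the sentopics package is installed, that the second argument of grow() is the number of iterations, and that topWords() can be called with only the model as argument. The names model and top are arbitrary.

```r
library(sentopics)

set.seed(123)  # arbitrary seed, for comparable runs

# Create the model, then iterate the sampler a few times before summarising it.
model <- sentopicmodel(ECB_press_conferences_tokens, lexicon = LoughranMcDonald)
model <- grow(model, 10)  # second argument assumed to be the iteration count

# Extract the top words of each topic/sentiment using the default settings.
top <- topWords(model)
print(top)
```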

Author(s)

Olivier Delmarcelle

See Also

Growing a model: grow(), extracting top words: topWords()

Other topic models: JST(), LDA(), rJST()

Examples

library(sentopics)
# Create an LDA model, then a reversed JST model that uses a sentiment lexicon
LDA(ECB_press_conferences_tokens)
rJST(ECB_press_conferences_tokens, lexicon = LoughranMcDonald)

sentopics documentation built on May 31, 2023, 8:26 p.m.