rJST: Create a Reversed Joint Sentiment/Topic model
In sentopics: Tools for Joint Sentiment and Topic Analysis of Textual Data


rJST	R Documentation

Create a Reversed Joint Sentiment/Topic model

Description

This function initializes a Reversed Joint Sentiment/Topic (rJST) model.

Usage

rJST(x, ...)

## Default S3 method:
rJST(
  x,
  lexicon = NULL,
  K = 5,
  S = 3,
  alpha = 1,
  gamma = 5,
  beta = 0.01,
  alphaCycle = 0,
  gammaCycle = 0,
  ...
)

## S3 method for class 'LDA'
rJST(x, lexicon = NULL, S = 3, gamma = 5, ...)

Arguments

`x`	a tokens object containing the texts. A coercion will be attempted if `x` is not a tokens object.
`...`	not used
`lexicon`	a `quanteda` dictionary with positive and negative categories
`K`	the number of topics
`S`	the number of sentiments
`alpha`	the hyperparameter of the topic-document distribution
`gamma`	the hyperparameter of the sentiment-document distribution
`beta`	the hyperparameter of the vocabulary distribution
`alphaCycle`	integer specifying the cycle size between two updates of the hyperparameter alpha
`gammaCycle`	integer specifying the cycle size between two updates of the hyperparameter gamma
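
As a minimal sketch of how these arguments fit together, the call below sets the main arguments explicitly instead of relying on the defaults. The values chosen for `K` and the seed are arbitrary illustrations, not recommendations, and the `requireNamespace()` guard is only there so the snippet is skipped when sentopics is not installed.

```r
# Sketch only: the body runs when the sentopics package is available.
if (requireNamespace("sentopics", quietly = TRUE)) {
  library(sentopics)

  set.seed(123)  # arbitrary seed, assumed here to make the random initialization repeatable

  model <- rJST(
    ECB_press_conferences_tokens,   # tokens object shipped with sentopics
    lexicon = LoughranMcDonald,     # dictionary with positive and negative categories
    K = 4,                          # number of topics (illustrative value)
    S = 3,                          # number of sentiments (the default)
    alpha = 1,                      # topic-document hyperparameter (the default)
    gamma = 5,                      # sentiment-document hyperparameter (the default)
    beta = 0.01                     # vocabulary hyperparameter (the default)
  )

  # The model is only initialized at this point; no sampling has been done.
  print(model)
}
```

The object returned here still has to be passed to fit() before its estimates are meaningful.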

Details

The rJST.LDA method enables the transition from a previously estimated LDA model to a sentiment-aware rJST model. The function retains the previously estimated topics and randomly assigns a sentiment to every word of the corpus. The new model retains the iteration count of the initial LDA model.

Value

An S3 list containing the model parameters and the estimated mixture. This object corresponds to a Gibbs sampler estimator with zero iterations. The MCMC can be iterated using the fit() function. The list includes the following components:

tokens is the tokens object used to create the model
vocabulary contains the set of words of the corpus
it tracks the number of Gibbs sampling iterations
za is the list of topic assignments, aligned to the tokens object with padding removed
logLikelihood returns the measured log-likelihood at each iteration, with a breakdown of the likelihood into hierarchical components as an attribute

The topWords() function extracts the most probable words of each topic/sentiment.
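
The components listed above can be inspected directly on the returned object. The following is a hedged sketch that assumes the components are reachable with `$`, as `theta` is in the examples of this page. The iteration count of 10 is an arbitrary small value chosen to keep the run short, and the `requireNamespace()` guard skips the snippet when sentopics is not installed.

```r
# Sketch only: the body runs when the sentopics package is available.
if (requireNamespace("sentopics", quietly = TRUE)) {
  library(sentopics)

  rjst <- rJST(ECB_press_conferences_tokens, lexicon = LoughranMcDonald)

  # Iteration counter of the freshly created model (fit() has not been called yet)
  print(rjst$it)

  # First entries of the corpus vocabulary stored in the model
  print(head(rjst$vocabulary))

  # Run a few Gibbs sampling iterations, then check the counter again
  rjst <- fit(rjst, 10)
  print(rjst$it)

  # Most probable words of each topic/sentiment
  print(topWords(rjst))
}
```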

Author(s)

Olivier Delmarcelle

References

Lin, C. and He, Y. (2009). Joint sentiment/topic model for sentiment analysis. In Proceedings of the 18th ACM conference on Information and knowledge management, 375–384.

Lin, C., He, Y., Everson, R. and Ruger, S. (2012). Weakly Supervised Joint Sentiment-Topic Detection from Text. IEEE Transactions on Knowledge and Data Engineering, 24(6), 1134–1145. doi:10.1109/TKDE.2011.48.

Examples

# load the package providing rJST(), LDA(), fit() and the example data
library(sentopics)

# simple rJST model
rJST(ECB_press_conferences_tokens)

# estimating a rJST model including lexicon
rjst <- rJST(ECB_press_conferences_tokens, lexicon = LoughranMcDonald)
rjst <- fit(rjst, 100)

# from an LDA model:
lda <- LDA(ECB_press_conferences_tokens)
lda <- fit(lda, 100)

# creating a rJST model out of it
rjst <- rJST(lda, lexicon = LoughranMcDonald)
# topic proportions remain identical
identical(lda$theta, rjst$theta)
# model should be iterated to estimate sentiment proportions
rjst <- fit(rjst, 100)

sentopics documentation built on July 29, 2026, 5:07 p.m.