chainsDistances: Distances between topic models (chains)
In sentopics: Tools for Joint Sentiment and Topic Analysis of Textual Data

chainsDistances

R Documentation

Distances between topic models (chains)

Description

Computes the distance between different estimates of a topic model. Since the estimation of a topic model is random, the results may differ substantially each time the process is repeated. This function computes the distance between distinct realizations of the estimation process. Estimates are referred to as chains.

Usage

chainsDistances(
  x,
  method = c("euclidean", "hellinger", "cosine", "minMax", "naiveEuclidean",
    "invariantEuclidean"),
  ...
)

Arguments

`x`	a valid `multiChains` object, obtained by estimating a topic model using `fit()` with the argument `nChains` greater than `1`.
`method`	the method used to measure the distance between chains.
`...`	further arguments passed to internal distance functions.

Details

The method argument determines how distances are computed.

euclidean finds the pairing of topics that minimizes the total Euclidean distance and returns that distance.
hellinger does the same but based on the Hellinger distance.
cosine does the same but based on the Cosine distance.
minMax computes the maximum distance among the best pairs of topics. Inspired by the minimum-matching distance from Tang et al. (2014).
naiveEuclidean computes the Euclidean distance without searching for the best pairs of topics.
invariantEuclidean computes the best pairs of topics for all allowed permutations of topic indices. For JST and reversed-JST models, the two-level hierarchy of document-sentiment-topic means that some permutations of indices represent a drastically different outcome. This setting restricts the set of permutations to the ones that do not change the interpretation of the model. Equivalent to euclidean for LDA models.
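
The idea behind the matching-based methods can be illustrated with a small base R sketch. It is not the package's internal implementation: the toy matrices, the helper names (`permutations`, `cost_matrix`, `matched_distance`), the brute-force search over permutations, the scaling of the Hellinger distance, and the reading of naiveEuclidean as the identity pairing are all assumptions made for illustration.

```r
# Toy topic-word matrices: one row per topic, one column per word.
# chain_b contains the same topics as chain_a, listed in a different order.
chain_a <- rbind(c(0.7, 0.2, 0.1),
                 c(0.1, 0.8, 0.1),
                 c(0.2, 0.2, 0.6))
chain_b <- chain_a[c(3, 1, 2), ]

# Distances between two topics (probability vectors).
euclid      <- function(p, q) sqrt(sum((p - q)^2))
hellinger   <- function(p, q) sqrt(sum((sqrt(p) - sqrt(q))^2)) / sqrt(2)
cosine_dist <- function(p, q) 1 - sum(p * q) / sqrt(sum(p^2) * sum(q^2))

# All permutations of a vector (brute force, only sensible for few topics).
permutations <- function(v) {
  if (length(v) <= 1) return(list(v))
  out <- list()
  for (i in seq_along(v)) {
    for (p in permutations(v[-i])) out[[length(out) + 1]] <- c(v[i], p)
  }
  out
}

# cost[i, j] = distance between topic i of chain a and topic j of chain b.
cost_matrix <- function(a, b, d) {
  outer(seq_len(nrow(a)), seq_len(nrow(b)),
        Vectorize(function(i, j) d(a[i, ], b[j, ])))
}

# Try every pairing of topics and keep the smallest aggregated distance.
matched_distance <- function(a, b, d = euclid, aggregate = sum) {
  cost <- cost_matrix(a, b, d)
  k <- nrow(cost)
  totals <- vapply(permutations(seq_len(k)),
                   function(p) aggregate(cost[cbind(seq_len(k), p)]),
                   numeric(1))
  min(totals)
}

# Without matching (topic i of a is compared with topic i of b),
# the reordered chain looks different from the original.
naive_total <- sum(diag(cost_matrix(chain_a, chain_b, euclid)))

# With matching, the reordering is undone and the distance vanishes.
best_euclid    <- matched_distance(chain_a, chain_b, euclid)
best_hellinger <- matched_distance(chain_a, chain_b, hellinger)
best_cosine    <- matched_distance(chain_a, chain_b, cosine_dist)

# A minMax-style variant: the smallest achievable worst-pair distance.
best_worst_pair <- matched_distance(chain_a, chain_b, euclid, aggregate = max)
```

In this toy case the naive total is strictly positive, while every matched distance is zero up to floating-point error, because the two chains differ only in the order of their topics.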

Value

A matrix of distances between the elements of x.
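
As a rough sketch of how such a matrix might be used, the base R code below finds the most central chain, flags a possible outlier, and groups similar chains. The values are made up for illustration, and the sketch assumes a square, symmetric matrix with a zero diagonal and one row per chain; the row names are hypothetical.

```r
# Hypothetical distance matrix between four chains (illustrative values only).
chain_names <- paste0("chain", 1:4)
d <- matrix(c(0.0, 1.2, 1.1, 3.0,
              1.2, 0.0, 0.9, 2.8,
              1.1, 0.9, 0.0, 3.1,
              3.0, 2.8, 3.1, 0.0),
            nrow = 4, byrow = TRUE,
            dimnames = list(chain_names, chain_names))

# Average distance from each chain to the other chains.
mean_dist <- rowSums(d) / (nrow(d) - 1)

# The chain closest to all others, and the one furthest from them.
central <- names(which.min(mean_dist))
outlier <- names(which.max(mean_dist))

# Group chains with average-linkage hierarchical clustering.
clusters <- cutree(hclust(as.dist(d), method = "average"), k = 2)
```

A chain that sits far from all others may signal an estimation run that settled on a different solution, which is worth inspecting before interpreting any single chain.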

Author(s)

Olivier Delmarcelle

References

Tang, J., Meng, Z., Nguyen, X., Mei, Q., and Zhang, M. (2014). Understanding the Limiting Factors of Topic Modeling via Posterior Contraction Analysis. In Proceedings of the 31st International Conference on Machine Learning, 32, 190–198.

Examples

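library(sentopics)
# Estimate an LDA model with 5 independent chains, 10 iterations each
model <- LDA(ECB_press_conferences_tokens)
model <- fit(model, 10, nChains = 5)
# Pairwise distances between the 5 chains (first listed method: "euclidean")
chainsDistances(model)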

sentopics documentation built on Sept. 20, 2024, 5:06 p.m.
