In odelmarcelle/sentopics: Tools for Joint Sentiment and Topic Analysis of Textual Data

knitr::opts_chunk$set(
  echo = TRUE,
  comment = "#",
  collapse = TRUE,
  fig.path = "man/figures/README-",
  fig.width = 8,
  fig.height = 5
  )

sentopics

## Installation

A stable version of sentopics is available on CRAN:

install.packages("sentopics")

The latest development version can be installed from GitHub:

``` {r eval = FALSE}
devtools::install_github("odelmarcelle/sentopics")
```

The development version requires the appropriate tools to compile C++ and Fortran source code.

## Basic usage

Using a sample of press conferences from the European Central Bank, an LDA model is easily created from a list of tokenized texts. See https://quanteda.io for details on `tokens` input objects and pre-processing functions.

``` {r}
library("sentopics")
print(ECB_press_conferences_tokens, 2)
set.seed(123)
lda <- LDA(ECB_press_conferences_tokens, K = 3, alpha = .1)
lda <- fit(lda, 100)
lda
```
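
To run the same steps on your own texts, the tokens object first has to be built with quanteda. The following is a minimal sketch, assuming the quanteda package is installed; the three sentences, the pre-processing choices, and the name `toy_lda` are illustrative assumptions and are not part of the package or its data.

```r
library("quanteda")
library("sentopics")

# Hypothetical raw texts standing in for your own corpus
texts <- c(
  doc1 = "Inflation is expected to remain elevated in the coming months.",
  doc2 = "The Governing Council decided to keep interest rates unchanged.",
  doc3 = "Economic growth slowed while inflation pressures persisted."
)

# Tokenize, lower-case, and drop English stop words
toks <- tokens(texts, remove_punct = TRUE, remove_numbers = TRUE)
toks <- tokens_tolower(toks)
toks <- tokens_remove(toks, stopwords("en"))

# Create and estimate a small LDA model on the resulting tokens
set.seed(123)
toy_lda <- LDA(toks, K = 2, alpha = .1)
toy_lda <- fit(toy_lda, 50)
```

A corpus this small only demonstrates the workflow; the estimated topics are not meaningful.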

There are various ways to extract results from the model: you can either access the estimated mixtures directly from the `lda` object or use helper functions.

# The document-topic distributions
head(lda$theta) 
# The document-topic in a 'long' format & optionally with meta-data
head(melt(lda, include_docvars = FALSE))
# The most probable words per topic
topWords(lda, output = "matrix")
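
To make the shape of these outputs concrete, here is a small base R sketch on a hypothetical matrix laid out like `lda$theta` is assumed to be (documents in rows, topics in columns, each row summing to one). The values and the column names of the long table are made up for illustration; the actual output of `melt()` may use different names and extra columns.

```r
# Hypothetical document-topic matrix, assumed to mimic the layout of lda$theta
theta <- matrix(
  c(0.70, 0.20, 0.10,
    0.05, 0.15, 0.80,
    0.30, 0.60, 0.10),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("doc", 1:3), paste0("topic", 1:3))
)

# Each row is a probability distribution over topics
stopifnot(isTRUE(all.equal(unname(rowSums(theta)), rep(1, 3))))

# Most probable topic for each document
dominant <- colnames(theta)[max.col(theta, ties.method = "first")]
names(dominant) <- rownames(theta)
dominant

# The same information in a 'long' format: one row per document-topic pair
long <- as.data.frame(as.table(theta), responseName = "prob")
names(long)[1:2] <- c("document", "topic")
head(long)
```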

Two visualizations are also implemented: `plot_topWords()` displays the most probable words, and `plot()` summarizes the topic proportions and their top words.

plot(lda)

plot(lda) |> plotly::layout(width = 500, height = 500)

Once date and sentiment metadata have been incorporated (if they are not already present in the tokens input), time series functions make it possible to study the evolution of topic proportions and related sentiment.

sentopics_date(lda)  |> head(2)
sentopics_sentiment(lda) |> head(2)
proportion_topics(lda, period = "month") |> head(2)
plot_sentiment_breakdown(lda, period = "quarter", rolling_window = 3)
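
The idea behind a period-level series can be sketched in base R: attach a date to each document, truncate it to the period, and average the topic proportions within each period. The data frame below is hypothetical, and this is only an illustration of the aggregation step, not the package's implementation, which may weight or normalize documents differently.

```r
# Hypothetical per-document topic proportions with a publication date
docs <- data.frame(
  date   = as.Date(c("2020-01-10", "2020-01-25", "2020-02-05", "2020-02-20")),
  topic1 = c(0.6, 0.4, 0.2, 0.1),
  topic2 = c(0.4, 0.6, 0.8, 0.9)
)

# Truncate each date to the first day of its month
docs$month <- as.Date(format(docs$date, "%Y-%m-01"))

# Average the topic proportions within each month
monthly <- aggregate(cbind(topic1, topic2) ~ month, data = docs, FUN = mean)
monthly
```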

## Advanced usage

The package vignettes give a more extensive introduction to its features. If you installed the development version from GitHub, the vignettes are not built by default and you will have to build them locally, for example by passing `build_vignettes = TRUE` to `devtools::install_github()`.

vignette("Basic_usage", package = "sentopics")
vignette("Topical_time_series", package = "sentopics")

odelmarcelle/sentopics documentation built on Jan. 10, 2025, 2:58 p.m.
