``` {r}
knitr::opts_chunk$set(
  echo = TRUE,
  comment = "#",
  collapse = TRUE,
  fig.path = "man/figures/README-",
  fig.width = 8,
  fig.height = 5
)
```
A stable version of `sentopics` is available on CRAN:

``` {r eval = FALSE}
install.packages("sentopics")
```
The latest development version can be installed from GitHub:
``` {r eval = FALSE}
devtools::install_github("odelmarcelle/sentopics")
```

The development version requires the appropriate tools to compile C++ and Fortran source code.

## Basic usage

Using a sample of press conferences from the European Central Bank, an LDA model is easily created from a list of tokenized texts. See https://quanteda.io for details on `tokens` input objects and pre-processing functions.

``` {r}
library("sentopics")
print(ECB_press_conferences_tokens, 2)
set.seed(123)
lda <- LDA(ECB_press_conferences_tokens, K = 3, alpha = .1)
lda <- fit(lda, 100)
lda
```
There are various ways to extract results from the model: you can either access the estimated mixtures directly from the `lda` object or use helper functions.
``` {r}
# The document-topic distributions
head(lda$theta)
# The document-topic distributions in a 'long' format, optionally with metadata
head(melt(lda, include_docvars = FALSE))
# The most probable words per topic
topWords(lda, output = "matrix")
```
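To make the difference between the wide and long layouts concrete, here is a small base-R sketch that reshapes a toy document-topic matrix. The matrix values and the column names (`document`, `topic`, `prob`) are made up for illustration and are not necessarily what `melt()` returns.

```r
# Toy document-topic matrix (hypothetical values): each row sums to 1
theta <- matrix(
  c(0.7, 0.2, 0.1,
    0.1, 0.3, 0.6),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("doc1", "doc2"), c("topic1", "topic2", "topic3"))
)

# Wide to long: one row per (document, topic) pair
theta_long <- as.data.frame(as.table(theta), responseName = "prob",
                            stringsAsFactors = FALSE)
# Illustrative column names, chosen for this sketch only
names(theta_long)[1:2] <- c("document", "topic")
theta_long
```

The long layout is usually the more convenient one for grouping, joining with document metadata, or plotting.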
Two visualizations are also implemented: `plot_topWords()` displays the most probable words, and `plot()` summarizes the topic proportions and their top words.
``` {r}
plot(lda)
plot(lda) |> plotly::layout(width = 500, height = 500)
```
After incorporating date and sentiment metadata (if they are not already present in the `tokens` input), time series functions allow you to study the evolution of topic proportions and related sentiment.
``` {r}
sentopics_date(lda) |> head(2)
sentopics_sentiment(lda) |> head(2)
proportion_topics(lda, period = "month") |> head(2)
plot_sentiment_breakdown(lda, period = "quarter", rolling_window = 3)
```
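The idea behind a per-period topic series can be sketched in base R: attach a date to each document, then aggregate the document-topic proportions by period. The data below are invented, and the plain monthly average is an assumption made for illustration; `proportion_topics()` may compute its series differently.

```r
# Hypothetical per-document topic proportions with a publication date
docs <- data.frame(
  date   = as.Date(c("2020-01-10", "2020-01-25", "2020-02-05")),
  topic1 = c(0.6, 0.2, 0.5),
  topic2 = c(0.4, 0.8, 0.5)
)

# Derive the aggregation period (year-month) from each date
docs$month <- format(docs$date, "%Y-%m")

# Average the proportions of all documents falling in the same month
monthly <- aggregate(cbind(topic1, topic2) ~ month, data = docs, FUN = mean)
monthly
```

Switching the format string (or the grouping variable) changes the period, which is the role played by the `period` argument in the functions above.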
Refer to the package vignettes for a more extensive introduction to its features. If you installed the development version from GitHub, you may have to build the vignettes locally.
``` {r eval = FALSE}
vignette("Basic_usage", package = "sentopics")
vignette("Topical_time_series", package = "sentopics")
```