sentiment_breakdown: Break down the sentiment into topical components

View source: R/timeSeries.R

sentiment_breakdown    R Documentation

Break down the sentiment into topical components

Description

Break down the sentiment series obtained with sentiment_series() into topical components. Sentiment is broken down at the document level using estimated topic proportions, then processed to create a time series and its components.

Usage

sentiment_breakdown(
  x,
  period = c("year", "quarter", "month", "day", "identity"),
  rolling_window = 1,
  scale = TRUE,
  scaling_period = c("1900-01-01", "2099-12-31"),
  plot = c(FALSE, TRUE, "silent"),
  as.xts = TRUE,
  ...
)

plot_sentiment_breakdown(
  x,
  period = c("year", "quarter", "month", "day"),
  rolling_window = 1,
  scale = TRUE,
  scaling_period = c("1900-01-01", "2099-12-31"),
  ...
)

Arguments

x

an LDA() or rJST() model populated with internal dates and/or internal sentiment.

period

the sampling period within which the sentiment of documents will be averaged. period = "identity" is a special case that returns document-level variables before any aggregation happens. This is useful for quickly computing topical sentiment at the document level.

rolling_window

if greater than 1, determines the rolling window used to compute a moving average of sentiment. The rolling window is expressed in units of period and relies on actual dates (i.e., it is not affected by unequally spaced data points).

scale

if TRUE, the resulting time series will be scaled to a mean of zero and a standard deviation of 1. This argument also has the side effect of attaching scaled sentiment values as docvars to the input object with the _scaled suffix.

scaling_period

the date range over which the scaling should be applied. This is particularly useful for normalizing only the beginning of the time series.

plot

if TRUE, prints a plot of the time series and attaches it as an attribute to the returned object. If 'silent', does not print the plot but still attaches it as an attribute.

as.xts

if TRUE, returns an xts::xts object. Otherwise, returns a data.frame.

...

other arguments passed on to zoo::rollapply() or mean() and sd().
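
To make the interplay of scale and scaling_period more concrete, here is a minimal base R sketch. It rests on an assumption that is not confirmed by this page: that the mean and standard deviation are estimated on the dates inside scaling_period and then applied to the whole series. The dates and sentiment values are made up for illustration and are not produced by sentopics.

```r
# Hypothetical monthly sentiment series (values invented for illustration).
dates <- seq(as.Date("2020-01-01"), by = "month", length.out = 6)
sentiment <- c(0.10, 0.30, 0.20, 0.60, 0.50, 0.70)

# Scaling period covering only the first three months.
scaling_period <- as.Date(c("2020-01-01", "2020-03-31"))

# ASSUMPTION: mean and sd are estimated inside the scaling period only,
# then applied to every observation of the series.
in_window <- dates >= scaling_period[1] & dates <= scaling_period[2]
mu <- mean(sentiment[in_window])
sigma <- sd(sentiment[in_window])
scaled <- (sentiment - mu) / sigma

# Inside the window the scaled values have mean 0 and sd 1;
# later observations are expressed relative to that early baseline.
data.frame(date = dates, sentiment = sentiment, scaled = scaled)
```

Under this reading, values outside the window are not forced to mean zero; they are measured in standard deviations of the reference period.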

Details

The sentiment is broken down at the document level assuming the following composition:

s = \sum^K_{i=1} s_i \times \theta_i

where s_i is the sentiment of topic i and \theta_i is the proportion of topic i in a given document. For an LDA model, the sentiment of each topic is considered equal to the document sentiment (i.e., s_i = s \forall i \in \{1, \dots, K\}). The topical sentiment attention, defined by s*_i = s_i \times \theta_i, represents the effective sentiment conveyed by a topic in a document. The topical sentiment attention values of all documents in a period are averaged to compute the breakdown of the sentiment time series.
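
The composition above can be worked through by hand in base R. The sketch below covers the LDA case only (s_i = s for every topic) and uses made-up document sentiments, topic proportions, and periods; none of the object names come from sentopics.

```r
# Toy inputs (invented for illustration): 4 documents, 2 topics.
doc_sentiment <- c(0.5, -0.2, 0.1, 0.4)   # s, one value per document
theta <- matrix(
  c(0.7, 0.3,
    0.4, 0.6,
    0.5, 0.5,
    0.2, 0.8),
  ncol = 2, byrow = TRUE,
  dimnames = list(NULL, c("topic1", "topic2"))
)                                          # rows sum to 1
period <- c("2020", "2020", "2021", "2021")

# LDA case: s_i = s, so the topical sentiment attention is s * theta_i.
# The vector is recycled down the columns, i.e. row r is scaled by s[r].
attention <- doc_sentiment * theta

# The components add back up to the document sentiment.
stopifnot(isTRUE(all.equal(rowSums(attention), doc_sentiment)))

# Average the attention of all documents within each period.
breakdown <- rowsum(attention, period) / as.vector(table(period))
breakdown
```

Because each row of theta sums to one, the topic columns of breakdown add up to the average document sentiment of the period, which is what makes the result a breakdown of the aggregate series.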

Value

A time series of sentiment, stored as an xts::xts object or as a data.frame.

See Also

sentopics_sentiment(), sentopics_date()

Other series functions: proportion_topics(), sentiment_series(), sentiment_topics()

Examples

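# create an LDA model from the tokens, then run 100 estimation iterations
lda <- LDA(ECB_press_conferences_tokens)
lda <- grow(lda, 100)
# break the sentiment series down into its topical components
sentiment_breakdown(lda)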

# plot shortcut
plot_sentiment_breakdown(lda)

# also available for rJST models (with topic-level sentiment)
rjst <- rJST(ECB_press_conferences_tokens, lexicon = LoughranMcDonald)
rjst <- grow(rjst, 100)
sentopics_sentiment(rjst, override = TRUE)
plot_sentiment_breakdown(rjst)

sentopics documentation built on May 31, 2023, 8:26 p.m.