bigram_adjustment: Bigram negation adjustment
In jiwanheo/senTWEETment: Analyze Twitter Sentiment

Adjusts sentiment scores so that negated phrases such as "I am not happy" receive a negative score. Scored one word at a time, such a phrase would otherwise get a positive score.

bigram_adjustment(lexicons, tweets_by_id, negation_words, stop_words)

`lexicons`	Lexicons to use. A named list of tibbles.
`tweets_by_id`	Texts of tweets, processed in `produce_analysis_df`.
`negation_words`	Negation words to use from TweetAnalysis R6 class. A character vector.
`stop_words`	Stop words to use from TweetAnalysis R6 class. A tibble.

Single-word tokenization normally produces one row of "word to sentiment value" per word. This function instead tokenizes the texts into 2-word tokens (bigrams). Any bigram whose first word is a negation word (per `negation_words`) produces two rows: one with the full 2-word token, and one with the original word alone (the word that follows the negation word). The sentiment value of both rows is the sentiment value of the original word multiplied by -1. Both rows are then appended to the 1-word-tokenized tibble and summed at the word/tweet level, so the single-word row cancels out the original word's sentiment and the bigram row adds the negated sentiment. Because the function looks specifically for negation words, the stop words exclude negation words.
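
The steps described above can be illustrated with a minimal sketch. It is not the package's implementation: it uses base R data frames instead of tibbles so that it runs without extra packages, it works with a single lexicon rather than a named list, and it skips stop-word removal. The toy lexicon, the tweets and every variable name are hypothetical.

```r
# Hypothetical toy inputs (assumptions for illustration only).
lexicon <- data.frame(word = c("happy", "great", "sad"),
                      value = c(3, 3, -2),
                      stringsAsFactors = FALSE)
tweets <- data.frame(id = c(1, 2),
                     text = c("i am not happy", "great day"),
                     stringsAsFactors = FALSE)
negation_words <- c("not", "no", "never")

# Split each tweet into lowercase words.
tokens <- strsplit(tolower(tweets$text), "\\s+")

# 1-word tokenization: one row per word, scored against the lexicon.
unigrams <- do.call(rbind, lapply(seq_along(tokens), function(i) {
  data.frame(id = tweets$id[i], word = tokens[[i]], stringsAsFactors = FALSE)
}))
unigram_scores <- merge(unigrams, lexicon, by = "word")

# 2-word tokenization: adjacent word pairs within each tweet.
bigrams <- do.call(rbind, lapply(seq_along(tokens), function(i) {
  w <- tokens[[i]]
  if (length(w) < 2) return(NULL)
  data.frame(id = tweets$id[i], word1 = head(w, -1), word2 = tail(w, -1),
             stringsAsFactors = FALSE)
}))

# Keep bigrams that start with a negation word and end with a scored word.
negated <- merge(bigrams[bigrams$word1 %in% negation_words, ], lexicon,
                 by.x = "word2", by.y = "word")

# Two rows per negated bigram, both carrying the score multiplied by -1.
adjustment <- rbind(
  data.frame(id = negated$id, word = paste(negated$word1, negated$word2),
             value = -negated$value, stringsAsFactors = FALSE),
  data.frame(id = negated$id, word = negated$word2,
             value = -negated$value, stringsAsFactors = FALSE)
)

# Append to the 1-word rows and sum at the word/tweet level.
combined <- rbind(unigram_scores[, c("id", "word", "value")], adjustment)
result <- aggregate(value ~ id + word, data = combined, FUN = sum)
print(result)
```

In this toy data, the "happy" rows for tweet 1 sum to 0 and the "not happy" row carries -3, so the tweet as a whole scores negative, while "great" in tweet 2 is left unchanged.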

jiwanheo/senTWEETment documentation built on Jan. 20, 2022, 3:20 a.m.