sentiments: Sentiment lexicons from four sources


Description

Four lexicons for sentiment analysis are combined here in a tidy data frame. The lexicons are the NRC Emotion Lexicon from Saif Mohammad and Peter Turney, the sentiment lexicon from Bing Liu and collaborators, the AFINN lexicon of Finn Arup Nielsen, and the lexicon of Tim Loughran and Bill McDonald. Words with non-ASCII characters were removed from the lexicons.

Usage

sentiments

Format

A data frame with 27,314 rows and 4 variables:

word

An English word

sentiment

A sentiment whose possible values depend on the lexicon. The "AFINN" lexicon has no sentiment category (all values are NA), and each of the others can be "positive" or "negative". The NRC lexicon can also be "anger", "anticipation", "disgust", "fear", "joy", "sadness", "surprise", or "trust", and the Loughran lexicon can also be "litigious", "uncertainty", "constraining", or "superfluous".

lexicon

The source of the sentiment for the word. One of "nrc", "bing", "loughran", or "AFINN".

score

A numerical score for the sentiment. This value is NA for the Bing, NRC, and Loughran lexicons, and runs between -5 and 5 for the AFINN lexicon.
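
The column layout above can be sketched with a small stand-in data frame. This is a minimal illustration only: `toy_sentiments` is a hypothetical object, and its rows are made up to show the NA pattern rather than copied from any of the four lexicons.

```r
# Toy stand-in for the `sentiments` data frame: same four columns,
# but the rows are illustrative and NOT taken from the real lexicons.
toy_sentiments <- data.frame(
  word      = c("happy", "happy", "happy", "gloomy", "gloomy"),
  sentiment = c("joy", "positive", NA, "negative", NA),
  lexicon   = c("nrc", "bing", "AFINN", "bing", "AFINN"),
  score     = c(NA, NA, 3, NA, -2),
  stringsAsFactors = FALSE
)

# Categorical lexicon: keep the rows for one lexicon and read `sentiment`.
bing_rows <- toy_sentiments[toy_sentiments$lexicon == "bing",
                            c("word", "sentiment")]

# Numeric lexicon: keep the AFINN rows and read `score` instead.
afinn_rows <- toy_sentiments[toy_sentiments$lexicon == "AFINN",
                             c("word", "score")]

# The NA pattern described above: AFINN rows carry no category,
# and rows from the other lexicons carry no score.
afinn_has_no_category <- all(is.na(
  toy_sentiments$sentiment[toy_sentiments$lexicon == "AFINN"]))
others_have_no_score <- all(is.na(
  toy_sentiments$score[toy_sentiments$lexicon != "AFINN"]))
```

Subsetting on `lexicon` first avoids mixing categorical and numeric rows in one analysis.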

Details

Note that the Loughran lexicon is best suited for financial text (e.g. where "share" is not necessarily positive and "liability" not necessarily negative).
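
As a rough sketch of why the choice of lexicon matters, the example below joins a few tokens against a Loughran subset and against the remaining lexicons. Everything here is assumed for illustration: `toy_sentiments` and `tokens` are hypothetical, and the word and sentiment pairings are placeholders, not entries verified against the real lexicons.

```r
# Hypothetical lexicon rows, for illustration only.
toy_sentiments <- data.frame(
  word      = c("share", "liability", "litigation"),
  sentiment = c("positive", "negative", "litigious"),
  lexicon   = c("nrc", "bing", "loughran"),
  stringsAsFactors = FALSE
)

# Hypothetical tokens, as if taken from a financial filing.
tokens <- data.frame(
  word = c("share", "liability", "litigation"),
  stringsAsFactors = FALSE
)

# Split the lexicon rows into the Loughran subset and everything else.
is_loughran <- toy_sentiments$lexicon == "loughran"
loughran <- toy_sentiments[is_loughran, c("word", "sentiment")]
general  <- toy_sentiments[!is_loughran, c("word", "sentiment")]

# merge() performs an inner join by default, so only tokens that
# appear in the chosen lexicon subset are kept.
loughran_hits <- merge(tokens, loughran, by = "word")
general_hits  <- merge(tokens, general, by = "word")
```

In this toy setup the general-purpose rows tag "share" and "liability", while the Loughran subset leaves both untagged and only matches "litigation".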

Source


tidytext documentation built on Oct. 17, 2018, 9:04 a.m.