polarity: Polarity Score (Sentiment Analysis)


Description

Approximate the sentiment (polarity) of text by grouping variable(s).

Usage

  polarity(text.var, grouping.var = NULL,
    positive.list = positive.words,
    negative.list = negative.words,
    negation.list = negation.words,
    amplification.list = increase.amplification.words,
    rm.incomplete = FALSE, digits = 3, ...)

Arguments

text.var

The text variable.

grouping.var

The grouping variables. The default NULL scores all text as a single group. Also takes a single grouping variable or a list of one or more grouping variables.

positive.list

A character vector of terms indicating positive reaction.

negative.list

A character vector of terms indicating negative reaction.

negation.list

A character vector of terms reversing the intent of a positive or negative word.

amplification.list

A character vector of terms that increase the intensity of a positive or negative word.

rm.incomplete

Logical. If TRUE, text rows ending with qdap's incomplete sentence end mark (|) will be removed from the analysis.

digits

Integer; number of decimal places to round when printing.

...

Other arguments supplied to end_inc.

Details

The equation used by the algorithm to assign a polarity value to each sentence first uses the sentiment dictionary (Hu and Liu, 2004) to tag each word as positive (x_i^{+}), negative (x_i^{-}), neutral (x_i^{0}), negator (x_i\neg), or amplifier (x_i^{\uparrow}). Neutral words hold no value in the equation but do affect word count (n). Each positive (x_i^{+}) and negative (x_i^{-}) word is then weighted by the amplifiers (x_i^{\uparrow}) directly preceding it. Next, I consider the amplification value, adding the assigned value 1/(n-1) to increase the polarity relative to sentence length while ensuring that the polarity scores remain between -1 and 1. This weighted value for each polarized word is then multiplied by -1 raised to the power of the number of negators (x_i\neg) directly preceding the positive or negative word. Last, these values are summed and divided by the word count (n), yielding a polarity score (δ) between -1 and 1.

\delta = \frac{\sum\left(x_i^{0},\quad x_i^{\uparrow} + x_i^{+}\cdot(-1)^{\sum(x_i\neg)},\quad x_i^{\uparrow} + x_i^{-}\cdot(-1)^{\sum(x_i\neg)}\right)}{n}

Where:

x_i^{\uparrow}=\frac{1}{n-1}
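
The scoring steps described above can be sketched in base R. This is a minimal illustration, not the package's implementation: sketch_polarity and its arguments are hypothetical names, the tiny word lists are made up, and the sketch assumes that "directly preceding" means the contiguous run of negators and amplifiers immediately before a polarized word. It follows the printed formula term by term, so the amplifier value 1/(n-1) is simply added and only the polarized word is sign-flipped by negation.

```r
# Hypothetical helper, not part of qdap: scores a single sentence.
sketch_polarity <- function(sentence, positive, negative, negators, amplifiers) {
  # Lowercase, drop punctuation, split on whitespace
  words <- strsplit(tolower(gsub("[^[:alpha:]' ]", "", sentence)), "\\s+")[[1]]
  words <- words[nzchar(words)]
  n <- length(words)
  if (n == 0) return(NA_real_)

  total <- 0
  for (i in seq_len(n)) {
    base <- if (words[i] %in% positive) 1 else if (words[i] %in% negative) -1 else 0
    if (base == 0) next  # neutral words contribute only to the word count n

    # Assumption: walk back over the contiguous run of negators/amplifiers
    j <- i - 1
    n_neg <- 0
    amplified <- FALSE
    while (j >= 1 && words[j] %in% c(negators, amplifiers)) {
      if (words[j] %in% negators) n_neg <- n_neg + 1 else amplified <- TRUE
      j <- j - 1
    }

    amp <- if (amplified && n > 1) 1 / (n - 1) else 0
    total <- total + amp + base * (-1)^n_neg
  }
  total / n
}

# Made-up word lists for illustration only
pos  <- c("good", "happy")
neg  <- c("bad", "sad")
negs <- c("not", "never")
amps <- c("very", "really")

sketch_polarity("This is not good.", pos, neg, negs, amps)   # -0.25
sketch_polarity("This is very good.", pos, neg, negs, amps)
```

In the first call the single positive word is flipped by one negator and divided by the four-word count; in the second, the amplifier adds 1/(4-1) before the division.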

Value

Returns a list of:

all

A dataframe of scores per row with:

  • group.var - the grouping variable

  • text.var - the text variable

  • wc - word count

  • polarity - sentence polarity score

  • raw - raw polarity score (considering only positive and negative words)

  • negation.adj.raw - raw adjusted for negation words

  • amplification.adj.raw - raw adjusted for amplification words

  • pos.words - words considered positive

  • neg.words - words considered negative

group

A dataframe with the average polarity score by grouping variable.

digits

Integer value of the number of digits to display; mostly for internal use
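
To illustrate how the group element relates to the all element, the sketch below builds a small stand-in for the per-row data frame and averages it by group with base R. The scores are made up, and treating the group score as an unweighted mean of sentence scores is an assumption; the ave.polarity column name is taken from the Examples section.

```r
# Made-up per-sentence scores standing in for the `all` data frame
all_scores <- data.frame(
  group.var = c("sam", "sam", "greg", "greg"),
  wc        = c(4, 3, 5, 2),
  polarity  = c(0.25, -0.5, 0, 0.5),
  stringsAsFactors = FALSE
)

# Assumption: the group score is the plain mean of the sentence scores
group_scores <- aggregate(polarity ~ group.var, data = all_scores, FUN = mean)
names(group_scores)[2] <- "ave.polarity"
group_scores
```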

Note

The polarity score depends on the polarity dictionary used. This function defaults to the word polarity dictionary of Hu and Liu (2004); however, this may not be appropriate for some contexts, such as children in a classroom. The user is encouraged to provide or augment the dictionary. For instance, the word "sick" in a high school setting may mean that something is good, whereas "sick" used by a typical adult indicates that something is not right or carries a negative connotation.
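
Adjusting the dictionary for the "sick" case can be sketched with base R set operations. The short vectors below are stand-ins for the much larger default lists, and the variable names are hypothetical; the commented call shows how the adjusted lists would be passed through the positive.list and negative.list arguments documented above.

```r
# Small stand-ins for the default dictionaries (the real ones are far larger)
default_positive <- c("good", "great", "happy")
default_negative <- c("bad", "sick", "awful")

# Move "sick" from the negative list to the positive list
slang_positive <- union(default_positive, "sick")
slang_negative <- setdiff(default_negative, "sick")

# With the package loaded, the same idea would use the documented arguments:
# polarity(text.var, grouping.var,
#     positive.list = c(positive.words, "sick"),
#     negative.list = setdiff(negative.words, "sick"))
```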

Also note that polarity assumes you've run sentSplit.

References

Hu, M., & Liu, B. (2004). Mining opinion features in customer reviews. National Conference on Artificial Intelligence.

http://www.slideshare.net/jeffreybreen/r-by-example-mining-twitter-for

See Also

https://github.com/trestletech/Sermon-Sentiment-Analysis

Examples

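# Polarity of each statement, grouped by person (printed on assignment)
(poldat <- with(DATA, polarity(state, person)))
# Group by more than one variable by passing a list
with(DATA, polarity(state, list(sex, adult)))
# Inspect the returned list and its per-row and per-group data frames
names(poldat)
truncdf(poldat$all, 8)
poldat$group
poldat2 <- with(mraja1spl, polarity(dialogue,
    list(sex, fam.aff, died)))
# Split the combined grouping column back into separate columns
colsplit2df(poldat2$group)
plot(poldat)

# Label outliers in the group-level and sentence-level scores
poldat3 <- with(rajSPLIT, polarity(dialogue, person))
poldat3[["group"]][, "OL"] <- outlier.labeler(poldat3[["group"]][,
    "ave.polarity"])
poldat3[["all"]][, "OL"] <- outlier.labeler(poldat3[["all"]][,
    "polarity"])
head(poldat3[["group"]], 10)
htruncdf(poldat3[["all"]], 15, 8)
plot(poldat3)
plot(poldat3, nrow=4)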

trinker/qdap2 documentation built on May 31, 2019, 9:47 p.m.