Approximate the sentiment (polarity) of text by grouping variable(s).
polarity(text.var, grouping.var = NULL,
    positive.list = positive.words,
    negative.list = negative.words,
    negation.list = negation.words,
    amplification.list = increase.amplification.words,
    rm.incomplete = FALSE, digits = 3, ...)
text.var |
The text variable. |
grouping.var |
The grouping variables. Default NULL generates one word list for all text. Also takes a single grouping variable or a list of 1 or more grouping variables. |
positive.list |
A character vector of terms indicating positive reaction. |
negative.list |
A character vector of terms indicating negative reaction. |
negation.list |
A character vector of terms reversing the intent of a positive or negative word. |
amplification.list |
A character vector of terms that increase the intensity of a positive or negative word. |
rm.incomplete |
logical. If TRUE text rows ending
with qdap's incomplete sentence end mark ( |
digits |
Integer; number of decimal places to round when printing. |
... |
Other arguments supplied to
|
The equation used by the algorithm to assign a polarity value to each sentence first uses the sentiment dictionary (Hu and Liu, 2004) to tag each word as positive (x_i^{+}), negative (x_i^{-}), neutral (x_i^{0}), negator (x_i^{\neg}), or amplifier (x_i^{\uparrow}). Neutral words hold no value in the equation but do affect the word count (n). Each positive (x_i^{+}) and negative (x_i^{-}) word is then weighted by the amplifiers (x_i^{\uparrow}) directly preceding the positive or negative word. Next, I consider amplification value, adding the assigned value 1/(n-1) to increase the polarity relative to sentence length while ensuring that the polarity scores remain between -1 and 1. This weighted value for each polarized word is then multiplied by -1 raised to the power of the number of negators (x_i^{\neg}) directly preceding the positive or negative word. Last, these values are summed and divided by the word count (n), yielding a polarity score (δ) between -1 and 1.
\delta=\frac{\sum\left(x_i^{0},\quad x_i^{\uparrow} + x_i^{+}\cdot(-1)^{\sum(x_i^{\neg})},\quad x_i^{\uparrow} + x_i^{-}\cdot(-1)^{\sum(x_i^{\neg})}\right)}{n}
Where:
x_i^{\uparrow}=\frac{1}{n-1}
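The scoring rule described above can be sketched in base R. This is an illustrative reading of the prose, not qdap's actual implementation. The four word lists are small hypothetical stand-ins for the package dictionaries, and `sentence_polarity` is a made-up helper name. Two assumptions are made where the description leaves room: "directly preceding" is taken to mean the unbroken run of negators and amplifiers immediately before a polarized word, and each amplifier adds 1/(n-1) in the direction of that word's sign before negation is applied to the whole weighted value.

```r
# Sketch of the scoring rule described above; NOT qdap's implementation.
# The word lists are hypothetical stand-ins for qdap's dictionaries.
positive   <- c("good", "great", "love")
negative   <- c("bad", "awful", "hate")
negators   <- c("not", "never", "no")
amplifiers <- c("very", "really", "extremely")

sentence_polarity <- function(sentence) {
  # Lowercase, drop punctuation and digits, split on whitespace
  cleaned <- gsub("[^a-z' ]", "", tolower(sentence))
  words <- strsplit(trimws(cleaned), "\\s+")[[1]]
  words <- words[nzchar(words)]
  n <- length(words)
  if (n == 0) return(NA_real_)

  total <- 0
  for (i in seq_len(n)) {
    base <- if (words[i] %in% positive) 1 else if (words[i] %in% negative) -1 else 0
    if (base == 0) next  # neutral words and modifiers only count toward n

    # Assumption: count modifiers in the unbroken run just before word i
    n_amp <- 0
    n_neg <- 0
    j <- i - 1
    while (j >= 1 && words[j] %in% c(negators, amplifiers)) {
      if (words[j] %in% amplifiers) n_amp <- n_amp + 1 else n_neg <- n_neg + 1
      j <- j - 1
    }

    # Assumption: each amplifier adds 1/(n - 1) toward the word's own sign
    boost <- if (n_amp > 0) n_amp / (n - 1) else 0
    weighted <- base * (1 + boost)

    # Each preceding negator flips the sign once
    total <- total + weighted * (-1)^n_neg
  }
  total / n
}

sentence_polarity("This is very good.")  # (1 + 1/3) / 4, about 0.333
sentence_polarity("This is not good.")   # -1 / 4 = -0.25
```

Under this reading, an amplifier contributes less in a longer sentence because its 1/(n-1) weight shrinks as n grows, and the final division by n keeps the score within -1 and 1.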
Returns a list of:
all |
A dataframe of scores per row with:
|
group |
A dataframe with the average polarity score by grouping variable. |
digits |
Integer value of the number of digits to display; mostly for internal use. |
The polarity score depends on the polarity dictionary used. This function defaults to the word polarity dictionary used by Hu and Liu (2004); however, this may not be appropriate for the context of children in a classroom. The user may (and is encouraged to) provide or augment the dictionary. For instance, the word "sick" in a high school setting may mean that something is good, whereas "sick" used by a typical adult indicates that something is not right or carries a negative connotation.
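As a minimal sketch of augmenting a dictionary for the "sick" case, the base R set operations `union` and `setdiff` can move a term from the negative list to the positive list. The short vectors below are hypothetical stand-ins for qdap's `positive.words` and `negative.words` so that the snippet runs without the package; `some_text` in the commented call is likewise a placeholder.

```r
# Hypothetical stand-ins for qdap's positive.words and negative.words
pos_default <- c("good", "great", "happy")
neg_default <- c("bad", "sick", "sad")

# Terms that are positive in the target setting (e.g., high school slang)
slang_positive <- c("sick")

# Add the slang terms to the positive list and drop them from the negative list
pos_custom <- union(pos_default, slang_positive)
neg_custom <- setdiff(neg_default, slang_positive)

# With qdap loaded, the same pattern could be passed through the documented
# positive.list and negative.list arguments (some_text is a placeholder):
# polarity(some_text,
#     positive.list = union(positive.words, slang_positive),
#     negative.list = setdiff(negative.words, slang_positive))
```

Removing the term from the negative list matters as much as adding it to the positive list; otherwise the same word would sit in both dictionaries.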
Also note that polarity assumes you've run sentSplit.
Hu, M., & Liu, B. (2004). Mining opinion features in customer reviews. National Conference on Artificial Intelligence.
http://www.slideshare.net/jeffreybreen/r-by-example-mining-twitter-for
https://github.com/trestletech/Sermon-Sentiment-Analysis
# Polarity of DATA$state grouped by person; outer parentheses print the result
(poldat <- with(DATA, polarity(state, person)))
# Group by more than one variable
with(DATA, polarity(state, list(sex, adult)))
# Inspect the components of the returned list
names(poldat)
truncdf(poldat$all, 8)
poldat$group

poldat2 <- with(mraja1spl, polarity(dialogue,
    list(sex, fam.aff, died)))
colsplit2df(poldat2$group)
plot(poldat)

# Label possible outliers in the group-level and row-level scores
poldat3 <- with(rajSPLIT, polarity(dialogue, person))
poldat3[["group"]][, "OL"] <- outlier.labeler(poldat3[["group"]][,
    "ave.polarity"])
poldat3[["all"]][, "OL"] <- outlier.labeler(poldat3[["all"]][,
    "polarity"])
head(poldat3[["group"]], 10)
htruncdf(poldat3[["all"]], 15, 8)
plot(poldat3)
plot(poldat3, nrow=4)