validate_sentiment: Validate Sentiment Score Sign Against Known Results


View source: R/validate_sentiment.R

Description

Provides multiclass macroaveraged/microaveraged precision, recall, accuracy, and F-score for the sign of the predicted sentiment against known sentiment scores. Sentiment analysis generally predicts three classes: positive (> 0), negative (< 0), and neutral (= 0). In assessing model performance one can use macro- or micro-averaging across classes. Macroaveraging gives every class an equal say; microaveraging gives a larger say to larger classes.
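As a toy illustration of the difference, macro- and micro-averaged precision can be computed by hand from the sign classes. The sketch below uses base R only and is independent of sentimentr's internals; it reuses the small vectors from the Examples section:

```r
## Sketch (not part of sentimentr): macro- vs micro-averaged precision
actual    <- c(1, 1, 1, 1, -1, -1, -1, -1, -1, -1, -1, 1, -1)
predicted <- c(1, 0, 1, -1, 1, 0, -1, -1, -1, -1, 0, 1, -1)

classes <- c(-1, 0, 1)

## Per-class true positives and predicted-positive counts
tp <- sapply(classes, function(k) sum(predicted == k & actual == k))
pp <- sapply(classes, function(k) sum(predicted == k))

## Macroaverage: average the per-class precisions (equal say per class)
macro_precision <- mean(ifelse(pp > 0, tp / pp, 0))

## Microaverage: pool the counts first (larger classes get a larger say)
micro_precision <- sum(tp) / sum(pp)
```

These hand computations reproduce the macro and micro precision values shown in the Example output below (0.5277778 and 0.6153846).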

Usage

validate_sentiment(predicted, actual, ...)

Arguments

predicted

A numeric vector of predicted sentiment scores or a sentimentr object that returns sentiment scores.

actual

A numeric vector of known sentiment ratings.

...

ignored.

Value

Returns a data.frame with macroaveraged and microaveraged model validation scores. Additionally, the data.frame has the following attributes:

confusion_matrix

A confusion matrix of all classes

class_confusion_matrices

A list of class level (class vs. all) confusion matrices

macro_stats

A data.frame of the macroaveraged class level stats before averaging

mda

Mean Directional Accuracy

mare

Mean Absolute Rescaled Error

Note

Mean Absolute Rescaled Error (MARE) is defined as sum(|actual - predicted|)/(2n) and gives a sense of, on average, how far the rescaled predicted values (-1 to 1) are from the rescaled actual values (-1 to 1). A value of 0 means perfect accuracy. A value of 1 means perfectly wrong every time. A value of .5 represents the expected value for random guessing. This measure is related to Mean Absolute Error.
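Both statistics reported by the function can be reproduced by hand from their definitions. The base-R sketch below (again using the small vectors from the Examples section; the package itself may handle edge cases differently) computes MDA as the proportion of matching signs and MARE from the formula above:

```r
## Sketch (not part of sentimentr): Mean Directional Accuracy (MDA) and
## Mean Absolute Rescaled Error (MARE) for scores already in [-1, 1]
actual    <- c(1, 1, 1, 1, -1, -1, -1, -1, -1, -1, -1, 1, -1)
predicted <- c(1, 0, 1, -1, 1, 0, -1, -1, -1, -1, 0, 1, -1)

## MDA: proportion of cases where the predicted sign matches the actual sign
## (note sign(0) is 0, so neutral predictions only match neutral actuals)
mda <- mean(sign(predicted) == sign(actual))

## MARE: mean absolute error divided by 2, the half-width factor that maps
## the [-1, 1] range so 0 is perfect and 1 is maximally wrong
mare <- sum(abs(actual - predicted)) / (2 * length(actual))
```

These reproduce the first block of the Example output below: MDA 0.615 (8/13) and MARE 0.269 (7/26).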

References

https://www.youtube.com/watch?v=OwwdYHWRB5E&index=31&list=PL6397E4B26D00A269
https://en.wikipedia.org/wiki/Mean_Directional_Accuracy_(MDA)

Examples

actual <- c(1, 1, 1, 1, -1, -1, -1, -1, -1, -1, -1, 1,-1)
predicted <- c(1, 0, 1, -1, 1, 0, -1, -1, -1, -1, 0, 1,-1)
validate_sentiment(predicted, actual)

scores <- hu_liu_cannon_reviews$sentiment
mod <- sentiment_by(get_sentences(hu_liu_cannon_reviews$text))

validate_sentiment(mod$ave_sentiment, scores)
validate_sentiment(mod, scores)

x <- validate_sentiment(mod, scores)
attributes(x)$confusion_matrix
attributes(x)$class_confusion_matrices
attributes(x)$macro_stats

## Annie Swafford Example
swafford <- data.frame(
    text = c(
        "I haven't been sad in a long time.",
        "I am extremely happy today.",
        "It's a good day.",
        "But suddenly I'm only a little bit happy.",
        "Then I'm not happy at all.",
        "In fact, I am now the least happy person on the planet.",
        "There is no happiness left in me.",
        "Wait, it's returned!",
        "I don't feel so bad after all!"
    ), 
    actual = c(.8, 1, .8, -.1, -.5, -1, -1, .5, .6), 
    stringsAsFactors = FALSE
)

pred <- sentiment_by(swafford$text) 
validate_sentiment(
    pred,
    actual = swafford$actual
)

Example output

Mean Directional Accuracy:    0.615 
Mean Absolute Rescaled Error: 0.269 

  average precision    recall  accuracy         F
1   macro 0.5277778 0.7416667 0.7435897 0.4603175
2   micro 0.6153846 0.6153846 0.7435897 0.6153846
Mean Directional Accuracy:    0.444 
Mean Absolute Rescaled Error: 0.15 

  average precision    recall  accuracy         F
1   macro 0.4887364 0.5478330 0.6292574 0.4071587
2   micro 0.4438861 0.4438861 0.6292574 0.4438861
Mean Directional Accuracy:    0.444 
Mean Absolute Rescaled Error: 0.15 

  average precision    recall  accuracy         F
1   macro 0.4887364 0.5478330 0.6292574 0.4071587
2   micro 0.4438861 0.4438861 0.6292574 0.4438861
      predicted
actual  -1   0   1
    -1  30   8  15
    0   96  72 190
    1   14   9 163
$`-1`
      predicted
actual  no yes
   no  434 110
   yes  23  30

$`0`
      predicted
actual  no yes
   no  222  17
   yes 286  72

$`1`
      predicted
actual  no yes
   no  206 205
   yes  23 163

  class precision    recall  accuracy         F
1    -1 0.2142857 0.5660377 0.7772194 0.3108808
2     0 0.8089888 0.2011173 0.4924623 0.3221477
3     1 0.4429348 0.8763441 0.6180905 0.5884477
Mean Directional Accuracy:    0.667 
Mean Absolute Rescaled Error: 0.196 

  average precision    recall  accuracy         F
1   macro 0.7777778 0.7666667 0.7777778 0.7662338
2   micro 0.6666667 0.6666667 0.7777778 0.6666667

sentimentr documentation built on Oct. 12, 2021, 9:06 a.m.