knitr::opts_chunk$set(collapse = T) library(TextSentiment) library(ggplot2) library(httr) library(jsonlite)
This document introduces you to TextSentiment’s set of tools, and shows you how to apply them to data frames to generate sentiment using Microsoft Cognitive Services API.
Before using this package, make sure that you have a valid Microsoft account have generated a valid Text Analytics API key for any one of the regions. Save your API key and region information as it will be required for extracting sentiment score. This package creates a wrapper around the RESTful API using POST method to get information.
For more information about the Microsoft API, click here
We will be using a dummy dataset to generate sentiment analysis. Make sure that the dataset has 'text' and 'language' column and in the same format as requested by the API. The dummy dataset has 268 rows and 4 columns.
dataset <- read.csv('dataset.csv') dim(dataset) # text examples in dataset head(dataset)
get_batch_sentiment():Batch sentiment request function lets you to send request to API for sentiment score in batch of 100 documents/sentences and is suggested to use this function if number of data rows are more than 100. The sample dataset has 268 rows and batch request method should be used in this case.
data_with_sentiment <- get_batch_sentiment(dataset, "cc2fb843f77948d592054109844a9152", "westcentralus") head(data_with_sentiment)
Note: An additional 'Id' column will also be created for the dataset passed to the function
sentiment_dist_plot():The function produces a density or boxplot for the sentiment score generated using the Microsoft sentiment analysis API. Function divides data into negative, neutral and positive sentiment based on user input values and produces distribution plot for each of the sentiment class.
After getting results, we can analyze the results in density plot or boxplot graph type. The funtion allows to classify the sentiment score in 3 categories - Negative, Neutral and Positive. The 'negative_cutoff' (default value = 0.35) and 'positive_start' (default value = 0.65) parameters govern the sentiment score range for all the 3 classes.
# 'density' parameter in graph type for density distribution plot sentiment_dist_plot(data_with_sentiment, negative_cutoff = 0.35, positive_start = 0.65, graph_alpha = 0.5, graph_type = 'density') # 'boxplot' parameter in graph type for generating boxplot for each category sentiment_dist_plot(data_with_sentiment, negative_cutoff = 0.20, positive_start = 0.70, graph_alpha = 0.75, graph_type = 'boxplot')
The above distribution plots further help in analyzing the score within each class to better understand sentiment in each class.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.