README.md
In BrianWeinstein/googlenlp: An Interface to Google's Cloud Natural Language API

The googlenlp package provides an R interface to Google's Cloud Natural Language API.

"Google Cloud Natural Language API reveals the structure and meaning of text by offering powerful machine learning models in an easy to use REST API. You can use it to extract information about people, places, events and much more, mentioned in text documents, news articles or blog posts. You can use it to understand sentiment about your product on social media or parse intent from customer conversations happening in a call center or a messaging app." [source]

There are four main features of the API, all of which are available through this R package [source]:

Syntax Analysis: "Extract tokens and sentences, identify parts of speech (PoS) and create dependency parse trees for each sentence."
Entity Analysis: "Identify entities and label by types such as person, organization, location, events, products and media."
Sentiment Analysis: "Understand the overall sentiment expressed in a block of text."
Multi-Language: "Enables you to easily analyze text in multiple languages including English, Spanish and Japanese."

Installation

The current googlenlp release can be installed from CRAN:

install.packages("googlenlp")

The newest development release can be installed from GitHub:

# install.packages('devtools')
devtools::install_github("BrianWeinstein/googlenlp")

Authentication

To use the API, you'll first need to create a Google Cloud project, enable billing, and get an API key.

Configuration

Load the package and set your API key. There are two ways to do this.

Method A (preferred)

Method A adds your API key as a variable in your .Renviron file. With this method, you only need to complete the setup once.

library(googlenlp)

configure_googlenlp() # follow the instructions printed to the console

googlenlp setup instructions:
 1. Your ~/.Renviron file will now open in a new window/tab.
    *** If it doesn't open, run:  file.edit("~/.Renviron") ***
 2. To use the API, you'll first need to create a Google Cloud project and enable billing (https://cloud.google.com/natural-language/docs/getting-started).
 3. Next you'll need to get an API key (https://cloud.google.com/natural-language/docs/common/auth).
 4. In your  ~/.Renviron  file, replace the ENTER_YOUR_API_KEY_HERE with your Google Cloud API key.
 5. Save your ~/.Renviron file.
 6. *** Restart your R session for changes to take effect. ***
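
After restarting R, a quick check can confirm that the key is visible to the session. The sketch below uses only base R and assumes the variable is stored under the name GOOGLE_NLP_API_KEY; that name is an assumption not stated in the instructions above, so compare it with the line that configure_googlenlp() added to your ~/.Renviron. The helper name has_api_key is hypothetical and is not part of the package.

```r
# Hypothetical helper: TRUE if the named environment variable holds a
# non-empty value that is not the placeholder from the setup instructions.
# The default variable name is an assumption; adjust it to match ~/.Renviron.
has_api_key <- function(var = "GOOGLE_NLP_API_KEY") {
  key <- Sys.getenv(var, unset = "")
  nzchar(key) && key != "ENTER_YOUR_API_KEY_HERE"
}

has_api_key()
```

If this returns FALSE after a restart, the file was probably not saved or the placeholder was left in place.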

Method B

Method B defines your API key as a session-level variable. Under this method, you'll need to set your API key at the beginning of each R session.

library(googlenlp)

set_api_key("MY_API_KEY") # replace this with your API key

Getting started

Define the text you'd like to analyze.

text <- "Google, headquartered in Mountain View, unveiled the new Android phone at the Consumer Electronic Show.
         Sundar Pichai said in his keynote that users love their new Android phones."

The annotate_text function analyzes the text's syntax (sentences and tokens), entities, sentiment, and language, and returns the result as a five-element list.

analyzed <- annotate_text(text_body = text)
#> Warning: package 'bindrcpp' was built under R version 3.4.4

str(analyzed, max.level = 1)
#> List of 5
#>  $ sentences        :Classes 'tbl_df', 'tbl' and 'data.frame':   2 obs. of  4 variables:
#>  $ tokens           :Classes 'tbl_df', 'tbl' and 'data.frame':   32 obs. of  17 variables:
#>  $ entities         :Classes 'tbl_df', 'tbl' and 'data.frame':   10 obs. of  8 variables:
#>  $ documentSentiment:'data.frame':   1 obs. of  2 variables:
#>  $ language         : chr "en"

Sentences

"Sentence extraction breaks up the stream of text into a series of sentences." [API Documentation]

beginOffset indicates the (zero-based) character index of where the sentence begins (with UTF-8 encoding).
The magnitude and score fields quantify each sentence's sentiment; see the Document sentiment section for more details.

analyzed$sentences

| content | beginOffset | magnitude | score |
|:---|---:|---:|---:|
| Google, headquartered in Mountain View, unveiled the new Android phone at the Consumer Electronic Show. | 0 | 0.0 | 0.0 |
| Sundar Pichai said in his keynote that users love their new Android phones. | 113 | 0.6 | 0.6 |
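
Because beginOffset is zero-based and R strings are one-based, the offset has to be shifted by one before it can be used to slice the original text. The sketch below rebuilds the example text and the two columns it needs by hand, so it runs without calling the API. It assumes ASCII-only text, where UTF-8 offsets and character positions coincide; text with multi-byte characters may need byte-aware handling.

```r
# Rebuild the example text: the second line is indented by nine spaces.
s1 <- "Google, headquartered in Mountain View, unveiled the new Android phone at the Consumer Electronic Show."
s2 <- "Sundar Pichai said in his keynote that users love their new Android phones."
text <- paste0(s1, "\n", strrep(" ", 9), s2)

# Sentence table as shown above (content and beginOffset only).
sentences <- data.frame(
  content = c(s1, s2),
  beginOffset = c(0L, 113L),
  stringsAsFactors = FALSE
)

# Shift the zero-based offset by one to get R's one-based start position.
# substring() recycles the single text over both start/end pairs.
extracted <- substring(text,
                       sentences$beginOffset + 1,
                       sentences$beginOffset + nchar(sentences$content))

identical(extracted, sentences$content)
```

The offset 113 for the second sentence accounts for the 103 characters of the first sentence, the newline, and the nine spaces of indentation.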

Tokens

"Tokenization breaks the stream of text up into a series of tokens, with each token usually corresponding to a single word. The Natural Language API then processes the tokens and, using their locations within sentences, adds syntactic information to the tokens." [API Documentation]

lemma indicates the token's "root" word, and can be useful in standardizing the word within the text.
tag indicates the token's part of speech.
Additional column definitions are outlined here and here.

analyzed$tokens

| content | beginOffset | lemma | tag | aspect | case | form | gender | mood | number | person | proper | reciprocity | tense | voice | dependencyEdge_headTokenIndex | dependencyEdge_label |
|:---|---:|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|---:|:---|
| Google | 0 | Google | NOUN | ASPECT_UNKNOWN | CASE_UNKNOWN | FORM_UNKNOWN | GENDER_UNKNOWN | MOOD_UNKNOWN | SINGULAR | PERSON_UNKNOWN | PROPER | RECIPROCITY_UNKNOWN | TENSE_UNKNOWN | VOICE_UNKNOWN | 7 | NSUBJ |
| , | 6 | , | PUNCT | ASPECT_UNKNOWN | CASE_UNKNOWN | FORM_UNKNOWN | GENDER_UNKNOWN | MOOD_UNKNOWN | NUMBER_UNKNOWN | PERSON_UNKNOWN | PROPER_UNKNOWN | RECIPROCITY_UNKNOWN | TENSE_UNKNOWN | VOICE_UNKNOWN | 0 | P |
| headquartered | 8 | headquarter | VERB | ASPECT_UNKNOWN | CASE_UNKNOWN | FORM_UNKNOWN | GENDER_UNKNOWN | MOOD_UNKNOWN | NUMBER_UNKNOWN | PERSON_UNKNOWN | PROPER_UNKNOWN | RECIPROCITY_UNKNOWN | PAST | VOICE_UNKNOWN | 0 | VMOD |
| in | 22 | in | ADP | ASPECT_UNKNOWN | CASE_UNKNOWN | FORM_UNKNOWN | GENDER_UNKNOWN | MOOD_UNKNOWN | NUMBER_UNKNOWN | PERSON_UNKNOWN | PROPER_UNKNOWN | RECIPROCITY_UNKNOWN | TENSE_UNKNOWN | VOICE_UNKNOWN | 2 | PREP |
| Mountain | 25 | Mountain | NOUN | ASPECT_UNKNOWN | CASE_UNKNOWN | FORM_UNKNOWN | GENDER_UNKNOWN | MOOD_UNKNOWN | SINGULAR | PERSON_UNKNOWN | PROPER | RECIPROCITY_UNKNOWN | TENSE_UNKNOWN | VOICE_UNKNOWN | 5 | NN |
| View | 34 | View | NOUN | ASPECT_UNKNOWN | CASE_UNKNOWN | FORM_UNKNOWN | GENDER_UNKNOWN | MOOD_UNKNOWN | SINGULAR | PERSON_UNKNOWN | PROPER | RECIPROCITY_UNKNOWN | TENSE_UNKNOWN | VOICE_UNKNOWN | 3 | POBJ |
| , | 38 | , | PUNCT | ASPECT_UNKNOWN | CASE_UNKNOWN | FORM_UNKNOWN | GENDER_UNKNOWN | MOOD_UNKNOWN | NUMBER_UNKNOWN | PERSON_UNKNOWN | PROPER_UNKNOWN | RECIPROCITY_UNKNOWN | TENSE_UNKNOWN | VOICE_UNKNOWN | 0 | P |
| unveiled | 40 | unveil | VERB | ASPECT_UNKNOWN | CASE_UNKNOWN | FORM_UNKNOWN | GENDER_UNKNOWN | INDICATIVE | NUMBER_UNKNOWN | PERSON_UNKNOWN | PROPER_UNKNOWN | RECIPROCITY_UNKNOWN | PAST | VOICE_UNKNOWN | 7 | ROOT |
| the | 49 | the | DET | ASPECT_UNKNOWN | CASE_UNKNOWN | FORM_UNKNOWN | GENDER_UNKNOWN | MOOD_UNKNOWN | NUMBER_UNKNOWN | PERSON_UNKNOWN | PROPER_UNKNOWN | RECIPROCITY_UNKNOWN | TENSE_UNKNOWN | VOICE_UNKNOWN | 11 | DET |
| new | 53 | new | ADJ | ASPECT_UNKNOWN | CASE_UNKNOWN | FORM_UNKNOWN | GENDER_UNKNOWN | MOOD_UNKNOWN | NUMBER_UNKNOWN | PERSON_UNKNOWN | PROPER_UNKNOWN | RECIPROCITY_UNKNOWN | TENSE_UNKNOWN | VOICE_UNKNOWN | 11 | AMOD |
| Android | 57 | Android | NOUN | ASPECT_UNKNOWN | CASE_UNKNOWN | FORM_UNKNOWN | GENDER_UNKNOWN | MOOD_UNKNOWN | SINGULAR | PERSON_UNKNOWN | PROPER | RECIPROCITY_UNKNOWN | TENSE_UNKNOWN | VOICE_UNKNOWN | 11 | NN |
| phone | 65 | phone | NOUN | ASPECT_UNKNOWN | CASE_UNKNOWN | FORM_UNKNOWN | GENDER_UNKNOWN | MOOD_UNKNOWN | SINGULAR | PERSON_UNKNOWN | PROPER_UNKNOWN | RECIPROCITY_UNKNOWN | TENSE_UNKNOWN | VOICE_UNKNOWN | 7 | DOBJ |
| at | 71 | at | ADP | ASPECT_UNKNOWN | CASE_UNKNOWN | FORM_UNKNOWN | GENDER_UNKNOWN | MOOD_UNKNOWN | NUMBER_UNKNOWN | PERSON_UNKNOWN | PROPER_UNKNOWN | RECIPROCITY_UNKNOWN | TENSE_UNKNOWN | VOICE_UNKNOWN | 7 | PREP |
| the | 74 | the | DET | ASPECT_UNKNOWN | CASE_UNKNOWN | FORM_UNKNOWN | GENDER_UNKNOWN | MOOD_UNKNOWN | NUMBER_UNKNOWN | PERSON_UNKNOWN | PROPER_UNKNOWN | RECIPROCITY_UNKNOWN | TENSE_UNKNOWN | VOICE_UNKNOWN | 16 | DET |
| Consumer | 78 | Consumer | NOUN | ASPECT_UNKNOWN | CASE_UNKNOWN | FORM_UNKNOWN | GENDER_UNKNOWN | MOOD_UNKNOWN | SINGULAR | PERSON_UNKNOWN | PROPER | RECIPROCITY_UNKNOWN | TENSE_UNKNOWN | VOICE_UNKNOWN | 16 | NN |
| Electronic | 87 | Electronic | NOUN | ASPECT_UNKNOWN | CASE_UNKNOWN | FORM_UNKNOWN | GENDER_UNKNOWN | MOOD_UNKNOWN | SINGULAR | PERSON_UNKNOWN | PROPER | RECIPROCITY_UNKNOWN | TENSE_UNKNOWN | VOICE_UNKNOWN | 16 | NN |
| Show | 98 | Show | NOUN | ASPECT_UNKNOWN | CASE_UNKNOWN | FORM_UNKNOWN | GENDER_UNKNOWN | MOOD_UNKNOWN | SINGULAR | PERSON_UNKNOWN | PROPER | RECIPROCITY_UNKNOWN | TENSE_UNKNOWN | VOICE_UNKNOWN | 12 | POBJ |
| . | 102 | . | PUNCT | ASPECT_UNKNOWN | CASE_UNKNOWN | FORM_UNKNOWN | GENDER_UNKNOWN | MOOD_UNKNOWN | NUMBER_UNKNOWN | PERSON_UNKNOWN | PROPER_UNKNOWN | RECIPROCITY_UNKNOWN | TENSE_UNKNOWN | VOICE_UNKNOWN | 7 | P |
| Sundar | 113 | Sundar | NOUN | ASPECT_UNKNOWN | CASE_UNKNOWN | FORM_UNKNOWN | GENDER_UNKNOWN | MOOD_UNKNOWN | SINGULAR | PERSON_UNKNOWN | PROPER | RECIPROCITY_UNKNOWN | TENSE_UNKNOWN | VOICE_UNKNOWN | 19 | NN |
| Pichai | 120 | Pichai | NOUN | ASPECT_UNKNOWN | CASE_UNKNOWN | FORM_UNKNOWN | GENDER_UNKNOWN | MOOD_UNKNOWN | SINGULAR | PERSON_UNKNOWN | PROPER | RECIPROCITY_UNKNOWN | TENSE_UNKNOWN | VOICE_UNKNOWN | 20 | NSUBJ |
| said | 127 | say | VERB | ASPECT_UNKNOWN | CASE_UNKNOWN | FORM_UNKNOWN | GENDER_UNKNOWN | INDICATIVE | NUMBER_UNKNOWN | PERSON_UNKNOWN | PROPER_UNKNOWN | RECIPROCITY_UNKNOWN | PAST | VOICE_UNKNOWN | 20 | ROOT |
| in | 132 | in | ADP | ASPECT_UNKNOWN | CASE_UNKNOWN | FORM_UNKNOWN | GENDER_UNKNOWN | MOOD_UNKNOWN | NUMBER_UNKNOWN | PERSON_UNKNOWN | PROPER_UNKNOWN | RECIPROCITY_UNKNOWN | TENSE_UNKNOWN | VOICE_UNKNOWN | 20 | PREP |
| his | 135 | his | PRON | ASPECT_UNKNOWN | GENITIVE | FORM_UNKNOWN | MASCULINE | MOOD_UNKNOWN | SINGULAR | THIRD | PROPER_UNKNOWN | RECIPROCITY_UNKNOWN | TENSE_UNKNOWN | VOICE_UNKNOWN | 23 | POSS |
| keynote | 139 | keynote | NOUN | ASPECT_UNKNOWN | CASE_UNKNOWN | FORM_UNKNOWN | GENDER_UNKNOWN | MOOD_UNKNOWN | SINGULAR | PERSON_UNKNOWN | PROPER_UNKNOWN | RECIPROCITY_UNKNOWN | TENSE_UNKNOWN | VOICE_UNKNOWN | 21 | POBJ |
| that | 147 | that | ADP | ASPECT_UNKNOWN | CASE_UNKNOWN | FORM_UNKNOWN | GENDER_UNKNOWN | MOOD_UNKNOWN | NUMBER_UNKNOWN | PERSON_UNKNOWN | PROPER_UNKNOWN | RECIPROCITY_UNKNOWN | TENSE_UNKNOWN | VOICE_UNKNOWN | 26 | MARK |
| users | 152 | user | NOUN | ASPECT_UNKNOWN | CASE_UNKNOWN | FORM_UNKNOWN | GENDER_UNKNOWN | MOOD_UNKNOWN | PLURAL | PERSON_UNKNOWN | PROPER_UNKNOWN | RECIPROCITY_UNKNOWN | TENSE_UNKNOWN | VOICE_UNKNOWN | 26 | NSUBJ |
| love | 158 | love | VERB | ASPECT_UNKNOWN | CASE_UNKNOWN | FORM_UNKNOWN | GENDER_UNKNOWN | INDICATIVE | NUMBER_UNKNOWN | PERSON_UNKNOWN | PROPER_UNKNOWN | RECIPROCITY_UNKNOWN | PRESENT | VOICE_UNKNOWN | 20 | CCOMP |
| their | 163 | their | PRON | ASPECT_UNKNOWN | GENITIVE | FORM_UNKNOWN | GENDER_UNKNOWN | MOOD_UNKNOWN | PLURAL | THIRD | PROPER_UNKNOWN | RECIPROCITY_UNKNOWN | TENSE_UNKNOWN | VOICE_UNKNOWN | 30 | POSS |
| new | 169 | new | ADJ | ASPECT_UNKNOWN | CASE_UNKNOWN | FORM_UNKNOWN | GENDER_UNKNOWN | MOOD_UNKNOWN | NUMBER_UNKNOWN | PERSON_UNKNOWN | PROPER_UNKNOWN | RECIPROCITY_UNKNOWN | TENSE_UNKNOWN | VOICE_UNKNOWN | 30 | AMOD |
| Android | 173 | Android | NOUN | ASPECT_UNKNOWN | CASE_UNKNOWN | FORM_UNKNOWN | GENDER_UNKNOWN | MOOD_UNKNOWN | SINGULAR | PERSON_UNKNOWN | PROPER | RECIPROCITY_UNKNOWN | TENSE_UNKNOWN | VOICE_UNKNOWN | 30 | NN |
| phones | 181 | phone | NOUN | ASPECT_UNKNOWN | CASE_UNKNOWN | FORM_UNKNOWN | GENDER_UNKNOWN | MOOD_UNKNOWN | PLURAL | PERSON_UNKNOWN | PROPER_UNKNOWN | RECIPROCITY_UNKNOWN | TENSE_UNKNOWN | VOICE_UNKNOWN | 26 | DOBJ |
| . | 187 | . | PUNCT | ASPECT_UNKNOWN | CASE_UNKNOWN | FORM_UNKNOWN | GENDER_UNKNOWN | MOOD_UNKNOWN | NUMBER_UNKNOWN | PERSON_UNKNOWN | PROPER_UNKNOWN | RECIPROCITY_UNKNOWN | TENSE_UNKNOWN | VOICE_UNKNOWN | 20 | P |
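
To illustrate how lemma and tag can be combined, the sketch below counts nouns by their lemma, so that "phone" and "phones" are tallied as the same word. It uses a small hand-copied selection of rows from the tokens output above rather than a live API call, and the variable names are illustrative.

```r
# A few rows copied from the tokens output above (selected columns only).
tokens <- data.frame(
  content = c("Android", "phone", "said", "users", "love", "new", "Android", "phones"),
  lemma   = c("Android", "phone", "say",  "user",  "love", "new", "Android", "phone"),
  tag     = c("NOUN",    "NOUN",  "VERB", "NOUN",  "VERB", "ADJ", "NOUN",    "NOUN"),
  stringsAsFactors = FALSE
)

# Keep only the nouns, using the part-of-speech tag.
nouns <- tokens[tokens$tag == "NOUN", ]

# Count by lemma rather than by surface form, so inflected forms are merged.
lemma_counts <- sort(table(nouns$lemma), decreasing = TRUE)
lemma_counts
```

Counting on content instead of lemma would treat "phone" and "phones" as two separate words.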

Entities

"Entity Analysis provides information about entities in the text, which generally refer to named 'things' such as famous individuals, landmarks, common objects, etc... A good general practice to follow is that if something is a noun, it qualifies as an 'entity.'" [API Documentation]

entity_type indicates the type of entity (i.e., it classifies the entity as a person, location, consumer good, etc.).
mid provides a "machine-generated identifier" corresponding to the entity's Google Knowledge Graph entry.
wikipedia_url provides the entity's Wikipedia URL.
salience indicates the entity's importance to the entire text. Scores range from 0.0 (less important) to 1.0 (highly important).
Additional column definitions are outlined here.

analyzed$entities

| name | entity_type | mid | wikipedia_url | salience | content | beginOffset | mentions_type |
|:---|:---|:---|:---|---:|:---|---:|:---|
| Google | ORGANIZATION | /m/045c7b | https://en.wikipedia.org/wiki/Google | 0.2557206 | Google | 0 | PROPER |
| users | PERSON | NA | NA | 0.1527633 | users | 152 | COMMON |
| phone | CONSUMER_GOOD | NA | NA | 0.1311989 | phone | 65 | COMMON |
| Android | CONSUMER_GOOD | /m/02wxtgw | https://en.wikipedia.org/wiki/Android_(operating_system) | 0.1224526 | Android | 57 | PROPER |
| Android | CONSUMER_GOOD | /m/02wxtgw | https://en.wikipedia.org/wiki/Android_(operating_system) | 0.1224526 | Android | 173 | PROPER |
| Sundar Pichai | PERSON | /m/09gds74 | https://en.wikipedia.org/wiki/Sundar_Pichai | 0.1141411 | Sundar Pichai | 113 | PROPER |
| Mountain View | LOCATION | /m/0r6c4 | https://en.wikipedia.org/wiki/Mountain_View,_California | 0.1019596 | Mountain View | 25 | PROPER |
| Consumer Electronic Show | EVENT | /m/01p15w | https://en.wikipedia.org/wiki/Consumer_Electronics_Show | 0.0703438 | Consumer Electronic Show | 78 | PROPER |
| phones | CONSUMER_GOOD | NA | NA | 0.0338317 | phones | 181 | COMMON |
| keynote | OTHER | NA | NA | 0.0175884 | keynote | 139 | COMMON |
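
The entities output has one row per mention, so an entity that appears twice (Android here) is listed twice with the same salience. One way to get a ranked list of distinct named entities is sketched below with base R. The data frame is copied by hand from the output above, with only the columns the example needs.

```r
# Entities copied from the output above (selected columns only).
entities <- data.frame(
  name = c("Google", "users", "phone", "Android", "Android", "Sundar Pichai",
           "Mountain View", "Consumer Electronic Show", "phones", "keynote"),
  entity_type = c("ORGANIZATION", "PERSON", "CONSUMER_GOOD", "CONSUMER_GOOD",
                  "CONSUMER_GOOD", "PERSON", "LOCATION", "EVENT",
                  "CONSUMER_GOOD", "OTHER"),
  salience = c(0.2557206, 0.1527633, 0.1311989, 0.1224526, 0.1224526,
               0.1141411, 0.1019596, 0.0703438, 0.0338317, 0.0175884),
  mentions_type = c("PROPER", "COMMON", "COMMON", "PROPER", "PROPER",
                    "PROPER", "PROPER", "PROPER", "COMMON", "COMMON"),
  stringsAsFactors = FALSE
)

# Keep proper-noun mentions and collapse repeated mentions of one entity.
proper <- entities[entities$mentions_type == "PROPER", ]
proper <- proper[!duplicated(proper$name), ]

# Rank the remaining entities from most to least salient.
proper <- proper[order(proper$salience, decreasing = TRUE), ]
proper[, c("name", "entity_type", "salience")]
```

Filtering on mentions_type is a design choice for this example; keeping COMMON mentions would also surface generic nouns such as "users" and "keynote".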

Document sentiment

"Sentiment analysis attempts to determine the overall attitude (positive or negative) expressed within the text. Sentiment is represented by numerical score and magnitude values." [API Documentation]

score ranges from -1.0 (negative) to 1.0 (positive), and indicates the "overall emotional leaning of the text".
magnitude "indicates the overall strength of emotion (both positive and negative) within the given text, between 0.0 and +inf. Unlike score, magnitude is not normalized; each expression of emotion within the text (both positive and negative) contributes to the text's magnitude (so longer text blocks may have greater magnitudes)."

A note on how to interpret these sentiment values is posted here.

analyzed$documentSentiment

| magnitude| score|
|---------:|-----:|
|       0.6|   0.3|
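
Since score and magnitude are meant to be read together, a small helper can turn the pair into a coarse label. The sketch below is only an illustration: the function name label_sentiment and both cutoff values are assumptions chosen for the example, not thresholds defined by the API, and they would need tuning for real data.

```r
# Hypothetical helper: label a (score, magnitude) pair.
# Both cutoffs are illustrative assumptions, not values defined by the API.
label_sentiment <- function(score, magnitude,
                            score_cutoff = 0.25, magnitude_cutoff = 1.0) {
  if (score >= score_cutoff) {
    "positive"
  } else if (score <= -score_cutoff) {
    "negative"
  } else if (magnitude >= magnitude_cutoff) {
    "mixed"    # near-zero score, but a lot of emotional content overall
  } else {
    "neutral"  # near-zero score and little emotional content
  }
}

# Values from the document sentiment output above.
label_sentiment(score = 0.3, magnitude = 0.6)
```

The "mixed" branch reflects the point that positive and negative expressions both add to magnitude while pulling score in opposite directions.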

Language

language indicates the detected language of the document. Only English ("en"), Spanish ("es") and Japanese ("ja") are currently supported by the API.

analyzed$language
#> [1] "en"

BrianWeinstein/googlenlp documentation built on May 6, 2019, 8:47 a.m.
