README.md

googlenlp


The googlenlp package provides an R interface to Google's Cloud Natural Language API.

"Google Cloud Natural Language API reveals the structure and meaning of text by offering powerful machine learning models in an easy to use REST API. You can use it to extract information about people, places, events and much more, mentioned in text documents, news articles or blog posts. You can use it to understand sentiment about your product on social media or parse intent from customer conversations happening in a call center or a messaging app." [source]

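There are four main features of the API, all of which are available through this R package [source]: syntax analysis (sentences and tokens), entity analysis, sentiment analysis, and language detection.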

Resources

Installation

You can install the development version from GitHub:

devtools::install_github("BrianWeinstein/googlenlp")

Authentication

To use the API, you'll first need to create a Google Cloud project, enable billing, and get an API key.

Configuration

Load the package and set your API key. There are two ways to do this.

Method A (preferred)

This method adds your API key as a variable to your .Renviron file. You only need to complete this setup once.

library(googlenlp)

configure_googlenlp() # follow the instructions printed to the console
googlenlp setup instructions:
 1. Your ~/.Renviron file will now open in a new window/tab.
    *** If it doesn't open, run:  file.edit("~/.Renviron") ***
 2. To use the API, you'll first need to create a Google Cloud project and enable billing (https://cloud.google.com/natural-language/docs/getting-started).
 3. Next you'll need to get an API key (https://cloud.google.com/natural-language/docs/common/auth).
 4. In your  ~/.Renviron  file, replace the ENTER_YOUR_API_KEY_HERE with your Google Cloud API key.
 5. Save your ~/.Renviron file.
 6. *** Restart your R session for changes to take effect. ***
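After restarting R, it can be useful to confirm that the key is actually visible to the session. The following is a minimal sketch, not part of googlenlp: `api_key_is_set` is a hypothetical helper, and the variable name `GOOGLE_NLP_API_KEY` is an assumption, so check your ~/.Renviron file for the name that was actually written there.

```r
# Hypothetical helper (not part of googlenlp): returns TRUE when the named
# environment variable holds something other than the placeholder text.
api_key_is_set <- function(var_name, placeholder = "ENTER_YOUR_API_KEY_HERE") {
  value <- Sys.getenv(var_name, unset = "")
  nzchar(value) && !identical(value, placeholder)
}

# Assumed variable name; replace it with the one found in your ~/.Renviron.
api_key_is_set("GOOGLE_NLP_API_KEY")
```

If this returns FALSE, the placeholder was probably not replaced or the R session was not restarted.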

Method B

This method defines your API key as a session-level variable. You'll need to set your API key at the beginning of each R session.

library(googlenlp)

set_api_key("MY_API_KEY") # replace this with your API key

Getting started

Define the text you'd like to analyze.

text <- "Google, headquartered in Mountain View, unveiled the new Android phone at the Consumer Electronic Show.
         Sundar Pichai said in his keynote that users love their new Android phones."

The annotate_text function analyzes the text's syntax (sentences and tokens), entities, sentiment, and language, and returns the result as a five-element list.

analyzed <- annotate_text(text_body = text)
#> Warning: package 'bindrcpp' was built under R version 3.4.4

str(analyzed, max.level = 1)
#> List of 5
#>  $ sentences        :Classes 'tbl_df', 'tbl' and 'data.frame':   2 obs. of  4 variables:
#>  $ tokens           :Classes 'tbl_df', 'tbl' and 'data.frame':   32 obs. of  17 variables:
#>  $ entities         :Classes 'tbl_df', 'tbl' and 'data.frame':   10 obs. of  8 variables:
#>  $ documentSentiment:'data.frame':   1 obs. of  2 variables:
#>  $ language         : chr "en"

Sentences

"Sentence extraction breaks up the stream of text into a series of sentences." [API Documentation]

analyzed$sentences
|content | beginOffset| magnitude| score|
|:-------|-----------:|---------:|-----:|
|Google, headquartered in Mountain View, unveiled the new Android phone at the Consumer Electronic Show. | 0| 0.0| 0.0|
|Sundar Pichai said in his keynote that users love their new Android phones. | 113| 0.6| 0.6|
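Because the sentences element is an ordinary data frame, it can be filtered like any other. As a rough sketch, the block below rebuilds the two rows from the output above by hand (so it runs without an API call) and keeps only the sentences with a positive sentiment score. The name `positive` is chosen here for illustration.

```r
# Rebuild the sentences table from the output above (values copied by hand).
sentences <- data.frame(
  content = c(
    "Google, headquartered in Mountain View, unveiled the new Android phone at the Consumer Electronic Show.",
    "Sundar Pichai said in his keynote that users love their new Android phones."
  ),
  beginOffset = c(0, 113),
  magnitude = c(0.0, 0.6),
  score = c(0.0, 0.6),
  stringsAsFactors = FALSE
)

# Keep only the sentences whose sentiment score is above zero.
positive <- sentences[sentences$score > 0, ]
positive$content
```

With real results, the same subsetting should work on analyzed$sentences directly.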

Tokens

"Tokenization breaks the stream of text up into a series of tokens, with each token usually corresponding to a single word. The Natural Language API then processes the tokens and, using their locations within sentences, adds syntactic information to the tokens." [API Documentation]

analyzed$tokens
|content | beginOffset|lemma |tag |aspect |case |form |gender |mood |number |person |proper |reciprocity |tense |voice | dependencyEdge_headTokenIndex|dependencyEdge_label |
|:-------|-----------:|:-----|:---|:------|:----|:----|:------|:----|:------|:------|:------|:-----------|:-----|:-----|-----------------------------:|:--------------------|
|Google | 0|Google |NOUN |ASPECT_UNKNOWN |CASE_UNKNOWN |FORM_UNKNOWN |GENDER_UNKNOWN |MOOD_UNKNOWN |SINGULAR |PERSON_UNKNOWN |PROPER |RECIPROCITY_UNKNOWN |TENSE_UNKNOWN |VOICE_UNKNOWN | 7|NSUBJ |
|, | 6|, |PUNCT |ASPECT_UNKNOWN |CASE_UNKNOWN |FORM_UNKNOWN |GENDER_UNKNOWN |MOOD_UNKNOWN |NUMBER_UNKNOWN |PERSON_UNKNOWN |PROPER_UNKNOWN |RECIPROCITY_UNKNOWN |TENSE_UNKNOWN |VOICE_UNKNOWN | 0|P |
|headquartered | 8|headquarter |VERB |ASPECT_UNKNOWN |CASE_UNKNOWN |FORM_UNKNOWN |GENDER_UNKNOWN |MOOD_UNKNOWN |NUMBER_UNKNOWN |PERSON_UNKNOWN |PROPER_UNKNOWN |RECIPROCITY_UNKNOWN |PAST |VOICE_UNKNOWN | 0|VMOD |
|in | 22|in |ADP |ASPECT_UNKNOWN |CASE_UNKNOWN |FORM_UNKNOWN |GENDER_UNKNOWN |MOOD_UNKNOWN |NUMBER_UNKNOWN |PERSON_UNKNOWN |PROPER_UNKNOWN |RECIPROCITY_UNKNOWN |TENSE_UNKNOWN |VOICE_UNKNOWN | 2|PREP |
|Mountain | 25|Mountain |NOUN |ASPECT_UNKNOWN |CASE_UNKNOWN |FORM_UNKNOWN |GENDER_UNKNOWN |MOOD_UNKNOWN |SINGULAR |PERSON_UNKNOWN |PROPER |RECIPROCITY_UNKNOWN |TENSE_UNKNOWN |VOICE_UNKNOWN | 5|NN |
|View | 34|View |NOUN |ASPECT_UNKNOWN |CASE_UNKNOWN |FORM_UNKNOWN |GENDER_UNKNOWN |MOOD_UNKNOWN |SINGULAR |PERSON_UNKNOWN |PROPER |RECIPROCITY_UNKNOWN |TENSE_UNKNOWN |VOICE_UNKNOWN | 3|POBJ |
|, | 38|, |PUNCT |ASPECT_UNKNOWN |CASE_UNKNOWN |FORM_UNKNOWN |GENDER_UNKNOWN |MOOD_UNKNOWN |NUMBER_UNKNOWN |PERSON_UNKNOWN |PROPER_UNKNOWN |RECIPROCITY_UNKNOWN |TENSE_UNKNOWN |VOICE_UNKNOWN | 0|P |
|unveiled | 40|unveil |VERB |ASPECT_UNKNOWN |CASE_UNKNOWN |FORM_UNKNOWN |GENDER_UNKNOWN |INDICATIVE |NUMBER_UNKNOWN |PERSON_UNKNOWN |PROPER_UNKNOWN |RECIPROCITY_UNKNOWN |PAST |VOICE_UNKNOWN | 7|ROOT |
|the | 49|the |DET |ASPECT_UNKNOWN |CASE_UNKNOWN |FORM_UNKNOWN |GENDER_UNKNOWN |MOOD_UNKNOWN |NUMBER_UNKNOWN |PERSON_UNKNOWN |PROPER_UNKNOWN |RECIPROCITY_UNKNOWN |TENSE_UNKNOWN |VOICE_UNKNOWN | 11|DET |
|new | 53|new |ADJ |ASPECT_UNKNOWN |CASE_UNKNOWN |FORM_UNKNOWN |GENDER_UNKNOWN |MOOD_UNKNOWN |NUMBER_UNKNOWN |PERSON_UNKNOWN |PROPER_UNKNOWN |RECIPROCITY_UNKNOWN |TENSE_UNKNOWN |VOICE_UNKNOWN | 11|AMOD |
|Android | 57|Android |NOUN |ASPECT_UNKNOWN |CASE_UNKNOWN |FORM_UNKNOWN |GENDER_UNKNOWN |MOOD_UNKNOWN |SINGULAR |PERSON_UNKNOWN |PROPER |RECIPROCITY_UNKNOWN |TENSE_UNKNOWN |VOICE_UNKNOWN | 11|NN |
|phone | 65|phone |NOUN |ASPECT_UNKNOWN |CASE_UNKNOWN |FORM_UNKNOWN |GENDER_UNKNOWN |MOOD_UNKNOWN |SINGULAR |PERSON_UNKNOWN |PROPER_UNKNOWN |RECIPROCITY_UNKNOWN |TENSE_UNKNOWN |VOICE_UNKNOWN | 7|DOBJ |
|at | 71|at |ADP |ASPECT_UNKNOWN |CASE_UNKNOWN |FORM_UNKNOWN |GENDER_UNKNOWN |MOOD_UNKNOWN |NUMBER_UNKNOWN |PERSON_UNKNOWN |PROPER_UNKNOWN |RECIPROCITY_UNKNOWN |TENSE_UNKNOWN |VOICE_UNKNOWN | 7|PREP |
|the | 74|the |DET |ASPECT_UNKNOWN |CASE_UNKNOWN |FORM_UNKNOWN |GENDER_UNKNOWN |MOOD_UNKNOWN |NUMBER_UNKNOWN |PERSON_UNKNOWN |PROPER_UNKNOWN |RECIPROCITY_UNKNOWN |TENSE_UNKNOWN |VOICE_UNKNOWN | 16|DET |
|Consumer | 78|Consumer |NOUN |ASPECT_UNKNOWN |CASE_UNKNOWN |FORM_UNKNOWN |GENDER_UNKNOWN |MOOD_UNKNOWN |SINGULAR |PERSON_UNKNOWN |PROPER |RECIPROCITY_UNKNOWN |TENSE_UNKNOWN |VOICE_UNKNOWN | 16|NN |
|Electronic | 87|Electronic |NOUN |ASPECT_UNKNOWN |CASE_UNKNOWN |FORM_UNKNOWN |GENDER_UNKNOWN |MOOD_UNKNOWN |SINGULAR |PERSON_UNKNOWN |PROPER |RECIPROCITY_UNKNOWN |TENSE_UNKNOWN |VOICE_UNKNOWN | 16|NN |
|Show | 98|Show |NOUN |ASPECT_UNKNOWN |CASE_UNKNOWN |FORM_UNKNOWN |GENDER_UNKNOWN |MOOD_UNKNOWN |SINGULAR |PERSON_UNKNOWN |PROPER |RECIPROCITY_UNKNOWN |TENSE_UNKNOWN |VOICE_UNKNOWN | 12|POBJ |
|. | 102|. |PUNCT |ASPECT_UNKNOWN |CASE_UNKNOWN |FORM_UNKNOWN |GENDER_UNKNOWN |MOOD_UNKNOWN |NUMBER_UNKNOWN |PERSON_UNKNOWN |PROPER_UNKNOWN |RECIPROCITY_UNKNOWN |TENSE_UNKNOWN |VOICE_UNKNOWN | 7|P |
|Sundar | 113|Sundar |NOUN |ASPECT_UNKNOWN |CASE_UNKNOWN |FORM_UNKNOWN |GENDER_UNKNOWN |MOOD_UNKNOWN |SINGULAR |PERSON_UNKNOWN |PROPER |RECIPROCITY_UNKNOWN |TENSE_UNKNOWN |VOICE_UNKNOWN | 19|NN |
|Pichai | 120|Pichai |NOUN |ASPECT_UNKNOWN |CASE_UNKNOWN |FORM_UNKNOWN |GENDER_UNKNOWN |MOOD_UNKNOWN |SINGULAR |PERSON_UNKNOWN |PROPER |RECIPROCITY_UNKNOWN |TENSE_UNKNOWN |VOICE_UNKNOWN | 20|NSUBJ |
|said | 127|say |VERB |ASPECT_UNKNOWN |CASE_UNKNOWN |FORM_UNKNOWN |GENDER_UNKNOWN |INDICATIVE |NUMBER_UNKNOWN |PERSON_UNKNOWN |PROPER_UNKNOWN |RECIPROCITY_UNKNOWN |PAST |VOICE_UNKNOWN | 20|ROOT |
|in | 132|in |ADP |ASPECT_UNKNOWN |CASE_UNKNOWN |FORM_UNKNOWN |GENDER_UNKNOWN |MOOD_UNKNOWN |NUMBER_UNKNOWN |PERSON_UNKNOWN |PROPER_UNKNOWN |RECIPROCITY_UNKNOWN |TENSE_UNKNOWN |VOICE_UNKNOWN | 20|PREP |
|his | 135|his |PRON |ASPECT_UNKNOWN |GENITIVE |FORM_UNKNOWN |MASCULINE |MOOD_UNKNOWN |SINGULAR |THIRD |PROPER_UNKNOWN |RECIPROCITY_UNKNOWN |TENSE_UNKNOWN |VOICE_UNKNOWN | 23|POSS |
|keynote | 139|keynote |NOUN |ASPECT_UNKNOWN |CASE_UNKNOWN |FORM_UNKNOWN |GENDER_UNKNOWN |MOOD_UNKNOWN |SINGULAR |PERSON_UNKNOWN |PROPER_UNKNOWN |RECIPROCITY_UNKNOWN |TENSE_UNKNOWN |VOICE_UNKNOWN | 21|POBJ |
|that | 147|that |ADP |ASPECT_UNKNOWN |CASE_UNKNOWN |FORM_UNKNOWN |GENDER_UNKNOWN |MOOD_UNKNOWN |NUMBER_UNKNOWN |PERSON_UNKNOWN |PROPER_UNKNOWN |RECIPROCITY_UNKNOWN |TENSE_UNKNOWN |VOICE_UNKNOWN | 26|MARK |
|users | 152|user |NOUN |ASPECT_UNKNOWN |CASE_UNKNOWN |FORM_UNKNOWN |GENDER_UNKNOWN |MOOD_UNKNOWN |PLURAL |PERSON_UNKNOWN |PROPER_UNKNOWN |RECIPROCITY_UNKNOWN |TENSE_UNKNOWN |VOICE_UNKNOWN | 26|NSUBJ |
|love | 158|love |VERB |ASPECT_UNKNOWN |CASE_UNKNOWN |FORM_UNKNOWN |GENDER_UNKNOWN |INDICATIVE |NUMBER_UNKNOWN |PERSON_UNKNOWN |PROPER_UNKNOWN |RECIPROCITY_UNKNOWN |PRESENT |VOICE_UNKNOWN | 20|CCOMP |
|their | 163|their |PRON |ASPECT_UNKNOWN |GENITIVE |FORM_UNKNOWN |GENDER_UNKNOWN |MOOD_UNKNOWN |PLURAL |THIRD |PROPER_UNKNOWN |RECIPROCITY_UNKNOWN |TENSE_UNKNOWN |VOICE_UNKNOWN | 30|POSS |
|new | 169|new |ADJ |ASPECT_UNKNOWN |CASE_UNKNOWN |FORM_UNKNOWN |GENDER_UNKNOWN |MOOD_UNKNOWN |NUMBER_UNKNOWN |PERSON_UNKNOWN |PROPER_UNKNOWN |RECIPROCITY_UNKNOWN |TENSE_UNKNOWN |VOICE_UNKNOWN | 30|AMOD |
|Android | 173|Android |NOUN |ASPECT_UNKNOWN |CASE_UNKNOWN |FORM_UNKNOWN |GENDER_UNKNOWN |MOOD_UNKNOWN |SINGULAR |PERSON_UNKNOWN |PROPER |RECIPROCITY_UNKNOWN |TENSE_UNKNOWN |VOICE_UNKNOWN | 30|NN |
|phones | 181|phone |NOUN |ASPECT_UNKNOWN |CASE_UNKNOWN |FORM_UNKNOWN |GENDER_UNKNOWN |MOOD_UNKNOWN |PLURAL |PERSON_UNKNOWN |PROPER_UNKNOWN |RECIPROCITY_UNKNOWN |TENSE_UNKNOWN |VOICE_UNKNOWN | 26|DOBJ |
|. | 187|. |PUNCT |ASPECT_UNKNOWN |CASE_UNKNOWN |FORM_UNKNOWN |GENDER_UNKNOWN |MOOD_UNKNOWN |NUMBER_UNKNOWN |PERSON_UNKNOWN |PROPER_UNKNOWN |RECIPROCITY_UNKNOWN |TENSE_UNKNOWN |VOICE_UNKNOWN | 20|P |
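The dependencyEdge_headTokenIndex column points at each token's syntactic head. In the output above the index counts from zero (the ROOT token "unveiled" is the eighth token and has index 7), while R vectors count from one. The sketch below rebuilds the first eight tokens by hand, with a reduced set of columns, and resolves each head index to the head token's text. The `head` column name is chosen here for illustration.

```r
# First eight tokens from the output above (values copied by hand).
tokens <- data.frame(
  content = c("Google", ",", "headquartered", "in",
              "Mountain", "View", ",", "unveiled"),
  tag = c("NOUN", "PUNCT", "VERB", "ADP", "NOUN", "NOUN", "PUNCT", "VERB"),
  dependencyEdge_headTokenIndex = c(7, 0, 0, 2, 5, 3, 0, 7),
  dependencyEdge_label = c("NSUBJ", "P", "VMOD", "PREP",
                           "NN", "POBJ", "P", "ROOT"),
  stringsAsFactors = FALSE
)

# The head index is zero-based; R indexing is one-based, so add 1.
tokens$head <- tokens$content[tokens$dependencyEdge_headTokenIndex + 1]

tokens[, c("content", "dependencyEdge_label", "head")]
```

Read this way, "Google" is the subject (NSUBJ) of "unveiled", and the root token points at itself.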

Entities

"Entity Analysis provides information about entities in the text, which generally refer to named 'things' such as famous individuals, landmarks, common objects, etc... A good general practice to follow is that if something is a noun, it qualifies as an 'entity.'" [API Documentation]

analyzed$entities
|name |entity_type |mid |wikipedia_url | salience|content | beginOffset|mentions_type |
|:----|:-----------|:---|:-------------|--------:|:-------|-----------:|:-------------|
|Google |ORGANIZATION |/m/045c7b |https://en.wikipedia.org/wiki/Google | 0.2557206|Google | 0|PROPER |
|users |PERSON |NA |NA | 0.1527633|users | 152|COMMON |
|phone |CONSUMER_GOOD |NA |NA | 0.1311989|phone | 65|COMMON |
|Android |CONSUMER_GOOD |/m/02wxtgw |https://en.wikipedia.org/wiki/Android_(operating_system) | 0.1224526|Android | 57|PROPER |
|Android |CONSUMER_GOOD |/m/02wxtgw |https://en.wikipedia.org/wiki/Android_(operating_system) | 0.1224526|Android | 173|PROPER |
|Sundar Pichai |PERSON |/m/09gds74 |https://en.wikipedia.org/wiki/Sundar_Pichai | 0.1141411|Sundar Pichai | 113|PROPER |
|Mountain View |LOCATION |/m/0r6c4 |https://en.wikipedia.org/wiki/Mountain_View,_California | 0.1019596|Mountain View | 25|PROPER |
|Consumer Electronic Show |EVENT |/m/01p15w |https://en.wikipedia.org/wiki/Consumer_Electronics_Show | 0.0703438|Consumer Electronic Show | 78|PROPER |
|phones |CONSUMER_GOOD |NA |NA | 0.0338317|phones | 181|COMMON |
|keynote |OTHER |NA |NA | 0.0175884|keynote | 139|COMMON |
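The entities output has one row per mention, so an entity that appears more than once in the text (here, Android) is repeated. As a rough sketch, the block below rebuilds a few columns of the output above by hand, keeps the proper-noun mentions, and collapses them to one row per entity. The names `proper` and `proper_unique` are chosen here for illustration.

```r
# Selected columns from the entities output above (values copied by hand).
entities <- data.frame(
  name = c("Google", "users", "phone", "Android", "Android", "Sundar Pichai",
           "Mountain View", "Consumer Electronic Show", "phones", "keynote"),
  entity_type = c("ORGANIZATION", "PERSON", "CONSUMER_GOOD", "CONSUMER_GOOD",
                  "CONSUMER_GOOD", "PERSON", "LOCATION", "EVENT",
                  "CONSUMER_GOOD", "OTHER"),
  salience = c(0.2557206, 0.1527633, 0.1311989, 0.1224526, 0.1224526,
               0.1141411, 0.1019596, 0.0703438, 0.0338317, 0.0175884),
  mentions_type = c("PROPER", "COMMON", "COMMON", "PROPER", "PROPER",
                    "PROPER", "PROPER", "PROPER", "COMMON", "COMMON"),
  stringsAsFactors = FALSE
)

# Keep proper-noun mentions, then drop repeated mentions of the same entity.
proper <- entities[entities$mentions_type == "PROPER", ]
proper_unique <- proper[!duplicated(proper$name),
                        c("name", "entity_type", "salience")]
proper_unique
```

The rows stay in the order of the original output, which in this example is sorted by decreasing salience.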

Document sentiment

"Sentiment analysis attempts to determine the overall attitude (positive or negative) expressed within the text. Sentiment is represented by numerical score and magnitude values." [API Documentation]

A note on how to interpret these sentiment values is posted here.

analyzed$documentSentiment

| magnitude| score|
|----------:|------:|
|       0.6|   0.3|
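In the API, score ranges from -1.0 (negative) to 1.0 (positive), and magnitude is a non-negative measure of how much emotional content the text contains overall. One simple way to turn the score into a label is sketched below. `label_sentiment` is a hypothetical helper, and the 0.25 cutoff is an illustrative choice rather than anything the API or the package prescribes.

```r
# Hypothetical helper: the 0.25 cutoff is an illustrative assumption,
# not a rule defined by the API or by googlenlp.
label_sentiment <- function(score, cutoff = 0.25) {
  if (score >= cutoff) {
    "positive"
  } else if (score <= -cutoff) {
    "negative"
  } else {
    "neutral or mixed"
  }
}

# Values copied by hand from the documentSentiment output above.
document_sentiment <- data.frame(magnitude = 0.6, score = 0.3)
label_sentiment(document_sentiment$score)
```

A score near zero is ambiguous on its own: a low magnitude suggests neutral text, while a high magnitude suggests positive and negative passages that cancel out, so sensible cutoffs depend on the use case.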

Language

language indicates the detected language of the document. Only English ("en"), Spanish ("es") and Japanese ("ja") are currently supported by the API.

analyzed$language
#> [1] "en"



googlenlp documentation built on May 2, 2019, 9:18 a.m.