Annotate Twitter images with imgrec
In imgrec: An Interface for Image Recognition

LOCAL <- identical(Sys.getenv("LOCAL"), "true")
knitr::opts_chunk$set(purl = LOCAL)

This short vignette demonstrates how to download and annotate images inside Tweets. Besides imgrec, we will use rtweet to get Twitter data and the dplyr package for data wrangling.

Setup

Before we start, access credentials are required for Twitter and Google Cloud Vision. For this example, all credentials are stored as R environment variables. Check out the rtweet authentication vignette to obtain and use Twitter API access tokens. The authentication for Google Cloud Vision is described in the imgrec intro vignette.

# load libraries
library(imgrec)
library(rtweet)
library(dplyr)

# prepare twitter credentials
app_name <- Sys.getenv('twitter_app_name')
consumer_key <-Sys.getenv('twitter_consumer_key')
consumer_secret <- Sys.getenv('twitter_consumer_secret')
access_token <- Sys.getenv('twitter_access_token')
access_token_secret <- Sys.getenv('twitter_access_secret')

# obtain twitter access token
token <- create_token(app = app_name, 
             consumer_key = consumer_key, 
             consumer_secret = consumer_secret, 
             access_token = access_token,
             access_secret = access_token_secret,
             set_renv = TRUE)

# setup authentification for google vision
gvision_init()

Download tweets

If you know the status id's of Tweets that you would like to obtain, you can use lookup_tweets(), which takes a vector of status id's as input and retrieves all corresponding tweets. URL's of images (and videos) are stored in the list column media_url.

We use one of the most-retweeted tweets posted by Barack Obama as an example:

"No one is born hating another person because of the color of his skin or his background or his religion..." (Barack Obama, Twitter Status)

example <- lookup_tweets(896523232098078720)
example$media_url

{width=50% }

Annotate Tweets

Now, we retrieve and parse annotations for the Tweet image:

results <- get_annotations(images = example$media_url[[1]], 
                           max_res = 10, # max. number of labels,
                           mode = "url", # we pass an image url
                           features = 'all') %>% 
           parse_annotations()

names(results) # features obtained by Google Cloud Vision