extract.retweets: Connect to Mongo database and extract retweets that match...


Description

extract.retweets opens a connection to the Mongo database on the lab computer and returns either all retweets or only those retweets that mention a specific keyword. In combination with summary.retweets, this is a quick way to display the most retweeted tweets over a certain period of time.

Usage

extract.retweets(set, string = NULL, min = 10, from = NULL, to = NULL,
  verbose = TRUE)

Arguments

set

string, name of the collection of tweets in the Mongo database to query.

string

string or vector of strings, set to NULL by default (will return all retweets). If it is a string, it will return retweets that contain that string. If it is a vector of strings, it will return all retweets that contain at least one of them.

min

numeric, set to 10 by default (will return all retweets whose retweet count is at least 10). In large datasets, choose a high number to speed up the query.

from

date, in string format. If different from NULL, will consider only tweets after that date. Note that using this field requires that the tweets have a field in ISODate format called timestamp. All times are GMT.

to

date, in string format. If different from NULL, will consider only tweets before that date. Note that using this field requires that the tweets have a field in ISODate format called timestamp. All times are GMT.

verbose

logical, default is TRUE, which generates some output to the R console with information about the count of tweets.
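
The way the arguments combine can be illustrated on a small local data frame, without a Mongo connection. This is only a sketch of the documented semantics, not the package's implementation: the helper filter_retweets_local, the toy data, and the column names text and retweet_count are hypothetical, and it assumes case-sensitive substring matching, inclusive date bounds, and dates written as "YYYY-MM-DD HH:MM:SS".

```r
# Toy data standing in for a Mongo collection; all values are made up.
tweets <- data.frame(
  text = c("RT @a: occupygezi now", "RT @b: hello world", "RT @c: news from turkey"),
  retweet_count = c(150, 5, 40),
  timestamp = as.POSIXct(c("2013-06-01 10:00:00", "2013-06-02 12:00:00",
                           "2013-06-05 09:30:00"), tz = "GMT"),
  stringsAsFactors = FALSE
)

# Hypothetical helper mirroring the documented meaning of each argument.
filter_retweets_local <- function(df, string = NULL, min = 10,
                                  from = NULL, to = NULL) {
  # min: keep rows whose retweet count is at least 'min'
  keep <- df$retweet_count >= min
  if (!is.null(string)) {
    # string: keep rows whose text contains at least one of the strings
    hits <- Reduce(`|`, lapply(string, function(s) grepl(s, df$text, fixed = TRUE)))
    keep <- keep & hits
  }
  # from / to: lower and upper bounds on the timestamp, interpreted as GMT
  if (!is.null(from)) keep <- keep & df$timestamp >= as.POSIXct(from, tz = "GMT")
  if (!is.null(to)) keep <- keep & df$timestamp <= as.POSIXct(to, tz = "GMT")
  df[keep, ]
}

# Either keyword, at least 10 retweets, no date restriction
both <- filter_retweets_local(tweets, string = c("occupygezi", "turkey"), min = 10)

# Same keywords, restricted to the first three days of June 2013
early <- filter_retweets_local(tweets, string = c("occupygezi", "turkey"), min = 10,
                               from = "2013-06-01 00:00:00", to = "2013-06-03 23:59:59")
```

Raising min shrinks the candidate set before any string or date condition is applied, which is the intuition behind the advice to choose a high value for large collections.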

Details

Note that this function only returns retweets made with Twitter's built-in retweeting system; that is, 'manual' retweets made by copying and pasting are not included. Also note that total retweet counts are based on Twitter's internal tally and do not reflect the number of retweets in the database. In other words, the most popular retweet at a given moment may be a tweet that was originally sent days earlier but was retweeted during the period in which tweets were captured.
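
The difference between Twitter's tally and the number of copies in the database can be seen in a toy example. The data frame and its column names (original_id, retweet_count) are hypothetical and chosen only for illustration; they are not the schema used by the package.

```r
# Toy captured retweets; each row is one retweet stored in the database.
captured <- data.frame(
  original_id = c("t1", "t1", "t2"),
  retweet_count = c(5000, 5002, 12),  # Twitter's running tally at capture time
  stringsAsFactors = FALSE
)

# Number of copies actually present in the database, per original tweet
in_db <- table(captured$original_id)

# Largest tally Twitter reported for each original tweet
twitter_tally <- tapply(captured$retweet_count, captured$original_id, max)

comparison <- data.frame(
  original_id = names(in_db),
  copies_in_db = as.integer(in_db),
  twitter_tally = as.numeric(twitter_tally[names(in_db)]),
  stringsAsFactors = FALSE
)
comparison
```

Here the tweet t1 appears only twice in the captured data, yet Twitter's tally puts it in the thousands, so a ranking based on the tally can be dominated by tweets that were first posted before the capture window.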

Author(s)

Pablo Barbera pablo.barbera@nyu.edu

Examples

## Not run: 
## connect to the Mongo database
 mongo <- mongo.create("SMAPP_HOST:PORT", db="DATABASE")
 mongo.authenticate(mongo, username="USERNAME", password="PASSWORD", db="DATABASE")
 set <- "DATABASE.COLLECTION"

## extract all retweets that were retweeted at least 2000 times
 rts <- extract.retweets(set, min=2000)

## show top 10 retweets from previous query
 summary(rts, n=10)

## extract all retweets that mentioned "occupygezi" and were retweeted at least 100 times
 rts <- extract.retweets(set, string="occupygezi", min=100)

## show top 10 retweets from previous query
 summary(rts, n=10)

## End(Not run)

SMAPPNYU/smappR documentation built on May 9, 2019, 11:19 a.m.