extract.tweets
opens a connection to the Mongo database on
the lab computer and returns tweets that match a series of conditions:
whether they contain a certain keyword, whether or not they are retweets,
and whether or not they contain a hashtag. The fields of
each tweet to be extracted can also be specified. If desired, it can return a fixed number of
tweets drawn as a random sample of all tweets in the database.
set |
string, name of the collection of tweets in the Mongo database to query. |
string |
string or vector of strings, set to NULL by default (will return all tweets). If it is a single string, only tweets that contain that string are returned. If it is a vector of strings, all tweets that contain at least one of them are returned. |
size |
numeric, set to 0 by default (will return all tweets that match the other conditions). If it is strictly between 0 and 1, that proportion of the matching tweets is returned (e.g. 0.5 means that 50% of all tweets that match the other conditions are returned). If it is 1 or greater, a random sample of that many tweets is drawn from those that match the specified conditions. |
fields |
vector of strings, indicates fields from tweets that will be returned. Default is the date and time of the tweet, its text, and the screen name of the user that published it. See details for full list of possible fields. |
retweets |
logical, set to NULL by default (will return all tweets).
If |
hashtags |
logical, set to NULL by default (will return all tweets).
If |
from |
date, in string format. If different from |
to |
date, in string format. If different from |
user_id |
vector of numeric IDs for users. If different from |
screen_name |
screen name of a user. If different from |
verbose |
logical, default is |
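The three cases of the size argument can be sketched as a small helper. This is only an illustration: resolve_sample_size is a hypothetical name that is not part of the package, and the rounding of proportions and the cap at the number of matching tweets are assumptions, since the documentation does not say how the function handles them.

```r
## Hypothetical helper (not part of the package): maps the 'size' argument
## to the number of tweets that would be drawn from 'n_matching' candidates.
resolve_sample_size <- function(size, n_matching) {
  stopifnot(is.numeric(size), length(size) == 1, size >= 0)
  if (size == 0) {
    ## default: return every tweet that matches the other conditions
    return(n_matching)
  }
  if (size < 1) {
    ## proportion of matching tweets
    ## (assumption: rounded to the nearest whole tweet)
    return(round(size * n_matching))
  }
  ## fixed-size random sample
  ## (assumption: capped at the number of matching tweets)
  min(size, n_matching)
}

resolve_sample_size(0, 2000)    ## all 2000 matching tweets
resolve_sample_size(0.5, 2000)  ## 1000 tweets, i.e. 50%
resolve_sample_size(100, 2000)  ## a sample of 100 tweets
```

Under these assumptions, size=0.10 and size=100 in the examples further down correspond to a 10% sample and a 100-tweet sample respectively.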
The following is a non-exhaustive list of relevant fields that can be specified in the
fields
argument (for a complete list, check the documentation at:
https://dev.twitter.com/docs/platform-objects)
Tweet: text, created_at, id_str, favorite_count, source, retweeted,
retweet_count, lang, in_reply_to_status_id, in_reply_to_screen_name
Entities: entities.hashtags, entities.user_mentions, entities.urls
Retweeted_status: retweeted_status.text, retweeted_status.created_at... (and all
other tweet, user, and entities fields)
User: user.screen_name, user.id_str, user.geo_enabled, user.location,
user.followers_count, user.statuses_count, user.friends_count,
user.description, user.lang, user.name, user.url, user.created_at, user.time_zone
Geo: geo.coordinates
Pablo Barbera <pablo.barbera@nyu.edu>
## Not run:
## connect to the Mongo database
mongo <- mongo.create("SMAPP_HOST:PORT", db="DATABASE")
mongo.authenticate(mongo, username="USERNAME", password="PASSWORD", db="DATABASE")
set <- "DATABASE.COLLECTION"
## extract text from all tweets in the database
tweets <- extract.tweets(set, fields="text")
## extract random sample of 10% of tweets, with text and screen name
tweets <- extract.tweets(set, fields=c("user.screen_name", "text"), size=0.10)
## extract random sample of 100 tweets that are not retweets
tweets <- extract.tweets(set, size=100, retweets=FALSE)
## extract all tweets that mention turkey
tweets <- extract.tweets(set, string="turkey")
## extract all tweets that mention 'occupygezi' and do a quick plot
tweets <- extract.tweets(set, string="occupygezi", fields="created_at")
plot(tweets)
## End(Not run)