search_and_store_tweets: Retrieve and store all Tweets for a given full archive search...


View source: R/search-tweets.R

Description

search_and_store_tweets executes long-running Tweet searches against the /2/tweets/search/all endpoint in the Twitter Academic Research product track and stores the results as batches of files containing the original JSON responses.

Usage

search_and_store_tweets(queryString, fromDate = NULL, toDate = NULL,
  batchBaseLabel = NULL, maxBatchSize = 10000,
  twitterBearerToken = NULL, verbose = TRUE)
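
A call might look like the following sketch. The environment variable name TWITTER_BEARER_TOKEN and the batch label are illustrative assumptions, not part of the package; the search is only started if the package is installed and a token is set.

```r
# Illustrative arguments; the batch label and the environment variable name
# are assumptions made for this example.
search_args <- list(
  queryString = "from:stefandaume",
  fromDate = "2021-01-01",
  toDate = "2021-01-31",
  batchBaseLabel = "stefandaume_2021_01",
  maxBatchSize = 10000,
  twitterBearerToken = Sys.getenv("TWITTER_BEARER_TOKEN"),
  verbose = TRUE
)

# Only run the search when the package and a bearer token are available.
can_run <- requireNamespace("twittrcademic", quietly = TRUE) &&
  nzchar(search_args$twitterBearerToken)

if (can_run) {
  do.call(twittrcademic::search_and_store_tweets, search_args)
}
```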

Arguments

queryString

a character string specifying a value for the Tweet search query parameter (e.g. "sustainability (climate change)", "from:stefandaume", etc.). See the Twitter API documentation on building search queries for details. Maximum of 1024 characters.

fromDate

a character string of format ("YYYY-MM-DD") specifying the date for the oldest Tweets to be included in the search results (interpreted as inclusive); corresponds to the start_time parameter of the /2/tweets/search/all endpoint. Must NOT be a date before "2006-03-21". If no value is supplied (default), it is interpreted as the 30th day before toDate.

toDate

a character string of format ("YYYY-MM-DD") specifying the date for the most recent Tweets to be included in the search results (interpreted as inclusive); corresponds to the end_time parameter of the /2/tweets/search/all endpoint. If no value is supplied (default), it is interpreted as the current date.

batchBaseLabel

a character string used to name the stored Tweet search batch files.

maxBatchSize

an integer specifying the approximate number of Tweets in each stored batch of Tweet search responses. This number serves as a threshold: once more Tweets than maxBatchSize have accumulated, they are stored as a batch.

twitterBearerToken

a character string specifying a valid bearer token for the Twitter Academic Research product track (see oauth_twitter_token()).

verbose

a boolean indicating whether more detailed intermediate progress messages should be printed to the console; if FALSE, only a progress bar based on the specified or implied date range and the date of the latest retrieved Tweet will be printed.
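
The date defaults described for fromDate and toDate can be sketched as follows. The helper resolve_date_range() is hypothetical and not part of the package; the check that fromDate does not fall after toDate is an added assumption.

```r
# Hypothetical helper illustrating the documented defaults; not package code.
resolve_date_range <- function(fromDate = NULL, toDate = NULL,
                               today = Sys.Date()) {
  # toDate defaults to the current date.
  to <- if (is.null(toDate)) today else as.Date(toDate, format = "%Y-%m-%d")
  # fromDate defaults to the 30th day before toDate.
  from <- if (is.null(fromDate)) to - 30 else as.Date(fromDate, format = "%Y-%m-%d")

  if (is.na(from) || is.na(to)) {
    stop("Dates must be given in the format YYYY-MM-DD.")
  }
  if (from < as.Date("2006-03-21")) {
    stop("fromDate must not be before 2006-03-21.")
  }
  # Assumption: an inverted range is treated as an error.
  if (from > to) {
    stop("fromDate must not be after toDate.")
  }
  list(fromDate = from, toDate = to)
}

range_default <- resolve_date_range(toDate = "2021-12-22")
format(range_default$fromDate)  # "2021-11-22"
```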

Details

Based on a given Tweet search query and (optional) date range, this function iteratively retrieves all matching Tweets and stores the results in batches as appropriately labelled JSON files; the complete set of results is split into batches that each contain approximately the number of Tweets specified by maxBatchSize.
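
The threshold behaviour of maxBatchSize can be illustrated with a small sketch. The function batch_pages() is hypothetical and only groups response sizes; it does not reproduce how the package writes files, and flushing the remaining responses as a final smaller batch is an assumption.

```r
# Hypothetical sketch: group API responses (given by their Tweet counts)
# into batches that are closed once they exceed maxBatchSize.
batch_pages <- function(page_sizes, maxBatchSize = 10000) {
  batches <- list()
  current <- integer(0)
  for (n in page_sizes) {
    current <- c(current, n)
    # The threshold is strict: a batch is closed once it holds MORE Tweets
    # than maxBatchSize, so batches are slightly larger than the threshold.
    if (sum(current) > maxBatchSize) {
      batches[[length(batches) + 1]] <- current
      current <- integer(0)
    }
  }
  # Assumption: leftover responses form a final, smaller batch.
  if (length(current) > 0) {
    batches[[length(batches) + 1]] <- current
  }
  batches
}

# 45 responses with 500 Tweets each (22,500 Tweets in total)
batches <- batch_pages(rep(500L, 45), maxBatchSize = 10000)
sapply(batches, sum)  # 10500 10500 1500
```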

The calls to the search API endpoint are timed such that the API rate limit of at most one call per second and 300 calls per 15-minute window is observed; this corresponds to a maximum of 150,000 Tweets that can be retrieved every 15 minutes (each individual API call returns at most 500 Tweets).
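
The arithmetic behind these limits, as a short sketch. Spacing calls evenly across the window is one possible pacing strategy and an assumption here; the package may time its calls differently. estimate_minutes() is a hypothetical helper that assumes every call returns a full page of 500 Tweets.

```r
calls_per_window <- 300      # API limit per 15-minute window
tweets_per_call <- 500       # maximum Tweets returned by one call
window_seconds <- 15 * 60

# Upper bound on Tweets retrievable in one window.
max_tweets_per_window <- calls_per_window * tweets_per_call  # 150000

# Even spacing that respects both the per-second and the per-window limit.
min_interval_seconds <- max(1, window_seconds / calls_per_window)  # 3

# Hypothetical helper: rough duration in minutes under even spacing,
# assuming every call returns a full page.
estimate_minutes <- function(total_tweets) {
  calls_needed <- ceiling(total_tweets / tweets_per_call)
  calls_needed * min_interval_seconds / 60
}

estimate_minutes(1e6)  # 100
```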

When running a query, a progress bar in the console indicates how quickly data collection is advancing; progress is shown in relation to the (explicitly or implicitly) specified search time range and the dates of the Tweets in the retrieved Tweet batches.
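
One way to derive such a progress value is sketched below. It assumes that results arrive newest first, so that progress runs from toDate back towards fromDate; search_progress() is a hypothetical helper, not the package's implementation.

```r
# Hypothetical sketch: fraction of the search date range already covered,
# assuming Tweets are returned in reverse chronological order.
search_progress <- function(tweet_date, fromDate, toDate) {
  from <- as.Date(fromDate)
  to <- as.Date(toDate)
  reached <- as.Date(tweet_date)

  total_days <- as.numeric(to - from)
  if (total_days <= 0) {
    return(1)  # single-day range: nothing left to traverse
  }
  covered_days <- as.numeric(to - reached)
  # Clamp to [0, 1] in case a Tweet date falls outside the range.
  min(1, max(0, covered_days / total_days))
}

progress <- search_progress("2021-01-16", "2021-01-01", "2021-01-31")
progress  # 0.5

# Display the value with base R's text progress bar.
pb <- utils::txtProgressBar(min = 0, max = 1, style = 3)
utils::setTxtProgressBar(pb, progress)
close(pb)
```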


sdaume/twittrcademic documentation built on Dec. 22, 2021, 11:11 p.m.