Get all the tweets from the Twitter Standard Search API JSON files and the geolocated tweets JSON files obtained by calling geotag_tweets, and store the results in the series folder as daily Rds files.
List of series to aggregate, default: list("country_counts", "geolocated", "topwords")
Current tasks for reporting purposes, default: get_tasks()
This function writes new aggregated series by launching a Spark task that aggregates the data collected from the Twitter Search API and geolocated by geotag_tweets. It performs the following steps:
- Identify the last aggregated date by looking into the series folder
- Find the date range of tweets collected since that day by looking at the stat JSON files produced by the search loop
- For each day that has to be updated, produce a list of all geolocated and search files to load by looking at the stat files
- For each series passed as a parameter and for each date to update:
  - call a Spark task that deduplicates tweets for each topic, joins them with geolocation information, aggregates them to the required level and returns the result to the standard output as JSON lines
  - parse the result of this task using jsonlite and save it into Rds files in the series folder
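The first two steps (finding the last aggregated date, then the days that still need work) can be sketched in base R. This is only an illustration, not the package's implementation: the helper name `dates_to_update`, the file naming scheme `<series>_YYYY.MM.DD.Rds` and the choice to recompute the last aggregated day are all assumptions made for the example.

```r
# Hypothetical helper: which collected days still need aggregating for a series.
# Assumes daily files named "<series>_YYYY.MM.DD.Rds" (illustrative layout only).
dates_to_update <- function(series_dir, series, collected_dates) {
  pattern <- paste0("^", series, "_([0-9]{4}[.][0-9]{2}[.][0-9]{2})[.]Rds$")
  files <- list.files(series_dir, pattern = pattern)
  collected_dates <- sort(unique(as.Date(collected_dates)))
  if (length(files) == 0) {
    # Nothing aggregated yet: every collected day has to be processed.
    return(collected_dates)
  }
  done <- as.Date(sub(pattern, "\\1", files), format = "%Y.%m.%d")
  last_done <- max(done)
  # Assumption: the last aggregated day is redone, since it may have been
  # aggregated before all of its tweets were collected.
  collected_dates[collected_dates >= last_done]
}

# Demo with a temporary folder standing in for the series folder.
series_dir <- file.path(tempdir(), "series_demo")
dir.create(series_dir, showWarnings = FALSE)
daily <- data.frame(country = "FR", tweets = 3L)
saveRDS(daily, file.path(series_dir, "country_counts_2021.03.01.Rds"))
saveRDS(daily, file.path(series_dir, "country_counts_2021.03.02.Rds"))

# Days for which the (simulated) search loop reported collected tweets.
collected <- as.Date("2021-03-01") + 0:4
pending <- dates_to_update(series_dir, "country_counts", collected)
print(pending)
```

In this sketch a series with no files yet gets every collected day, so a first run and an incremental run go through the same code path.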
A prerequisite to this function is that search_loop must have already collected tweets in the search folder and that geotag_tweets has already run.
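A script that drives the aggregation could check this prerequisite before starting. The sketch below is a hedged illustration in base R: `has_collected_files`, the folder names and the file extensions are hypothetical and do not reflect the package's actual data layout.

```r
# Hypothetical check: does a folder exist and contain at least one JSON file?
has_collected_files <- function(dir, pattern = "[.]json([.]gz)?$") {
  dir.exists(dir) &&
    length(list.files(dir, pattern = pattern, recursive = TRUE)) > 0
}

# Demo folders standing in for the search and geolocation outputs.
search_dir <- file.path(tempdir(), "search_demo")
geo_dir <- file.path(tempdir(), "geo_demo")
dir.create(search_dir, showWarnings = FALSE)
dir.create(geo_dir, showWarnings = FALSE)
writeLines('{"id": 1, "text": "example"}', file.path(search_dir, "tweets.json"))

ready <- has_collected_files(search_dir) && has_collected_files(geo_dir)
if (!ready) {
  message("Aggregation skipped: collect and geotag tweets first.")
}
```

Here the geolocation folder is left empty on purpose, so the check fails and the message is printed instead of an aggregation being attempted.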
Normally this function is not called directly by the user but from the
The list of tasks updated with aggregate messages.