parse_raw_tweets_to_cascades | R Documentation |
Extracts retweet cascades from one or more jsonl files in which each line is a tweet JSON object. For the structure of a tweet object, refer to the Twitter developer documentation: https://developer.twitter.com/en/docs/tweets/data-dictionary/overview/tweet-object
parse_raw_tweets_to_cascades( paths, batch = 1e+05, cores = 1, output_path = NULL, keep_user = F, keep_absolute_time = F, keep_text = F, keep_retweet_count = F, progress = T, return_as_list = T, save_temp = F, keep_temp_files = T, api_version = 1 )
paths |
Full file paths to the tweets jsonl files |
batch |
Number of tweets to read and process in each iteration; choose a value that fits your available memory. Defaults to at most 100000 (1e+05) tweets per iteration. |
cores |
Number of cores to be used for processing each batch in parallel. |
output_path |
If provided, the index.csv and data.csv files that define the cascades will be generated at this path. In index.csv, each row is a cascade whose events can be obtained from data.csv via the corresponding indices (start_ind to end_ind). Defaults to NULL. |
keep_user |
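If TRUE, Twitter user ids are kept. Defaults to FALSE. |
keep_absolute_time |
If TRUE, the absolute tweeting times are kept. Defaults to FALSE. |
keep_text |
If TRUE, the tweet text is kept. Defaults to FALSE. |
keep_retweet_count |
If TRUE, the retweet_count field is kept. Defaults to FALSE. |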
progress |
If TRUE (default), progress is reported. |
return_as_list |
If TRUE (default), a list of cascades (data.frames) is returned. |
save_temp |
If TRUE, temporary files are generated during processing so that processing can be resumed after a failure. Defaults to FALSE. |
keep_temp_files |
If TRUE (default), temporary files are kept after the index and data files have been generated. |
api_version |
Version of the Twitter API used to collect the tweets. Defaults to 1. |
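The output_path argument describes an index.csv whose rows point into data.csv through start_ind and end_ind. The following is a minimal base R sketch of how such a pair of files could be turned back into one data.frame per cascade. It builds toy files rather than using real output; the `time` and `magnitude` columns and the 1-based, inclusive reading of the indices are assumptions made for illustration, not something this page confirms.

```r
# Toy stand-ins for data.csv and index.csv. Only start_ind and end_ind are
# named in the documentation; `time` and `magnitude` are assumed columns.
data <- data.frame(
  time = c(0, 12, 40, 0, 5),
  magnitude = c(100, 20, 3, 50, 7)
)
index <- data.frame(start_ind = c(1, 4), end_ind = c(3, 5))

# Write both files to a temporary directory, mimicking an output_path
out_dir <- tempfile("cascades_")
dir.create(out_dir)
write.csv(data, file.path(out_dir, "data.csv"), row.names = FALSE)
write.csv(index, file.path(out_dir, "index.csv"), row.names = FALSE)

data_read <- read.csv(file.path(out_dir, "data.csv"))
index_read <- read.csv(file.path(out_dir, "index.csv"))

# Rebuild one data.frame per cascade by slicing rows start_ind..end_ind
# (assumed here to be 1-based and inclusive)
cascades <- lapply(seq_len(nrow(index_read)), function(i) {
  rows <- index_read$start_ind[i]:index_read$end_ind[i]
  data_read[rows, , drop = FALSE]
})

# Number of events in each cascade
sizes <- vapply(cascades, nrow, integer(1))
```

If real output uses a different indexing convention, only the `rows` expression would need to change.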
If return_as_list is TRUE, a list of data.frames in which each data.frame is a retweet cascade. Otherwise, nothing is returned.
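A hedged sketch of a typical call follows. It assumes the function is exported by a package named evently and that a file called tweets.jsonl exists in the working directory; both are assumptions, and the call is skipped when either is missing. The argument names are taken from the usage line above.

```r
# Hypothetical input path; replace with a real jsonl file of tweets.
tweets_path <- "tweets.jsonl"

# Assumption: the function lives in a package named `evently`.
if (requireNamespace("evently", quietly = TRUE) && file.exists(tweets_path)) {
  cascades <- evently::parse_raw_tweets_to_cascades(
    paths = tweets_path,
    batch = 1e4,           # smaller batches reduce peak memory use
    cores = 1,
    keep_user = TRUE,      # keep Twitter user ids in each cascade
    return_as_list = TRUE  # return a list of data.frames
  )
  # Number of events in each returned cascade
  cascade_sizes <- vapply(cascades, nrow, integer(1))
  print(summary(cascade_sizes))
} else {
  message("Package or input file not available; skipping the example.")
}
```

Lowering batch trades speed for memory, which follows the advice given for that argument.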