bind_tweets: Bind information stored as JSON files
In cjbarrie/academictwitteR: Access the Twitter Academic Research Product Track V2 API Endpoint

bind_tweets

R Documentation

Bind information stored as JSON files

Description

This function binds information stored as JSON files. The experimental function convert_json converts individual JSON files into either "raw" or "tidy" format.

Usage

bind_tweets(
  data_path,
  user = FALSE,
  verbose = TRUE,
  output_format = NA,
  vars = c("text", "user", "tweet_metrics", "user_metrics", "hashtags", "ext_urls",
    "mentions", "annotations", "context_annotations"),
  quoted_variables = FALSE
)

convert_json(
  data_file,
  output_format = "tidy",
  vars = c("text", "user", "tweet_metrics", "user_metrics", "hashtags", "ext_urls",
    "mentions", "annotations", "context_annotations"),
  quoted_variables = FALSE
)

Arguments

`data_path`	string, file path to directory of stored tweets data saved as data_id.json and users_id.json
`user`	If `FALSE`, this function binds JSON files into a data frame containing tweets; otherwise, into a data frame containing user information. Ignored if `output_format` is not NA
`verbose`	If `FALSE`, messages are suppressed
`output_format`	string, if it is not NA, this function returns an unprocessed data.frame containing either tweets or user information. Currently, this function supports the following formats:
	"raw"	List of data frames. Note: not all data frames are in Boyce-Codd 3rd Normal Form
	"tidy"	Tidy format; all essential columns are available
	"tidy2"	Tidy format; additional variables (see vars) are available. Untruncates retweet text and adds indicators for retweets, quotes and replies. Automatically drops duplicated tweets. Handling of quoted tweets can be specified (see quoted_variables)
`vars`	vector of strings, determining the variables provided by the tidy2 format. Can be any (or all) of the following:
	"text"	Text of the tweet, including language classification, indicator of sensitive content and (if applicable) source tweet text
	"user"	Information on the user in addition to their ID
	"tweet_metrics"	Tweet metrics, specifically the like, retweet and quote counts
	"user_metrics"	User metrics, specifically their tweet, list, follower and following counts
	"hashtags"	Hashtags contained in the tweet. Untruncated for retweets
	"ext_urls"	Shortened and expanded URLs contained in the tweet, excluding those internal to Twitter (e.g. retweet URLs). Includes additional data provided by Twitter, such as the unwound URL and its title and description (if available). Untruncated for retweets
	"mentions"	Mentioned usernames and their IDs, excluding retweeted users. Untruncated for retweets. Note that quoted users are only mentioned here if explicitly named in the tweet text. This was usually the case with older versions of Twitter, but is no longer the standard behaviour. Extracting mentions allows the usernames of the RT authors (rather than only their ID) to be preserved
	"annotations"	Annotations provided by Twitter, including their probability and type; essentially named entities. See https://developer.twitter.com/en/docs/twitter-api/annotations/overview for details
	"context_annotations"	Context annotations provided by Twitter, including additional data on their domains. See https://developer.twitter.com/en/docs/twitter-api/annotations/overview for details
`quoted_variables`	Should additional vars be returned for the quoted tweet? Defaults to FALSE. TRUE returns additional "_quoted" var-columns containing the vars (mentions, hashtags, etc.) of the quoted tweet in addition to the actual tweet's data
`data_file`	string, a single file path to a JSON file; or a vector of file paths to JSON files of stored tweets data saved as data_id.json
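
The usage examples on this page only call bind_tweets, so here is a minimal sketch of how convert_json might be called on individual files. It assumes the tweets were stored in a directory named "data" (a placeholder) following the data_id.json naming described above; the object names are illustrative only.

```r
# Sketch only: assumes tweets were stored under "data/" as data_*.json files.
library(academictwitteR)

# Collect the tweet files; users_*.json files are deliberately left out,
# since data_file is documented as pointing at data_id.json files.
data_files <- list.files("data", pattern = "^data_.*\\.json$", full.names = TRUE)

if (length(data_files) > 0) {
  # Convert a single file into the "tidy" format.
  first_tidy <- convert_json(data_files[1], output_format = "tidy")

  # Convert several files at once by passing a vector of file paths.
  all_tidy <- convert_json(data_files, output_format = "tidy")
}
```

Because convert_json is described as experimental, its behaviour may change between package versions.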

Details

By default, bind_tweets binds into a data frame containing tweets (from data_id.json files).

If user is TRUE, it binds into a data frame containing user information (from users_id.json).

For the "tidy" and "tidy2" formats, parallel processing with furrr is supported. To enable parallel processing, workers need to be set manually through future::plan(). See the examples.
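
As a hedged illustration of setting the workers manually, the sketch below fixes the number of parallel workers and restores the previous plan afterwards. The worker count of 2 and the "data/" path are assumptions chosen for illustration; future::plan() returning the previous plan is standard behaviour of the future package.

```r
# Sketch only: assumes JSON files are stored under "data/".
library(academictwitteR)

# Start two background R sessions; plan() returns the previous plan,
# which is kept so it can be restored afterwards.
old_plan <- future::plan(future::multisession, workers = 2)

# No extra arguments are needed; the active plan is picked up automatically.
tweets_tidy <- bind_tweets(data_path = "data/", output_format = "tidy")

# Restore the previous plan, which also shuts down the parallel workers.
future::plan(old_plan)
```

Restoring the saved plan rather than hard-coding "sequential" leaves any plan the caller had already configured untouched.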

Note that the tidy2 vars are taken from the data returned by the Twitter API rather than extracted from the tweet text. Therefore, certain variables, especially context annotations and quoted_variables, may not be present in older data.
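
Since some variables may be absent in older data, it can help to check for them before using them. The sketch below uses a small hypothetical data frame in place of real bind_tweets output; its column names are invented for illustration. Whether an unavailable variable appears as a missing column or as NA values is not stated here, so the sketch only checks for missing columns.

```r
# Hypothetical stand-in for the result of
# bind_tweets(..., output_format = "tidy2"); column names are invented.
tweets <- data.frame(
  tweet_id = c("1", "2"),
  text = c("first tweet", "second tweet"),
  stringsAsFactors = FALSE
)

# Columns a downstream analysis would like to use (hypothetical names).
wanted <- c("text", "context_annotations")

# Identify which of them are absent before relying on them.
missing_cols <- setdiff(wanted, names(tweets))
if (length(missing_cols) > 0) {
  message("Not present in this data: ", paste(missing_cols, collapse = ", "))
}
```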

Value

a data.frame containing either tweets or user information

Examples

## Not run: 
# bind json files in the directory "data" into a data frame containing tweets
bind_tweets(data_path = "data/")

# bind json files in the directory "data" into a data frame containing user information
bind_tweets(data_path = "data/", user = TRUE)

# bind json files in the directory "data" into a "tidy" data frame / tibble
bind_tweets(data_path = "data/", user = TRUE, output_format = "tidy")

# bind json files in the directory "data" into a "tidy2" data frame / tibble, get hashtags and
# URLs for both original and quoted tweets
bind_tweets(data_path = "data/", user = TRUE, output_format = "tidy2",
            vars = c("hashtags", "ext_urls"),
            quoted_variables = TRUE)

# bind json files in the directory "data" into a "tidy2" data frame / tibble with parallel computing
## set up a multisession
future::plan("multisession")
## run the function - note that no additional arguments are required
bind_tweets(data_path = "data/", user = TRUE, output_format = "tidy2")
## Shut down parallel workers
future::plan("sequential")            

## End(Not run)

cjbarrie/academictwitteR documentation built on June 9, 2025, 3:36 a.m.
