R/data.R

#' Tweets
#'
#' This data set contains a sample of Twitter status updates (i.e. tweets)
#' collected between Thursday, September 26, 2019 and ... using the Twitter
#' Search API via the \code{rtweet} package.
#'
#' The data set is the result of a series of searches performed once per day
#' during the timeframe noted above, using the search term "#whistleblower" and
#' a target of 10,000 English-language tweets per day, excluding retweets.
#' The data set includes the following variables.  Variables used in this package
#' are highlighted in bold.  The Twitter data dictionary can be found in the
#' \href{https://developer.twitter.com/en/docs/tweets/data-dictionary/overview/intro-to-tweet-json}{online developer guides}.
#'
#' \describe{
#' \item{\code{user_id}}{Integer; unique user identifier.}
#' \item{\code{status_id}}{Integer; unique tweet identifier.}
#' \item{\code{created_at}}{UTC time when the tweet was created.}
#' \item{\code{text}}{The actual UTF-8 text of the tweet.}
#' \item{\code{created_at_date}}{The date when the tweet was created.}
#' \item{\code{created_at_weekday}}{Integer; day of week the tweet was created.}
#' \item{\code{created_at_hour}}{Integer; hour of day the tweet was created.}
#' \item{\code{hashtags}}{Character string containing hashtags which have
#' been parsed out of the tweet text.}
#' }
#'
#' @docType data
#'
#' @keywords datasets
#'
#' @name tweets
#'
#' @usage data(tweets)
#'
#' @format A data frame with ... observations and 8 variables.
#'
#' @author Donnie Minnick \email{donnie.minnick@gmail.com}
#'
#' @references \url{https://developer.twitter.com/en/docs/tweets/search/overview}
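#'
#' @examples
#' # A minimal sketch, not the package's own code: one plausible way to derive
#' # the date, weekday, and hour variables from created_at using base R. The
#' # sample timestamp and the weekday numbering (0 = Sunday, as returned by
#' # as.POSIXlt) are assumptions made for illustration only.
#' created_at <- as.POSIXct("2019-09-26 14:30:00", tz = "UTC")
#' created_at_date <- as.Date(created_at, tz = "UTC")
#' created_at_weekday <- as.POSIXlt(created_at, tz = "UTC")$wday
#' created_at_hour <- as.POSIXlt(created_at, tz = "UTC")$hour
#'
#' \dontrun{
#' # Hypothetical daily search, assuming the rtweet package is installed and
#' # a Twitter API token is configured. The exact query string used to build
#' # this data set is not documented; the one below is an assumption.
#' daily <- rtweet::search_tweets(q = "#whistleblower lang:en", n = 10000,
#'                                include_rts = FALSE)
#' }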
#'

"tweets"

#' Users
#'
#' This data set contains user data from a sample of Twitter status updates
#' (i.e. tweets) collected between Thursday, September 26, 2019 and ... using
#' the Twitter Search API via the \code{rtweet} package.  Refer to the
#' documentation of the \code{tweets} data set for a description of the data
#' collection method.
#'
#' The data set includes the following variables.  The Twitter data dictionary
#' can be found in the \href{https://developer.twitter.com/en/docs/tweets/data-dictionary/overview/intro-to-tweet-json}{online developer guides}.
#'
#' \describe{
#' \item{\code{user_id}}{Integer; unique user identifier.}
#' \item{\code{screen_name}}{Screen name, handle, or alias that this user
#' identifies themselves with. Screen names are unique but subject to change.}
#' \item{\code{name}}{Name of the user, as they’ve defined it. Not necessarily
#' a person’s name. Typically capped at 50 characters, but subject to change.}
#' \item{\code{location}}{The user-defined location for this account’s
#' profile. Not necessarily a location, nor machine-parseable.}
#' \item{\code{description}}{The user-defined UTF-8 string describing their
#' account.}
#' \item{\code{followers_count}}{Number of followers this account
#' currently has. Under certain conditions of duress, this field will
#' temporarily indicate 0.}
#' \item{\code{friends_count}}{The number of users this account is
#' following. Under certain conditions of duress, this field will temporarily
#' indicate 0.}
#' \item{\code{favourites_count}}{The number of Tweets this user has liked
#' in the account’s lifetime. British spelling used in the field name for
#' historical reasons.}
#' \item{\code{account_created_at}}{The UTC datetime that the user account
#' was created on Twitter.}
#' \item{\code{ff_percentage}}{Ratio of followers to the user's total followers
#' and friends.  Intended to identify Twitter bot accounts.}
#' \item{\code{account_age_in_years}}{Integer; age of the account in years,
#' calculated by subtracting the date the account was created from the current
#' system date.}
#' \item{\code{account_created_at}}{Date account was created.}
#' \item{\code{state_code}}{US state code, if contained in the user's account
#' location.}
#' \item{\code{state_name}}{US state name, if contained in the user's account
#' location.}
#' \item{\code{tweet_count}}{Integer, number of sample tweets for this user.}
#' }
#'
#' @docType data
#'
#' @keywords datasets
#'
#' @name users
#'
#' @usage data(users)
#'
#' @format A data frame with ... observations and 15 variables.
#'
#' @author Donnie Minnick \email{donnie.minnick@gmail.com}
#'
#' @references \url{https://developer.twitter.com/en/docs/tweets/search/overview}
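#'
#' @examples
#' # A minimal sketch, not the package's own code: how the two derived
#' # variables described above could be computed in base R. The sample values
#' # are made up, and whether the package stores ff_percentage as a fraction
#' # or scales it by 100 is not documented; a plain fraction is assumed here.
#' followers_count <- 150
#' friends_count <- 350
#'
#' # Followers as a share of followers plus friends. This is NaN when both
#' # counts are 0, which the descriptions above note can happen temporarily.
#' ff_percentage <- followers_count / (followers_count + friends_count)
#'
#' # Whole years between a hypothetical creation date and the system date,
#' # assuming an average year length of 365.25 days.
#' account_created <- as.Date("2015-03-01")
#' account_age_in_years <- as.integer(
#'   floor(as.numeric(Sys.Date() - account_created) / 365.25)
#' )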
#'

"users"
dtminnick/whistleblower documentation built on Nov. 14, 2019, 2:45 p.m.