clean_tweets: Clean (and augment) 'rtweet' data.frame
In tonyelhabr/tetext: Tony's Personal Package for Text Analysis


Munge an rtweet data.frame (according to personal preferences) for subsequent analysis.

clean_tweets_at(data = NULL, facet = NULL, trim = TRUE,
  cols = c("status_id", "created_at", "user_id", "screen_name", "text",
  "display_text_width", "reply_to_status_id", "is_quote", "is_retweet",
  "favorite_count", "retweet_count", "hashtags", "symbols", "urls_url",
  "urls_expanded_url", "media_expanded_url", "ext_media_expanded_url"),
  timezone = "America/Chicago")

clean_tweets(..., facet)

`data`	data.frame (created using `rtweet` package).
`facet`	bare name for NSE; character for SE. Name of the column in `data` used for faceting. Defaults to NULL to simplify internal code, even though a default is not strictly required. Added to `cols` if `trim = TRUE`.
`trim`	logical. Indicates whether or not to select only certain columns (and drop the others).
`cols`	character (vector). Name(s) of column(s) in `data` to keep. Only relevant if `trim = TRUE`.
`timezone`	character. Passed directly to `lubridate::with_tz()` as `tzone` parameter.
`...`	dots. Additional parameters.
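
The interaction between `trim`, `cols`, and `facet` can be sketched roughly as follows. This is a minimal illustration in base R, not the package's actual code: `trim_cols()` is a hypothetical helper, the toy `tweets` data.frame is made up, and silently skipping columns that are absent from `data` is an assumption.

```r
# Hypothetical helper, not part of tetext: keep only `cols` (plus `facet`).
trim_cols <- function(data, cols, facet = NULL, trim = TRUE) {
  if (!trim) {
    return(data)
  }
  # `facet` is added to `cols` so the faceting column survives trimming.
  keep <- unique(c(cols, facet))
  # Assumption: columns that are not present in `data` are skipped.
  keep <- intersect(keep, names(data))
  data[, keep, drop = FALSE]
}

# Toy stand-in for an rtweet data.frame (values are made up).
tweets <- data.frame(
  status_id = c("1", "2"),
  screen_name = c("a", "b"),
  text = c("hello", "world"),
  lang = c("en", "en"),
  stringsAsFactors = FALSE
)

trimmed <- trim_cols(tweets, cols = c("status_id", "text"), facet = "screen_name")
names(trimmed)
```

With `trim = FALSE` the sketch returns `data` unchanged, which matches the note that `cols` is only relevant if `trim = TRUE`.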

Converts nested list columns to character. Adds a timestamp column derived from the `created_at` column, converted to `timezone`. Also adds a time column that holds the hour of the day of `created_at`.
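
The steps described here could look roughly like the sketch below. It is an illustration under stated assumptions, not the package's implementation: `flatten_list_col()` is a hypothetical helper, the `", "` separator is a guess, the new columns are assumed to be named `timestamp` and `time`, and the hour is assumed to be taken after the timezone conversion. Only `lubridate::with_tz()` and `lubridate::hour()` are real library calls.

```r
library(lubridate)

# Toy stand-in for an rtweet data.frame (values are made up).
tweets <- data.frame(
  status_id = c("1", "2"),
  created_at = as.POSIXct(c("2018-01-01 06:30:00", "2018-01-01 18:45:00"), tz = "UTC"),
  stringsAsFactors = FALSE
)
tweets$hashtags <- list(c("rstats", "nlp"), NA_character_)

# Hypothetical helper: collapse each list element into one string.
# The ", " separator is an assumption; all-NA elements stay NA.
flatten_list_col <- function(x) {
  vapply(
    x,
    function(el) if (all(is.na(el))) NA_character_ else paste(el, collapse = ", "),
    character(1)
  )
}

# Convert every list column to character.
is_list <- vapply(tweets, is.list, logical(1))
tweets[is_list] <- lapply(tweets[is_list], flatten_list_col)

# Shift `created_at` into the target timezone, then extract the hour of day.
# Column names `timestamp` and `time` are assumed from the description.
tweets$timestamp <- with_tz(tweets$created_at, tzone = "America/Chicago")
tweets$time <- hour(tweets$timestamp)

str(tweets)
```

`with_tz()` changes only how the instant is displayed, not the instant itself, so the derived hour reflects local time in `timezone` rather than UTC.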

data.frame.

https://juliasilge.com/blog/ten-thousand-data/. https://buzzfeednews.github.io/2018-01-trump-twitter-wars/.

tonyelhabr/tetext documentation built on May 14, 2019, 8:03 a.m.