clean_tweets: Clean (and augment) 'rtweet' data.frame


Description

Munge an rtweet data.frame (according to personal preferences) for subsequent analysis.

Usage

clean_tweets_at(data = NULL, facet = NULL, trim = TRUE,
  cols = c("status_id", "created_at", "user_id", "screen_name", "text",
  "display_text_width", "reply_to_status_id", "is_quote", "is_retweet",
  "favorite_count", "retweet_count", "hashtags", "symbols", "urls_url",
  "urls_expanded_url", "media_expanded_url", "ext_media_expanded_url"),
  timezone = "America/Chicago")

clean_tweets(..., facet)

Arguments

data

data.frame (created using rtweet package).

facet

Bare name for NSE (non-standard evaluation); character for SE (standard evaluation). Name of the column in data used for faceting. Defaults to NULL (even though a default is not strictly required) to simplify internal code. Added to cols if trim = TRUE.

trim

logical. If TRUE, keep only the columns named in cols and drop the others.

cols

character (vector). Name(s) of column(s) in data to keep. Only relevant if trim = TRUE.

timezone

character. Passed directly to lubridate::with_tz() as tzone parameter.

...

dots. Additional parameters.
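
The interaction between facet, trim and cols described above can be sketched in base R. This is only an illustration under stated assumptions, not the tetext implementation: trim_cols_sketch is a hypothetical helper, the toy tweets data.frame is made up, and silently skipping columns that are absent from data is an assumption of this sketch.

```r
# Hypothetical helper illustrating the documented facet/trim/cols behaviour.
# It is not part of tetext.
trim_cols_sketch <- function(data, facet = NULL, trim = TRUE,
                             cols = c("status_id", "text")) {
  # Capture facet unevaluated so that a bare name and a string both work.
  facet_expr <- substitute(facet)
  facet_chr <- if (is.null(facet_expr)) {
    NULL
  } else if (is.character(facet_expr)) {
    facet_expr
  } else {
    deparse(facet_expr)
  }

  # With trim = FALSE every column is kept.
  if (!trim) {
    return(data)
  }

  # Keep the requested columns plus the facet column.
  # Assumption of this sketch: names missing from data are skipped.
  keep <- intersect(unique(c(cols, facet_chr)), names(data))
  data[, keep, drop = FALSE]
}

# Toy data standing in for an rtweet data.frame.
tweets <- data.frame(
  status_id = c("1", "2"),
  text = c("first tweet", "second tweet"),
  screen_name = c("a", "b"),
  source = c("web", "app"),
  stringsAsFactors = FALSE
)

trimmed_nse <- trim_cols_sketch(tweets, facet = screen_name)   # bare name
trimmed_se <- trim_cols_sketch(tweets, facet = "screen_name")  # character
untrimmed <- trim_cols_sketch(tweets, trim = FALSE)
```

Both calls with facet keep status_id, text and screen_name. Note that a sketch like this treats any bare symbol as a column name, so a variable that holds a column name would need to be passed as a literal string instead.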

Details

Converts nested list columns to character. Adds a timestamp column derived from the created_at column, and a time column that holds the hour of the day of created_at.
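
The steps above might look roughly like the following base R sketch. It is an illustration only: augment_sketch is a hypothetical name, the toy data is made up, setting the "tzone" attribute is used here as a base R stand-in for lubridate::with_tz(), and computing the hour in the target time zone is an assumption of this sketch.

```r
# Hypothetical sketch of the documented steps; not the tetext implementation.
augment_sketch <- function(data, timezone = "America/Chicago") {
  # Collapse each list column to one character value per row.
  is_list <- vapply(data, is.list, logical(1))
  if (any(is_list)) {
    data[is_list] <- lapply(data[is_list], function(col) {
      vapply(col, function(x) {
        # Keep rows with no values as NA instead of the string "NA".
        if (all(is.na(x))) {
          return(NA_character_)
        }
        paste(as.character(x), collapse = ", ")
      }, character(1))
    })
  }

  # Same instant as created_at, displayed in the target time zone
  # (base R stand-in for lubridate::with_tz()).
  timestamp <- data$created_at
  attr(timestamp, "tzone") <- timezone
  data$timestamp <- timestamp

  # Hour of the day (0-23); using the target time zone is an assumption.
  data$time <- as.POSIXlt(timestamp, tz = timezone)$hour
  data
}

# Toy data standing in for an rtweet data.frame.
tweets <- data.frame(
  status_id = c("1", "2"),
  created_at = as.POSIXct(c("2019-05-14 13:03:00", "2019-05-15 02:30:00"),
                          tz = "UTC"),
  stringsAsFactors = FALSE
)
tweets$hashtags <- list(c("rstats", "tidytext"), NA_character_)

cleaned <- augment_sketch(tweets)
```

After the call, hashtags is a plain character column, and timestamp and time sit alongside the original created_at.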

Value

data.frame.

See Also

https://juliasilge.com/blog/ten-thousand-data/

https://buzzfeednews.github.io/2018-01-trump-twitter-wars/


tonyelhabr/tetext documentation built on May 14, 2019, 8:03 a.m.