get_user_tweet_ids: Get a user's tweet IDs for a given date range


View source: R/screen_scrape_tweets.R

Description

Given a start and end date, the function looks for tweet ID files already written to disk and gets the tweet IDs for the remaining date range(s) by calling scrape_tweet_ids.

Usage

get_user_tweet_ids(screen.name, user.id, since, until, remdr,
  date.interval = "year", sleep = 2, .write.out = TRUE, .data.path,
  .file.stem = paste0("tw_user_", user.id, "_tweet_ids_%s"),
  verbose = TRUE, ...)
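
A call might look like the sketch below. It is only an illustration under stated assumptions: the account name, user ID, dates, Selenium host, port and browser are hypothetical, it assumes get_user_tweet_ids is exported from twscrape, and it assumes since and until accept strings in the '%Y-%m-%d' format. The call is wrapped in a function so that nothing runs until a Selenium server is actually available.

```r
# Sketch only: needs the twscrape and RSelenium packages and a running
# Selenium server. Nothing is executed until scrape_example() is called.
scrape_example <- function() {
  # Hypothetical connection settings; adjust to your Selenium setup.
  remdr <- RSelenium::remoteDriver(
    remoteServerAddr = "localhost",
    port = 4444L,
    browserName = "firefox"
  )
  remdr$open(silent = TRUE)
  on.exit(remdr$close())

  # Assumption: the function is exported and takes dates as strings.
  twscrape::get_user_tweet_ids(
    screen.name = "some_account",  # hypothetical account name
    user.id = "12345",             # hypothetical user ID
    since = "2018-01-01",
    until = "2019-12-31",
    remdr = remdr,
    date.interval = "year",
    sleep = 2,
    .write.out = TRUE,
    .data.path = tempdir(),        # write the JSON files to a temp directory
    verbose = TRUE
  )
}
```

Calling scrape_example() would then return the data frame described under Value.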

Arguments

screen.name

The screen (or account) name of a Twitter user.

user.id

The user ID of a Twitter user.

since

a date (format '%Y-%m-%d'), specifying the start of the date range to be requested

until

a date (format '%Y-%m-%d'), specifying the end of the date range to be requested

remdr

an active RSelenium remoteDriver object (check remdr$getStatus() to see if the driver is running).

date.interval

date interval passed to 'by' argument of seq.Date. Defaults to 'year'.

sleep

Seconds to pause between non-adjacent date ranges. Defaults to 2.

.write.out

logical. Write out tweet IDs as JSON to disk? If TRUE (the default), a JSON file is written to the path given in .data.path.

.data.path

Path in which to look for existing tweet ID files. Also the path where new ID files are written if .write.out = TRUE.

.file.stem

JSON file name stem (the stem ignores date ranges). Defaults to 'tw_user_<user.id>_tweet_ids_%s' (see Usage), so existing files match the glob 'tw_user_<user.id>_tweet_ids_*.json'. Used both to look for existing tweet ID JSON files and to name new ones when writing to disk.

verbose

logical. Print out status messages?

...

further arguments passed to scrape_tweet_ids.
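
The interplay of since, until, date.interval, .data.path and .file.stem can be sketched in base R as follows. This is not the package's implementation: how the intervals are closed, how '%s' is filled in, and all input values are assumptions made for illustration.

```r
# Hypothetical inputs (not taken from the package).
user.id <- "12345"
since <- as.Date("2017-01-01")
until <- as.Date("2019-12-31")
date.interval <- "year"
file.stem <- paste0("tw_user_", user.id, "_tweet_ids_%s")

# seq.Date() yields the interval start dates; date.interval is its 'by' value.
starts <- seq(since, until, by = date.interval)
# Assumption: each interval ends the day before the next one starts.
ends <- c(starts[-1] - 1, until)

# Assumption: '%s' is replaced by a "<since>_<until>" label.
files <- paste0(sprintf(file.stem, paste(starts, ends, sep = "_")), ".json")

# Pretend the first interval was scraped earlier and written to .data.path.
data.path <- file.path(tempdir(), "tweet_ids_demo")
dir.create(data.path, showWarnings = FALSE)
file.create(file.path(data.path, files[1]))

# Look up existing ID files with a glob built from the stem.
glob <- paste0(sprintf(file.stem, "*"), ".json")
on.disk <- list.files(data.path, pattern = utils::glob2rx(glob))

# Intervals without a file are the ones that would still be scraped.
todo <- data.frame(since = starts, until = ends)[!files %in% on.disk, ]
print(todo)
```

In this sketch the 2017 interval is skipped because its file already exists, and the two remaining intervals are left to be scraped.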

Value

A tibble data frame with one row per tweet. The data frame is empty if an error occurs or no tweet IDs were scraped in the given date range. Otherwise it has the columns

  1. 'screen_name' (<chr>, as passed to argument screen.name),

  2. 'since' (<date>, as returned by interval-specific calls to scrape_tweet_ids),

  3. 'until' (<date>, as returned by interval-specific calls to scrape_tweet_ids),

  4. 'tweet_id' (<chr>) and

  5. 'user_id' (<chr>, as passed to argument user.id).
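
A result of this shape could be handled as in the sketch below. The values are made up, and a base data.frame stands in for the tibble so that the example runs without extra packages.

```r
# Hypothetical result; all values are made up for illustration.
res <- data.frame(
  screen_name = "some_account",
  since = as.Date(c("2018-01-01", "2018-01-01", "2019-01-01")),
  until = as.Date(c("2018-12-31", "2018-12-31", "2019-12-31")),
  # IDs stay character: they exceed what a double can represent exactly.
  tweet_id = c("1000000000000000001", "1000000000000000002",
               "1100000000000000003"),
  user_id = "12345",
  stringsAsFactors = FALSE
)

# An empty data frame signals an error or a date range without tweets.
if (nrow(res) == 0) {
  message("No tweet IDs returned for the requested range.")
}

# Count tweets per scraped interval (keyed by interval start date).
counts <- tapply(res$tweet_id, format(res$since), length)
print(counts)
```

Checking nrow() first is a simple way to tell an empty result apart from a populated one before working with the columns.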

WARNING


haukelicht/twscrape documentation built on Jan. 29, 2020, 3:23 p.m.