scrape_tweet_ids: Scrape Tweets IDs from screen
In haukelicht/twscrape: twscrape: Functionality to Collect Data from Twitter API and screen

Description Usage Arguments Details Value Scroll sleep WARNING

Given a Twitter account screen name or ID and a start and end date, the function screen-scrapes the IDs of historical tweets in that time range and returns them in a data frame. Optionally, the scraped IDs can also be written to disk (if write.out = TRUE).

scrape_tweet_ids(tw.account, remdr, since.date, until.date,
  date.interval = "month", max.tweets.pi = 10000, write.out = TRUE,
  write.out.path,
  write.out.name = sprintf("tw_user_%s_tweet_ids_%s.json", tw.account,
  paste0(since.date, "_to_", until.date)), sleep = 0.5,
  .scroll.sleep = 0.75, verbose = TRUE)

`tw.account`	a scalar character vector, specifying a Twitter screen name or account ID
`remdr`	an active RSelenium `remoteDriver` object (check `remdr$getStatus()` to see if the driver is running.)
`since.date`	creation date of the oldest tweets to get. Only accepts dates in format '%Y-%m-%d' (year-month-day: 'YYYY-mm-dd').
`until.date`	creation date of the most recent (youngest) tweets to get. Only accepts dates in format '%Y-%m-%d' (year-month-day: 'YYYY-mm-dd').
`date.interval`	date interval passed to 'by' argument of `seq.Date`. Defaults to 'month'.
`max.tweets.pi`	maximum number of tweets to load per date interval. Defaults to 10'000. (See the Details section.)
`write.out`	logical. write out tweet IDs as JSON to disk? If `TRUE` (the default), JSON file will be written to path `write.out.path` and named `write.out.name`. If `FALSE`, `write.out.path` and `write.out.name` will be ignored.
`write.out.path`	write-out path (the directory in which to write the file of scraped IDs). Will be ignored if `write.out = FALSE`.
`write.out.name`	JSON file name. Defaults to 'tw_user_<`tw.account`>_tweet_ids_<`since.date`>_to_<`until.date`>.json'. Will be ignored if `write.out = FALSE`.
`sleep`	seconds to pause between date ranges when iterating over the date intervals defined by `since.date`, `until.date` and `date.interval`. Defaults to 0.5 seconds.
`.scroll.sleep`	seconds to pause between scrolls when scrolling for more tweets. Defaults to 0.75 seconds. (See section 'Scroll sleep' for details.)
`verbose`	logical. Print out status messages?
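
The following base-R sketch illustrates how `since.date`, `until.date` and `date.interval` translate into breakpoints via `seq.Date`, and how the default `write.out.name` is assembled. The account name and dates are hypothetical, and pairing consecutive breakpoints into scraped ranges is an assumption about the function's internals, not documented behavior.

```r
# Hypothetical inputs, chosen only for illustration
tw.account <- "some_account"
since.date <- "2019-01-01"
until.date <- "2019-04-01"

# Breakpoints as seq.Date generates them for date.interval = "month"
breaks <- seq(as.Date(since.date), as.Date(until.date), by = "month")

# Assumption: consecutive breakpoints delimit the scraped date ranges
intervals <- data.frame(
  since = head(breaks, -1),
  until = tail(breaks, -1)
)
print(intervals)

# Default file name, built exactly as in the usage signature
write.out.name <- sprintf(
  "tw_user_%s_tweet_ids_%s.json",
  tw.account,
  paste0(since.date, "_to_", until.date)
)
print(write.out.name)
```

With these inputs, three monthly ranges result, which is the number of times the `sleep` pause would apply under the pairing assumption above.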

Note that the maximum number of tweets loaded per date interval (max.tweets.pi) needs to be adapted to the date interval: longer intervals can contain more tweets. Each scroll loads 20 new tweets, and by default there is a pause of 0.75 seconds between scrolls. Loading the default maximum of 10'000 tweets therefore takes up to ((10000/20) * 0.75)/60 = 6.25 minutes of scroll pauses per date interval.
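
The arithmetic above can be reproduced for other settings with a small helper. This is a sketch only: `scroll_wait_minutes` and its `tweets.per.scroll` argument are hypothetical names that are not part of twscrape, and the result counts scroll pauses only, not page-load time.

```r
# Hypothetical helper: upper bound on time spent in scroll pauses
# for one date interval, in minutes
scroll_wait_minutes <- function(max.tweets.pi = 10000,
                                .scroll.sleep = 0.75,
                                tweets.per.scroll = 20) {
  # Number of scrolls needed to load max.tweets.pi tweets
  n.scrolls <- ceiling(max.tweets.pi / tweets.per.scroll)
  # Total pause time, converted from seconds to minutes
  n.scrolls * .scroll.sleep / 60
}

scroll_wait_minutes()      # defaults: 6.25 minutes
scroll_wait_minutes(2000)  # smaller cap: 1.25 minutes
```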

A tibble data frame. The data frame is empty if an error occurs or no tweet IDs were scraped in the given time range. Otherwise it has columns 'account' (<chr>), 'since' (<date>), 'until' (<date>) and 'tweet_id' (<chr>), and one row is one tweet.
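
Because an empty data frame can signal either an error or a range without tweets, callers may want to check the row count before processing. The sketch below uses a base `data.frame` as a stand-in for the returned tibble; the account name and tweet IDs are made-up placeholders, not scraped data.

```r
# Stand-in for a returned data frame; all values are made-up placeholders
ids <- data.frame(
  account  = "some_account",
  since    = as.Date(c("2019-01-01", "2019-01-01", "2019-02-01")),
  until    = as.Date(c("2019-02-01", "2019-02-01", "2019-03-01")),
  tweet_id = c("1000000000000000001", "1000000000000000002",
               "1000000000000000003"),
  stringsAsFactors = FALSE
)

if (nrow(ids) == 0) {
  message("No tweet IDs returned: an error occurred or the range is empty.")
} else {
  # One row is one tweet, so counting rows per interval counts tweets
  per.interval <- table(format(ids$since))
  print(per.interval)
  # Drop repeated IDs; they are character strings, so comparison is exact
  ids <- ids[!duplicated(ids$tweet_id), ]
}
```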

Argument .scroll.sleep determines how much time the Twitter timeline has to fully load after each scroll. WARNING: Setting low values (< 0.75 seconds) risks missing tweet IDs, because the scraping process can stop prematurely when the scroll sleep is too short. The default of 0.75 seconds is a minimum, even with a fast internet connection.

The function presupposes an active remote Selenium driver.
The function only accepts dates in format '%Y-%m-%d' (year-month-day: 'YYYY-mm-dd').
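
Date strings can be checked before calling the function. The helper below is a minimal sketch in base R; `is_ymd` is a hypothetical name and not part of twscrape. It combines a pattern check with `as.Date`, which returns `NA` for strings it cannot parse in the given format.

```r
# Hypothetical helper: TRUE for strings that are valid 'YYYY-mm-dd' dates
is_ymd <- function(x) {
  # Require exactly four, two and two digits separated by hyphens
  looks.ok <- grepl("^[0-9]{4}-[0-9]{2}-[0-9]{2}$", x)
  # Require that the string parses to a date in the expected format
  parsed <- as.Date(x, format = "%Y-%m-%d")
  looks.ok & !is.na(parsed)
}

# A valid date, a wrong format, and a non-existent calendar day
is_ymd(c("2019-01-31", "31.01.2019", "2019-02-30"))
```

The state of the Selenium driver can be inspected separately with `remdr$getStatus()`, as noted in the Arguments section.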

haukelicht/twscrape documentation built on Jan. 29, 2020, 3:23 p.m.
