scrape_thread_content: Scrape thread
In fellennert/flashbackscrapR: scraping flashback.org


View source: R/scrape_thread_content.R

Scrapes the postings of a single flashback.org thread, identified by its URL suffix.

scrape_thread_content(
  suffix,
  export_csv = FALSE,
  folder_name = NULL,
  file_name = NULL,
  delay = TRUE
)

`suffix`	A character string containing a thread's suffix (which can be obtained using `get_thread_links()`). Suffixes need to start with `/`.
`export_csv`	A logical vector. Defaults to `FALSE`. If `export_csv = TRUE`, the output is also saved as a csv file. The output folder can be specified using the `folder_name` argument.
`folder_name`	A character string which specifies the name of the folder the output should be saved in. The folder's name is added to the path of the current working directory which can be obtained using `getwd()` and modified with `setwd()`. If nothing is specified and `export_csv = TRUE`, the function will export the csv file straight into the working directory.
`file_name`	A character string which specifies the name of the output file. It is not necessary to add '.csv'. If no file name is provided, `file_name` defaults to `scrape_[YYYY-MM-DD].csv`.
`delay`	A logical vector. Defaults to `TRUE`. flashback.org's robots.txt file asks for a five-second delay between iterations. You can deliberately ignore this by setting `delay = FALSE`, but this is NOT RECOMMENDED.
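
How `folder_name` and `file_name` combine into an output path when `export_csv = TRUE` can be sketched in base R. This is only an illustration of the rules described above, not the package's actual implementation: `build_output_path()` is a hypothetical helper, and tolerating a user-supplied `.csv` extension is an assumption.

```r
# Hypothetical helper illustrating the documented path rules;
# NOT part of flashbackscrapR.
build_output_path <- function(folder_name = NULL, file_name = NULL) {
  # Documented default file name: scrape_[YYYY-MM-DD].csv
  if (is.null(file_name)) {
    file_name <- paste0("scrape_", format(Sys.Date(), "%Y-%m-%d"))
  }
  # Assumption: strip a trailing ".csv" if present, then add it exactly once
  file_name <- paste0(sub("\\.csv$", "", file_name), ".csv")
  # Without folder_name, the file goes straight into the working directory
  if (is.null(folder_name)) {
    file.path(getwd(), file_name)
  } else {
    file.path(getwd(), folder_name, file_name)
  }
}

build_output_path(folder_name = "output", file_name = "my_thread")
build_output_path()
```

A corresponding call might look like `scrape_thread_content(suffix = "/t0000000", export_csv = TRUE, folder_name = "output", file_name = "my_thread")`, where `"/t0000000"` is a placeholder suffix rather than a real thread.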

A tibble with the following columns:

`url`	The thread's URL suffix.
`date`	The date the posting was made on.
`time`	The time the posting was made at.
`author_name`	The respective author's user name.
`author_url`	The link to the author's profile (can be scraped using `scrape_user_profile()`).
`quoted_user`	The user name of the user that is quoted in a posting (`NA` if the posting does not contain a quote).
`posting`	The posting *as is*, i.e., with potential quotes.
`posting_wo_quote`	The posting with all quotes removed.
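
As a rough sketch of how the returned columns might be used, the example below builds a small stand-in data frame with the same column names and then filters and counts postings. The rows are made-up placeholders, not real flashback.org data, and the date and time formats are assumptions, since the format of those columns is not specified here.

```r
# Placeholder rows mimicking the documented columns; NOT real scraped data.
# Date/time formats are assumed for illustration only.
thread <- data.frame(
  url = c("/t0000000", "/t0000000", "/t0000000"),
  date = c("2020-01-01", "2020-01-01", "2020-01-02"),
  time = c("10:00", "10:05", "09:30"),
  author_name = c("user_a", "user_b", "user_a"),
  author_url = c("/u1", "/u2", "/u1"),
  quoted_user = c(NA, "user_a", NA),
  posting = c("First post", "user_a wrote: First post. I agree", "Thanks"),
  posting_wo_quote = c("First post", "I agree", "Thanks"),
  stringsAsFactors = FALSE
)

# Postings that quote another user have a non-NA quoted_user
replies <- thread[!is.na(thread$quoted_user), ]

# Number of postings per author
postings_per_author <- table(thread$author_name)
```

For text analysis, `posting_wo_quote` is usually the safer column to work with, because quoted text in `posting` would otherwise be counted once for the original author and again for each user who quotes it.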