get_full_thread_links: Get entire main section's links


View source: R/full_thread_links.R

Description

Returns a tibble of thread links that allows for scraping the entire section. Folder names mimic the section structure.

Usage

get_full_thread_links(
  suffix,
  path,
  cut_off = "2000-01-01",
  delay = TRUE,
  export_links = FALSE,
  export_meta = TRUE,
  output_folder = ""
)

Arguments

suffix

The section's suffix

cut_off

A character string containing the date on which the latest post in the thread should have been posted. Must be in the format YYYY-MM-DD. Defaults to "2000-01-01".
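Because the date has to follow the YYYY-MM-DD format, it can help to check the string before starting a long scrape. The following is a minimal sketch in base R; `is_valid_cut_off` is a hypothetical helper introduced here for illustration and is not part of flashbackscrapR.

```r
# Hypothetical helper, not part of flashbackscrapR: returns TRUE only if
# `cut_off` is a single string that is a real calendar date in YYYY-MM-DD form.
is_valid_cut_off <- function(cut_off) {
  is.character(cut_off) &&
    length(cut_off) == 1 &&
    # shape check: four digits, two digits, two digits
    grepl("^[0-9]{4}-[0-9]{2}-[0-9]{2}$", cut_off) &&
    # calendar check: as.Date() yields NA for impossible dates such as Feb 30
    !is.na(as.Date(cut_off, format = "%Y-%m-%d"))
}

is_valid_cut_off("2020-10-25")
is_valid_cut_off("25-10-2020")
```

The first call passes both checks; the second fails the shape check because the day comes first.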

delay

flashback.org's robots.txt file asks for a five-second delay between iterations. You can deliberately ignore this by setting delay = FALSE. Note that THIS IS NOT RECOMMENDED!
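To illustrate what such a delay amounts to, here is a rough sketch of a loop that pauses between iterations. It is not the package's implementation: `scrape_politely`, `urls`, `fetch`, and `pause` are hypothetical names used only for this example.

```r
# Illustrative only, not flashbackscrapR code: apply `fetch` to each element
# of `urls`, sleeping `pause` seconds between iterations when delay = TRUE.
scrape_politely <- function(urls, fetch, delay = TRUE, pause = 5) {
  results <- vector("list", length(urls))
  for (i in seq_along(urls)) {
    results[[i]] <- fetch(urls[[i]])
    # no need to wait after the final iteration
    if (delay && i < length(urls)) Sys.sleep(pause)
  }
  results
}

# Usage with a stand-in fetch function (no network access involved)
pages <- scrape_politely(c("/f102", "/f103"), fetch = toupper, pause = 0)
```

With the default `pause = 5`, a run over n items waits 5 * (n - 1) seconds in total, which is the price of respecting the site's robots.txt.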

export_links

If set to TRUE, a CSV file containing the links is exported.

export_meta

If set to TRUE, a CSV file containing data on the scrape is exported.

output_folder

A character string specifying the folder in which the CSV files containing the links and the metadata should be stored.

folder_name

A character vector with the name of the folder in which the scraped files are to be stored.

Value

A tibble with four columns: sub_suffix, the suffix of the sub(sub)section; folder_name, the name of the folder the scraped thread should be stored in; thread_links, the thread links; and file_name, the prospective file name.

Examples

get_full_section_links(suffix = "/f102", folder_name = NULL,
  cut_off = "2020-10-25", delay = TRUE)

fellennert/flashbackscrapR documentation built on Sept. 10, 2021, 4:15 p.m.