scrape_thread: Scrape a thread and return its content


View source: R/scrape_thread.R

Description

Scrapes a thread and returns its content.

Usage

scrape_thread(suffix, quotes = TRUE)

Arguments

quotes

A logical value indicating how the function should handle quotes. If set to TRUE, the default, quotes are kept and a column is added indicating whether the posting contains a quote. If set to FALSE, the function tries to remove them. This succeeds in the majority of cases. Occasionally, however, removal fails, and in rare cases the function crashes. Hence, if quotes = FALSE, it is advisable to use some sort of "safety net" such as purrr::safely().

suffix


A character string. The suffix of the thread's link, for example "/Forum-27-260/m49908859.html".

Value

A tibble with the following columns: thread contains the URL of the thread containing the posting, date the date the posting was created, time the time it was created, content its textual content, and quote_ind indicates whether it contains quoted content. Unfortunately, removing the quotes reliably is nearly impossible. Where removal succeeds, postings without quotes (either because they did not contain one in the first place or because the removal worked properly) can be found in content_wo_quote. Entries where removal failed are also in this column, probably still containing the quoted text, and start with "!!!flawed citation!!!".
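
The "!!!flawed citation!!!" prefix makes it possible to separate postings where quote removal failed from the rest. The following is a minimal sketch in base R, not part of the package: the postings data frame is an invented stand-in for the function's output, and it assumes content_wo_quote is a character column.

```r
# Toy stand-in for the tibble returned by scrape_thread(); the values are
# invented for illustration and are not real scraped data.
postings <- data.frame(
  content_wo_quote = c(
    "A posting without any quote.",
    "!!!flawed citation!!! quoted text followed by the reply"
  ),
  stringsAsFactors = FALSE
)

# Marker that, according to the Value section, prefixes failed entries.
flaw_marker <- "!!!flawed citation!!!"

# TRUE for rows where quote removal failed.
is_flawed <- startsWith(postings$content_wo_quote, flaw_marker)

# Split into rows that can be used directly and rows that need manual review.
clean_postings  <- postings[!is_flawed, , drop = FALSE]
flawed_postings <- postings[is_flawed, , drop = FALSE]
```

Keeping the flawed rows in a separate object, rather than dropping them, leaves the option of cleaning them by hand later.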

Examples

scrape_thread("/Forum-27-260/m49908859.html", quotes = FALSE)
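
Because quotes = FALSE can occasionally crash, the "safety net" mentioned under Arguments could look roughly like the sketch below. The helper scrape_threads_safely() and its scraper argument are hypothetical and not part of familjelivscrapR. The default scraper assumes the package is installed and is only evaluated when no other scraper is supplied.

```r
library(purrr)

# Hypothetical helper (not part of the package): scrape several thread
# suffixes with quotes = FALSE, capturing errors instead of stopping.
scrape_threads_safely <- function(suffixes,
                                  scraper = familjelivscrapR::scrape_thread) {
  # safely() returns a function that yields list(result = ..., error = ...)
  safe_scraper <- safely(scraper)

  # Name the input by itself so each result can be looked up by its suffix.
  map(set_names(suffixes), function(s) safe_scraper(s, quotes = FALSE))
}

# Usage (needs the package and network access, hence commented out):
# res <- scrape_threads_safely("/Forum-27-260/m49908859.html")
# res[[1]]$result  # the tibble, or NULL if scraping failed
# res[[1]]$error   # NULL, or the captured error
```

Each element of the returned list has a result and an error component, so failed threads can be identified and retried without losing the ones that succeeded.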

fellennert/familjelivscrapR documentation built on Oct. 4, 2020, 1:35 p.m.