crawl: Initial Crawl

Description

Make the initial data crawl.

Usage

crawl_data(days = 30L, quiet = FALSE, pages = 50L, append = TRUE,
  apply_segments = TRUE, since_last = TRUE, pause = 5,
  overwrite = FALSE, ...)

Arguments

days

Number of days back from today (Sys.Date()) for which to crawl articles.

quiet

Whether to suppress helpful messages in the console; the default, FALSE, prints them and is recommended. Passed to wh_news.

pages

Number of pages of data to crawl; defaults to 50L, as shown in Usage. Set to Inf (infinite) to collect all data available under your plan (at your own risk). 1 page of results = 1 query = 100 articles (you have 1,000 free queries per month).
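
The page arithmetic above can be sketched as follows. This is a minimal illustration, not part of auritus: crawl_budget is a hypothetical helper, and the constants are taken directly from the figures quoted above (100 articles per page, 1,000 free queries per month).

```r
# Figures quoted in the argument description above.
articles_per_page <- 100L
free_queries_per_month <- 1000L

# Hypothetical helper (not an auritus function): for a finite `pages` value,
# return the upper bound on articles fetched and how many crawls of that
# size fit in the free monthly quota.
crawl_budget <- function(pages) {
  stopifnot(is.finite(pages), pages >= 1)
  list(
    max_articles = pages * articles_per_page,
    crawls_per_month = free_queries_per_month %/% pages
  )
}

budget <- crawl_budget(50L)
budget$max_articles      # 5000
budget$crawls_per_month  # 20
```

With pages = 50L, a single crawl can use up to 50 queries, so about 20 such crawls fit in the free monthly allowance.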

append

If data has been previously crawled and stored, whether to append new results to it (defaults to TRUE).

apply_segments

If TRUE applies the segments from _auritus.yml.

since_last

If TRUE, crawls data since the most recently crawled article in the dataset (recommended). Only applies if append is TRUE and data already exists.

pause

Time in seconds to wait before crawling.

overwrite

Set to TRUE to overwrite the database.

...

Any other parameters to pass to wh_news.
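
A possible call, using only the arguments documented above, might look like the sketch below. It assumes the auritus package is installed and that a project with an _auritus.yml file already exists in the working directory; the guard and the crawl_args list are illustrative choices, not something the package requires.

```r
# Arguments for a small incremental crawl: the last 7 days, at most 2 pages
# (up to 2 queries), appended to any previously stored data.
crawl_args <- list(
  days = 7L,
  pages = 2L,
  append = TRUE,
  since_last = TRUE,
  apply_segments = TRUE,
  quiet = FALSE
)

# Only run the crawl when the package and a project config are present
# (assumption: _auritus.yml sits in the current working directory).
if (requireNamespace("auritus", quietly = TRUE) && file.exists("_auritus.yml")) {
  do.call(auritus::crawl_data, crawl_args)
}
```

Keeping pages small while testing limits how many queries a run can consume.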


news-r/auritus documentation built on March 14, 2020, 12:50 p.m.