scrape: Scrape data from web pages

View source: R/scrape.R

Description

Takes a vector of URLs and scrapes data from the associated web pages using quickscrape.

Usage

scrape(urls, url_file = NULL, scraper = "generic_open", ratelimit = 3,
  outdir = NULL, results = "both", args = list())

Arguments

urls

A vector of URLs to scrape

url_file

Alternatively, a file containing a list of URLs separated by newlines

scraper

A single scraper to use, or a list of scrapers the same length as urls. If NULL, scrape will choose scrapers based on the domains of the URLs. A scraper can be one of the included scrapers (found with names(quickscraper:::package_scrapers)), the path to a scraperJSON file, or a scraperJSON file converted to an R list.

ratelimit

The minimum time between scraping pages.

list

The form to return results in, either "list", "data.frame", or "none" to only retain results as JSON files on disk

outdir

The directory to write results to. If NULL, files will be written to a temporary directory.

results

Save the downloaded results? If "load", scrape will return the results as a list. If "save", the results will be saved in outdir. If "both", both.
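
A minimal sketch of how the arguments above might fit together, using only the argument names and defaults from the Usage signature. The URLs and the output directory name are hypothetical placeholders, and the call to scrape is guarded so that the snippet still runs where quickscraper is not installed. The structure of the returned object is not shown because the page does not describe it.

```r
# Hypothetical example URLs (assumption); replace with pages you may scrape.
urls <- c("https://example.org/article/1",
          "https://example.org/article/2")

# The same URLs as a newline-separated file, suitable for url_file.
url_file <- tempfile(fileext = ".txt")
writeLines(urls, url_file)

# Hypothetical output directory under the session's temporary directory.
out <- file.path(tempdir(), "scrape_results")
dir.create(out, showWarnings = FALSE)

if (requireNamespace("quickscraper", quietly = TRUE)) {
  # Arguments mirror the Usage signature; ratelimit = 3 is the documented default.
  res <- quickscraper::scrape(urls,
                              scraper = "generic_open",
                              ratelimit = 3,
                              outdir = out,
                              results = "both")
}
```

With results = "both", the documentation above indicates that results are both returned and saved in outdir; passing "load" or "save" would select only one of the two.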


noamross/quickscraper documentation built on May 23, 2019, 9:30 p.m.