save_html: Download one results page as an HTML file

View source: R/save_htmls.R

save_html    R Documentation

Download one results page as an HTML file

Description

Downloads one page of Google Scholar search results from a URL as an HTML file, with a specified wait time to avoid IP address blocking.

Usage

save_html(url, pause = 0.5, backoff = FALSE)

Arguments

url

One URL corresponding to a page of search results.

pause

Number specifying how many seconds to wait between download attempts. The default value is 0.5 seconds.

backoff

A logical argument (TRUE or FALSE) specifying whether responsive back-off should be used. If set to TRUE, the time between calls varies with how long the server takes to respond to the original request. The responsive back-off time is the response time multiplied by the 'pause' time: for example, if the server takes 1.02 seconds to respond and 'pause' is set to 4 seconds, a 4.08 second delay is applied before the next call. The default for back-off is FALSE.
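The back-off rule described for 'backoff' can be sketched in a few lines of R. This is only an illustration of the arithmetic, not the package's own code: backoff_delay() and its 'response_time' argument are hypothetical names introduced here, and it assumes the delay is simply the response time multiplied by 'pause'.

```r
# Hypothetical helper, not part of GSscraper: returns the number of
# seconds to wait before the next call.
backoff_delay <- function(response_time, pause = 0.5, backoff = FALSE) {
  if (isTRUE(backoff)) {
    # Responsive back-off: scale the pause by the observed response time.
    response_time * pause
  } else {
    # Fixed wait: the response time is ignored.
    pause
  }
}

backoff_delay(1.02, pause = 4, backoff = TRUE)   # 1.02 * 4 = 4.08
backoff_delay(1.02, pause = 4, backoff = FALSE)  # 4
```

Under this rule a slow response lengthens the wait, while a response faster than one second shortens it below 'pause'.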

Value

The downloaded HTML is returned as a string object. Pause and success messages are printed to the console.

Examples

## Not run: 
url <- 'https://scholar.google.co.uk/scholar?hl=en&as_sdt=0%2C5&q=testing&btnG='
html <- save_html(url, pause = 3, backoff = FALSE)

## End(Not run)
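Because the result is returned as a string rather than written to disk, it may be useful to save it explicitly. The following is a minimal sketch using base R only; the 'html' value is a stand-in string, not real Google Scholar output, and the temporary file path is an assumption made for illustration.

```r
# Stand-in for the string returned by save_html(); no request is made here.
html <- "<html><body><p>Example result</p></body></html>"

# Write the string to a temporary .html file.
path <- tempfile(fileext = ".html")
writeLines(html, path)

# Read the file back and confirm the content is unchanged.
saved <- paste(readLines(path), collapse = "\n")
identical(saved, html)
```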

nealhaddaway/GSscraper documentation built on May 6, 2022, 10:52 a.m.