download_page: Download a webpage
In tjtnew/coffee: Stuff that helps me get going


View source: R/download_page.R

Downloads a single webpage using wget and some hardcoded options.

download_page(
  url,
  destination = Sys.getenv("WEBPAGE_DIR"),
  history = Sys.getenv("WEBPAGE_DOWNLOAD_HISTORY"),
  clobber = FALSE
)

get_dl_history(history = Sys.getenv("WEBPAGE_DOWNLOAD_HISTORY"))

`url`	URL of the webpage to download.
`destination`	Folder in which to store the downloaded webpage.
`history`	Path to the CSV file where the download history is saved.
`clobber`	Should files be overwritten if they have already been downloaded?
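Both `destination` and `history` default to environment variables, so one way to configure them is to set those variables before calling the functions. The following is a minimal sketch; the temporary paths are placeholders chosen for illustration, and the commented calls assume the package is installed under the name `coffee`.

```r
# Point the two environment variables at illustrative temporary locations.
# These paths are assumptions; use whatever folder and CSV file suit you.
Sys.setenv(
  WEBPAGE_DIR = file.path(tempdir(), "webpages"),
  WEBPAGE_DOWNLOAD_HISTORY = file.path(tempdir(), "download_history.csv")
)

# These are the values the default arguments would pick up.
destination <- Sys.getenv("WEBPAGE_DIR")
history <- Sys.getenv("WEBPAGE_DOWNLOAD_HISTORY")

# Make sure the destination folder exists before downloading into it.
dir.create(destination, showWarnings = FALSE, recursive = TRUE)

# With the package installed, the calls would then look like:
# coffee::download_page("https://example.com")
# coffee::get_dl_history()
```

Setting the variables in `.Renviron` instead of calling `Sys.setenv()` would make the defaults persist across sessions.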

download_page() calls wget with the following options:

* adjust-extension:  save HTML/CSS documents with proper extensions
* span-hosts:        go to foreign hosts when recursive
* convert-links:     make links in downloaded HTML or CSS point to local files
* backup-converted:  before converting file X, back up as X.orig
* page-requisites:   get all images, etc. needed to display HTML page
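The options above map onto wget's long-form flags. The sketch below shows one way such a call could be assembled from R; it is not the package's actual implementation. The helper names `build_wget_args()` and `run_wget()` are hypothetical, and passing the destination through `--directory-prefix` is an assumption.

```r
# Hypothetical helper: assemble the wget arguments for a single page.
# The five flags mirror the list above; --directory-prefix is assumed
# here as the way to direct output into the destination folder.
build_wget_args <- function(url, destination) {
  c(
    "--adjust-extension",
    "--span-hosts",
    "--convert-links",
    "--backup-converted",
    "--page-requisites",
    paste0("--directory-prefix=", destination),
    url
  )
}

# Hypothetical runner: invoke wget only if it is available on the PATH.
run_wget <- function(url, destination) {
  if (!nzchar(Sys.which("wget"))) {
    stop("wget was not found on the PATH")
  }
  # shQuote() protects the URL and paths from shell interpretation.
  system2("wget", shQuote(build_wget_args(url, destination)))
}

# Inspect the arguments without touching the network.
wget_args <- build_wget_args("https://example.com", tempdir())
print(wget_args)

# To actually download (requires wget and network access):
# run_wget("https://example.com", tempdir())
```

Keeping argument construction separate from execution makes it possible to check the command before anything is downloaded.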