download_page: Download a webpage

View source: R/download_page.R

Description

Downloads a single webpage using wget with a fixed set of hardcoded options (listed under Details).

Usage

download_page(
  url,
  destination = Sys.getenv("WEBPAGE_DIR"),
  history = Sys.getenv("WEBPAGE_DOWNLOAD_HISTORY"),
  clobber = FALSE
)

get_dl_history(history = Sys.getenv("WEBPAGE_DOWNLOAD_HISTORY"))
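
Both defaults are read from environment variables, and Sys.getenv() returns an empty string when a variable is unset. The following is a minimal sketch of setting them before a call; the temporary paths and the file name download_history.csv are illustrative assumptions, not values required by the package.

```r
# Illustrative locations only; any writable paths would do.
Sys.setenv(
  WEBPAGE_DIR = file.path(tempdir(), "webpages"),
  WEBPAGE_DOWNLOAD_HISTORY = file.path(tempdir(), "download_history.csv")
)

# These are the same lookups used as argument defaults in the usage above.
destination <- Sys.getenv("WEBPAGE_DIR")
history <- Sys.getenv("WEBPAGE_DOWNLOAD_HISTORY")

# Sys.getenv() yields "" for unset variables, so fail early if either is empty.
stopifnot(nzchar(destination), nzchar(history))
```

With the variables set, download_page(url) and get_dl_history() can presumably be called without passing destination or history explicitly.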

Arguments

url

URL of the webpage to download.

destination

Folder in which to store the downloaded webpage.

history

Path to the CSV file where the download history is saved.

clobber

Should files be overwritten if they have already been downloaded?

Details

download_page() calls wget with the following options:

* adjust-extension:  save HTML/CSS documents with proper extensions
* span-hosts:        go to foreign hosts when recursive
* convert-links:     make links in downloaded HTML or CSS point to local files
* backup-converted:  before converting file X, back up as X.orig
* page-requisites:   get all images, etc. needed to display HTML page
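
The options above correspond to wget long flags. As a rough sketch of how such a call might be assembled from R, the helper below builds the argument vector; build_wget_args() is a hypothetical name, and the use of --directory-prefix for the destination is an assumption, so the actual download_page() may construct its command differently.

```r
# Hypothetical helper (not part of the package): collect the wget flags
# listed above and append the target URL.
build_wget_args <- function(url, destination) {
  c(
    "--adjust-extension",   # save HTML/CSS documents with proper extensions
    "--span-hosts",         # go to foreign hosts when recursive
    "--convert-links",      # make links point to local files
    "--backup-converted",   # back up file X as X.orig before converting
    "--page-requisites",    # get images etc. needed to display the page
    # Assumption: the destination folder is passed via --directory-prefix.
    paste0("--directory-prefix=", destination),
    url
  )
}

args <- build_wget_args("https://example.com", file.path(tempdir(), "webpages"))

# To run the download (requires wget on the PATH), quote each argument:
# system2("wget", args = shQuote(args))
```

Building the arguments separately from the system2() call keeps the flag list easy to inspect before anything is downloaded.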

Value

References

https://www.gnu.org/software/wget/manual/wget.html#Recursive-Retrieval-Options


tjtnew/coffee documentation built on Dec. 23, 2021, 11 a.m.