fetch_all_listpages: Fetch html of multiple geizhals category pages

Description

Given the URL of a geizhals page listing all products within a specific category (i.e., the page showing all items in a category, not the site-wide search reached from the search bar), returns the HTML of that page and of the pages that follow it. If filters are applied on the category page, only the results matching those filters are returned. The resulting list is meant to be processed by the parse_all_listpages function.

Usage

fetch_all_listpages(firstlistpageurl, max_pages = 10,
  delay_listpage = NA, domain = "https://geizhals.at")

Arguments

firstlistpageurl

Character vector of length 1 containing the url of a geizhals category page (listing all items of a selected category).

max_pages

Maximum number of pages to be scraped. Default is 10.

delay_listpage

Number of seconds to wait between fetching subsequent list pages. Defaults to NA.

domain

Character vector of length one specifying the domain. Defaults to "https://geizhals.at".
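
The interplay of max_pages and delay_listpage can be illustrated with a small stand-alone sketch. This is not the package's implementation: fetch_with_delay and its fetch_fun argument are hypothetical names introduced here, the sketch takes a ready-made vector of URLs rather than discovering follow-up pages from the first one, and it assumes that a delay of NA means "do not pause".

```r
# Hypothetical helper for illustration only; not part of rgeizhals.
# urls:      character vector of page URLs (assumed to be known in advance)
# fetch_fun: any function that takes one URL and returns a page object
# max_pages: upper bound on the number of pages fetched
# delay:     seconds to pause between fetches; NA is assumed to mean no pause
fetch_with_delay <- function(urls, fetch_fun, max_pages = 10, delay = NA) {
  # Keep at most max_pages URLs
  urls <- urls[seq_len(min(length(urls), max_pages))]
  pages <- vector("list", length(urls))
  for (i in seq_along(urls)) {
    # Pause before every fetch except the first one
    if (i > 1 && !is.na(delay)) Sys.sleep(delay)
    pages[[i]] <- fetch_fun(urls[[i]])
  }
  pages
}

# Stand-in "fetcher" that just upper-cases its input, so no network is needed
res <- fetch_with_delay(c("a", "b", "c"), toupper, max_pages = 2, delay = 0.1)
length(res)
```

Passing the fetcher as an argument keeps the sketch testable offline; in real use it could be replaced by a function that downloads and parses a page.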

Value

A list of xml documents.
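
As a rough illustration of working with such a list, the sketch below builds two small in-memory documents with the xml2 package and reads one value from each. It assumes the returned elements behave like xml2 documents, which this page does not state explicitly; the HTML strings and the names pages and titles are made up for the example.

```r
library(xml2)

# Stand-in for the return value: a list of parsed documents
# (built from literal HTML strings, so no network access is needed)
pages <- list(
  read_html("<html><head><title>Page 1</title></head><body></body></html>"),
  read_html("<html><head><title>Page 2</title></head><body></body></html>")
)

# One list element per fetched page
length(pages)

# Extract the <title> text from every document
titles <- vapply(
  pages,
  function(doc) xml_text(xml_find_first(doc, "//title")),
  character(1)
)
titles
```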

See Also

parse_all_listpages

Examples

## Not run: 
url_geizhals <- "https://geizhals.at/?cat=acam35"
listpagehtml_list <- fetch_all_listpages(url_geizhals, max_pages = 2)
parse_all_listpages(listpagehtml_list)

url_geizhals <- "https://geizhals.eu/?cat=acam35"
listpagehtml_list <- fetch_all_listpages(url_geizhals, max_pages = 2,
  delay_listpage = 1, domain = "https://www.geizhals.eu")
parse_all_listpages(listpagehtml_list, domain = "https://www.geizhals.eu")

## End(Not run)

ingonader/rgeizhals documentation built on May 29, 2019, 3:05 a.m.