linked_urls: Linked Sites


View source: R/linked_urls.R

Description

Crawl a website, building a site map, and reporting all internal and external links found.

Usage

linked_urls(x, delay = 0.2, max_depth = 5, excludesites = "none",
  ...)

Arguments

x

The root url as a character string, or an HTML session.

delay

Number of seconds to delay between HTTP requests.

max_depth

Starting with the root url (level 0), follow links up to max_depth "clicks".

excludesites

(default is "none")

...

Additional arguments (not yet used).

Details

max_depth controls how many levels of links are followed. The root url is level 0, and all the hrefs found on that page are level 1. Each href on a level 1 page is labeled level 2. Labeling and processing of pages continue through level max_depth. You can think of max_depth as the number of mouse clicks a human using a graphical web browser would need to navigate from the root url to the noted url or file.
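The level labeling described above can be illustrated with a small breadth-first sketch. This is not the package's implementation: the helper label_depths() and the in-memory site list are hypothetical names invented for this example, and no HTTP requests are made. It assumes a page keeps the lowest level at which it is first reached.

```r
# Hypothetical in-memory "site": each page maps to the hrefs found on it.
# These pages are invented for illustration only.
site <- list(
  "/"            = c("/about", "/blog"),
  "/about"       = c("/", "/team"),
  "/blog"        = c("/blog/post1", "/about"),
  "/team"        = c("/team/alice"),
  "/blog/post1"  = character(0),
  "/team/alice"  = character(0)
)

# Breadth-first labeling (hypothetical helper, not part of snaWeb):
# the root is level 0, and hrefs found on a level-k page get level k + 1
# unless they were already labeled. Pages at max_depth are not expanded.
label_depths <- function(links, root, max_depth = 5) {
  depth <- c(0L)
  names(depth) <- root
  frontier <- root
  level <- 0L
  while (length(frontier) > 0 && level < max_depth) {
    level <- level + 1L
    # All hrefs found on the pages of the previous level
    found <- as.character(unlist(links[frontier], use.names = FALSE))
    # Keep only pages that have not been labeled yet
    new_pages <- setdiff(unique(found), names(depth))
    if (length(new_pages) > 0) {
      depth[new_pages] <- level
    }
    frontier <- new_pages
  }
  depth
}

# Label the toy site, following links at most two "clicks" from the root
label_depths(site, "/", max_depth = 2)
```

With max_depth = 2 in this toy site, "/team/alice" is never labeled because it is three clicks from the root; raising max_depth to 3 or more would include it.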


jhollist/snaWeb documentation built on April 7, 2020, 12:49 a.m.