can_fetch: Test URL paths against a 'robxp' 'robots.txt' object

can_fetch R Documentation

Description

Given a character vector of URL paths and an optional user agent, this function returns a logical vector indicating, for each path, whether that user agent has permission to fetch the content there.

Usage

can_fetch(obj, path = "/", user_agent = "*")

Arguments

obj

a robxp object, as created by robxp()

path

character vector of URL paths to test; defaults to "/"

user_agent

user agent to test; defaults to "*" (the wildcard user agent)

Value

logical vector, one element per path, indicating whether the user agent has permission to fetch the content at that path
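
Because the result lines up with path element by element, it can be used directly to subset the paths. The following is a minimal sketch rather than part of the package documentation: the robots.txt text and the paths are made up for illustration, and only robxp() and can_fetch() from spiderbar are used.

```r
library(spiderbar)

# Hypothetical robots.txt content, invented for this sketch
txt <- "User-agent: *\nDisallow: /private/\n"
rt <- robxp(txt)

# Hypothetical paths to check
paths <- c("/index.html", "/private/report.html", "/about")

# One logical value per path; user_agent falls back to its default, "*"
allowed <- can_fetch(rt, paths)

# Keep only the paths the rules permit
fetchable <- paths[allowed]
print(fetchable)
```

Filtering this way keeps the permission check in one vectorized call before any requests are made.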

Examples

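library(spiderbar)

# Read the sample GitHub robots.txt shipped with spiderbar into a single string
gh <- paste0(readLines(system.file("extdata", "github-robots.txt",
             package="spiderbar")), collapse="\n")
gh_rt <- robxp(gh) # parse it into a robxp object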

can_fetch(gh_rt, "/humans.txt", "*") # TRUE
can_fetch(gh_rt, "/login", "*") # FALSE
can_fetch(gh_rt, "/oembed", "CCBot") # FALSE

# Several paths in one call; user_agent defaults to "*"
can_fetch(gh_rt, c("/humans.txt", "/login", "/oembed"))

spiderbar documentation built on Feb. 16, 2023, 7:39 p.m.