Man pages for ropenscilabs/robotstxt
A 'robots.txt' Parser and 'Webbot'/'Spider'/'Crawler' Permissions Checker

as.list.robotstxt_text: Convert robotstxt_text to a list
fix_url: Add http protocol if missing from a URL
get_robotstxt: Download a robots.txt file (see the usage sketches after this list)
get_robotstxt_http_get: Storage for HTTP request response objects
get_robotstxts: Download multiple robots.txt files
guess_domain: Guess a domain from a path
http_domain_changed: Check if the HTTP domain changed
http_subdomain_changed: Check if the HTTP subdomain changed
http_was_redirected: Check if an HTTP redirect occurred
is_suspect_robotstxt: Check whether a fetched file is suspect, i.e. likely not a real robots.txt file
is_valid_robotstxt: Validate that a file is a valid, parsable robots.txt file
list_merge: Merge a number of named lists in sequential order
named_list: Create a named list
null_to_default: Return a default value if NULL
parse_robotstxt: Parse a robots.txt file
parse_url: Parse a URL
paths_allowed: Check if a bot has permission to access page(s) (see the usage sketches after this list)
paths_allowed_worker_spiderbar: Worker for paths_allowed() that checks page permissions via the spiderbar backend
pipe: Re-export of the magrittr pipe operator
print.robotstxt: Print a robotstxt object
print.robotstxt_text: Print a robotstxt object's text
remove_domain: Remove the domain from a path
request_handler_handler: Handle robotstxt handlers
robotstxt: Generate a representation of a robots.txt file
rt_cache: Get the robotstxt cache
rt_get_comments: Extract comments from robots.txt
rt_get_fields: Extract permissions from robots.txt
rt_get_fields_worker: Extract robots.txt fields
rt_get_rtxt: Load robots.txt files saved along with the package
rt_get_useragent: Extract HTTP user agents from robots.txt
rt_list_rtxt: List robots.txt files saved along with the package
rt_request_handler: Handle a robotstxt object retrieved from an HTTP request
sanitize_path: Make paths uniform
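
For context, the two main user-facing entry points in this index are robotstxt() and paths_allowed(). The following is a minimal usage sketch; the domain and paths are illustrative examples, not taken from the package documentation.

    library(robotstxt)

    # build a robotstxt object for a domain and query it for permissions
    rt <- robotstxt(domain = "wikipedia.org")
    rt$check(paths = c("/", "api/"), bot = "*")

    # or check permissions directly in a single call
    paths_allowed(paths = c("/", "api/"), domain = "wikipedia.org", bot = "*")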
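
The lower-level helpers can also be combined to download and inspect a robots.txt file by hand. A minimal sketch, again with an illustrative domain:

    library(robotstxt)

    # download the raw robots.txt text for a domain
    rtxt <- get_robotstxt(domain = "wikipedia.org")

    # parse it into its components (user agents, permissions, sitemaps, ...)
    parsed <- parse_robotstxt(rtxt)
    parsed$permissions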