robotstxt: A 'robots.txt' Parser and 'Webbot'/'Spider'/'Crawler' Permissions Checker

Provides functions to download and parse 'robots.txt' files. Ultimately the package makes it easy to check whether bots (spiders, crawlers, scrapers, ...) are allowed to access specific resources on a domain.
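For example, a permission check can be done with paths_allowed(). The following is a minimal sketch: it assumes an active internet connection, and example.com stands in for whatever domain you actually intend to crawl.

library(robotstxt)

# check whether a generic bot ("*") may fetch a specific path on a domain
# ("example.com" is a placeholder; substitute the domain you want to check)
paths_allowed(
  paths  = "/images/",
  domain = "example.com",
  bot    = "*"
)

The call downloads the domain's robots.txt, parses it, and returns TRUE or FALSE for each supplied path.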

Package details

Author: Peter Meissner [aut, cre], Kun Ren [aut, cph] (author and copyright holder of list_merge.R), Oliver Keys [ctb] (original release code review), Rich Fitz John [ctb] (original release code review)
Maintainer: Peter Meissner <retep.meissner@gmail.com>
License: MIT + file LICENSE
Version: 0.7.13
URL: https://docs.ropensci.org/robotstxt/ https://github.com/ropensci/robotstxt
Package repository: CRAN
Installation

Install the latest version of this package by entering the following in R:
install.packages("robotstxt")

