spiderbar: Parse and Test Robots Exclusion Protocol Files and Rules

The 'Robots Exclusion Protocol' <https://www.robotstxt.org/orig.html> documents a set of standards for allowing or excluding robot/spider crawling of different areas of site content. This package provides tools that wrap the 'rep-cpp' <https://github.com/seomoz/rep-cpp> C++ library for parsing and testing these 'robots.txt' files.

Getting started
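A minimal sketch of typical use, assuming a 'robots.txt' is already available as text; it uses the package's robxp(), can_fetch(), and crawl_delays() functions, and the sample rules shown are illustrative, not from a real site:

```r
library(spiderbar)

# Parse a small robots.txt given as a character vector of lines
# (robxp() builds the tester object used by the other functions)
rt <- robxp(c(
  "User-agent: *",
  "Disallow: /private/",
  "Crawl-delay: 10"
))

# Test individual paths against the parsed rules
can_fetch(rt, "/index.html")  # path not covered by Disallow
can_fetch(rt, "/private/x")   # disallowed for all user agents

# Inspect any declared crawl delays
crawl_delays(rt)
```

can_fetch() also accepts a user-agent argument, so the same parsed object can be queried on behalf of different crawlers.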

Package details

Author: Bob Rudis (bob@rud.is) [aut, cre]; SEOmoz, Inc. [aut]
Maintainer: Bob Rudis <bob@rud.is>
License: MIT + file LICENSE
URL: https://gitlab.com/hrbrmstr/spiderbar
Package repository: CRAN
Installation

Install the latest version of this package by entering the following in R:
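```r
install.packages("spiderbar")
```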


spiderbar documentation built on May 16, 2021, 9:06 a.m.