spiderbar: Parse and Test Robots Exclusion Protocol Files and Rules

spiderbar    R Documentation

Parse and Test Robots Exclusion Protocol Files and Rules

Description

The 'Robots Exclusion Protocol' (https://www.robotstxt.org/orig.html) documents a set of standards for allowing or excluding robot/spider crawling of different areas of site content. Tools are provided which wrap the 'rep-cpp' (https://github.com/seomoz/rep-cpp) C++ library for processing these 'robots.txt' files.
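
As a rough illustration of the workflow described above, the following minimal sketch parses a small 'robots.txt' body and tests two paths against it. It assumes the package's robxp() and can_fetch() functions; the robots.txt content, the paths, and the variable names are made up for this example and do not come from any real site.

```r
library(spiderbar)

# Hypothetical robots.txt content, written inline for illustration only.
robots_txt <- paste(
  "User-agent: *",
  "Disallow: /private/",
  sep = "\n"
)

# Parse the text into a robots.txt object.
rt <- robxp(robots_txt)

# Ask whether a generic crawler ("*") may fetch each path.
allowed_public  <- can_fetch(rt, path = "/index.html", user_agent = "*")
allowed_private <- can_fetch(rt, path = "/private/data.html", user_agent = "*")

print(allowed_public)
print(allowed_private)
```

Parsing once and reusing the resulting object for several can_fetch() calls avoids re-reading the rules for every path that is checked.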

Author(s)

Bob Rudis (bob@rud.is)


spiderbar documentation built on Feb. 16, 2023, 7:39 p.m.