RobotParser: Fetch and parse robots.txt


Description

This function fetches and parses the robots.txt file of the website specified in the first argument and returns the list of corresponding rules.

Usage

RobotParser(website, useragent)

Arguments

website

character, the URL of the website whose rules are to be extracted.

useragent

character, the user agent of the crawler.

Value

Returns a list of three elements: the first is a character vector of the Disallowed directories, and the third is a Boolean value which is TRUE if the crawler's user agent is blocked by robots.txt.

Examples

#RobotParser("http://www.glofile.com","AgentX")
#Return robot.txt rules and check whether AgentX is blocked or not.
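
A minimal usage sketch, assuming the site serves a robots.txt file and that the returned list follows the structure described under Value (Disallowed directories as the first element, the blocked flag as the third); the element positions below are taken from that description rather than verified against the package source:

rules <- RobotParser("http://www.glofile.com", "AgentX")
disallowed <- rules[[1]]   # character vector of Disallowed directories
blocked    <- rules[[3]]   # TRUE if the AgentX user agent is blocked
if (isTRUE(blocked)) message("AgentX is blocked by robots.txt")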
