Man pages for barob1n/crawlR
Async Web crawler for R.

crawlR	CrawlR - Async Web Crawler for R
create_fetch_list	Creates a Fetch List
extract_links	Extract Links
extract_meta	Extract Meta Tags
extract_tags	Extract HTML Tags
extract_tags_xml2	Extract HTML Tags
fetchR	Fetch a List of URLs.
fetchR_parseR	Fetch a List of URLs.
fetchR_parseR_edit	Fetch a List of URLs.
find_last_dir	Get Last Directory
generateR	Generate fetch list of URLs from crawlDB
get_links	Extract Links Found on Webpage.
injectR	Inject seeds into crawlDB
load_batch	Queue a Batch of URLs
makeHash	Convert String to Hash
normalize_url	Normalize URLs
parse_content	General Parser for HTML
parse_content_fetched	General Parser
parseExt	Get file extension from Content-Type
parseR	Parse Processor
parseR_old	Parse Processor
parser_wrapper	Handles extracting links and applying supplied parse...
score_urls	Score URLs
set_log_file	Log Out
tika_mimetype	MIME types from Tika
updateR	Update crawlDB
write_log	Write to log
writeR	Base Output Writer (deprecated)